|
# Next Sentence Prediction Model
|
|
|
This repository hosts a fine-tuned **Flan-T5-Base** model for **next-sentence prediction**: given an input sentence, the model generates the sentence it expects to follow. This makes it useful for applications such as text completion, conversation modeling, and document coherence assessment.
|
|
|
## Model Details
|
|
|
- **Model Architecture**: Flan-T5-Base |
|
- **Task**: Next Sentence Prediction |
|
- **Dataset**: OpenWebText-10k (Preprocessed) |
|
- **Fine-tuning Framework**: Hugging Face Transformers |
|
- **Quantization**: FP16 (half precision) for reduced memory usage
|
|
|
## Usage
|
|
|
### Installation |
|
|
|
```bash
pip install transformers torch datasets sentencepiece
```

(`sentencepiece` is required by `T5Tokenizer`.)
|
|
|
### Loading the Model |
|
|
|
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/flan-t5-base-next-line-prediction"
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
tokenizer = T5Tokenizer.from_pretrained(model_name)
```
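
If you want to exploit the FP16 weights on a GPU, the model can also be loaded directly in half precision. This is a minimal sketch, assuming a CUDA-capable device; `torch_dtype` simply tells `from_pretrained` how to cast the weights on load.

```python
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "AventIQ-AI/flan-t5-base-next-line-prediction"

# Load the weights in half precision (FP16); assumes a CUDA-capable GPU.
model_fp16 = T5ForConditionalGeneration.from_pretrained(
    model_name, torch_dtype=torch.float16
).to("cuda")
tokenizer = T5Tokenizer.from_pretrained(model_name)
```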
|
|
|
### Perform Next Sentence Prediction |
|
|
|
```python
def predict_next_sentence(model, tokenizer, input_sentence):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    formatted_text = f"predict next line: {input_sentence}"
    input_ids = tokenizer(formatted_text, return_tensors="pt").input_ids.to(device)

    with torch.no_grad():
        output_ids = model.generate(input_ids, max_length=50)

    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Test prediction
input_sentence = "The sun was setting behind the mountains."
predicted_sentence = predict_next_sentence(model, tokenizer, input_sentence)

print(f"Input Sentence: {input_sentence}")
print(f"Predicted Next Sentence: {predicted_sentence}")
```
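
The helper above uses the default greedy decoding. If completions feel too short or repetitive, `generate` accepts standard decoding parameters such as `num_beams` and `early_stopping`; the values below are illustrative assumptions, not tuned settings.

```python
# Beam search decoding; parameter values are illustrative, not tuned.
formatted_text = "predict next line: The sun was setting behind the mountains."
input_ids = tokenizer(formatted_text, return_tensors="pt").input_ids.to(device)

output_ids = model.generate(
    input_ids,
    max_length=50,
    num_beams=4,
    early_stopping=True,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```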
|
|
|
## Evaluation Results
|
|
|
After fine-tuning, the model was evaluated on a test set, achieving the following performance: |
|
|
|
| Metric              | Score | Meaning                                                    |
| ------------------- | ----- | ---------------------------------------------------------- |
| **Perplexity**      | 23    | Lower is better; how well the model predicts held-out text |
| **Inference Speed** | Fast  | Optimized for real-time completion                         |
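
Perplexity here is the exponential of the average cross-entropy loss on held-out text, so lower values indicate better predictions. The snippet below is a minimal sketch of how such a score can be computed with this model; the evaluation pairs are hypothetical placeholders.

```python
import math
import torch

# Hypothetical (input sentence, expected next sentence) evaluation pairs.
eval_pairs = [
    ("The sun was setting behind the mountains.",
     "The sky slowly turned a deep shade of orange."),
]

losses = []
for source, target in eval_pairs:
    enc = tokenizer(f"predict next line: {source}", return_tensors="pt").to(device)
    labels = tokenizer(target, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        # .loss is the mean cross-entropy over the target tokens.
        losses.append(model(**enc, labels=labels).loss.item())

perplexity = math.exp(sum(losses) / len(losses))
print(f"Perplexity: {perplexity:.2f}")
```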
|
|
|
## Fine-Tuning Details
|
|
|
### Dataset |
|
|
|
The model was trained on the **OpenWebText-10k dataset**, containing **10,000 documents**. The dataset was preprocessed by splitting each text into consecutive sentence pairs, so the model learns to map a sentence to the one that follows it.
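
The preprocessing script itself is not part of this repository; the following is a minimal sketch, assuming naive punctuation-based sentence splitting, of how documents can be turned into (input, target) pairs in the `predict next line:` format used above.

```python
import re

def make_sentence_pairs(document: str):
    """Split a document into consecutive sentence pairs for next-sentence training."""
    # Naive split on ., !, ? followed by whitespace (an assumption; the original
    # preprocessing may have used a proper sentence tokenizer).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [
        {"input": f"predict next line: {current}", "target": nxt}
        for current, nxt in zip(sentences, sentences[1:])
    ]

doc = "The sun was setting behind the mountains. The sky turned orange. Birds flew home."
for pair in make_sentence_pairs(doc):
    print(pair)
```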
|
|
|
### Training Configuration |
|
|
|
- **Number of epochs**: 3 |
|
- **Batch size**: 8 |
|
- **Optimizer**: AdamW |
|
- **Learning rate**: 2e-5 |
|
- **Evaluation strategy**: Epoch-based (see the sketch below)
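
As a rough illustration of how these settings map onto the Hugging Face `Trainer` API, here is a sketch that starts from the public `google/flan-t5-base` checkpoint and a toy dataset; the real training script and data pipeline may differ.

```python
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    T5ForConditionalGeneration,
    T5Tokenizer,
    Trainer,
    TrainingArguments,
)

model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")

# Toy dataset standing in for the preprocessed OpenWebText-10k sentence pairs.
raw = Dataset.from_dict({
    "input": ["predict next line: The sun was setting behind the mountains."],
    "target": ["The sky slowly turned a deep shade of orange."],
})

def tokenize(batch):
    model_inputs = tokenizer(batch["input"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["target"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = raw.map(tokenize, batched=True, remove_columns=["input", "target"])

training_args = TrainingArguments(
    output_dir="./flan-t5-next-line",
    num_train_epochs=3,              # epochs listed above
    per_device_train_batch_size=8,   # batch size listed above
    learning_rate=2e-5,              # learning rate listed above
    eval_strategy="epoch",           # epoch-based evaluation (older versions: evaluation_strategy)
    save_strategy="epoch",
)

# Trainer uses AdamW by default, matching the optimizer listed above.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    eval_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```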
|
|
|
### Quantization |
|
|
|
The model weights were converted to **FP16** (half precision), reducing memory usage while maintaining performance.
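
A minimal sketch of this kind of conversion, assuming the fine-tuned full-precision checkpoint sits in the local `model/` directory shown in the repository structure below, is:

```python
import torch
from transformers import T5ForConditionalGeneration

# Load the fine-tuned full-precision checkpoint (path is an assumption).
model = T5ForConditionalGeneration.from_pretrained("./model")

# Cast every weight to half precision and save the smaller checkpoint.
model = model.half()
model.save_pretrained("./quantized_model")

print(next(model.parameters()).dtype)  # torch.float16
```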
|
|
|
## Repository Structure
|
|
|
```bash
.
├── model/               # Fine-tuned model files
├── tokenizer_config/    # Tokenizer configuration
├── quantized_model/     # FP16 quantized model
└── README.md            # Model documentation
```
|
|
|
## Limitations
|
|
|
- The model works best with **well-structured sentences**. |
|
- May struggle with **long-range dependencies** in texts. |
|
- **Contextual consistency** is limited to sentence pairs. |
|
|
|
## Contact & Contributions
|
|
|
For improvements or questions, feel free to open an issue or contribute to this repository! |
|
|
|
|