# 📝 Next Sentence Prediction Model

This repository hosts a fine-tuned **Flan-T5-Base** model optimized for **next-line prediction**. Given a sentence, it predicts the sentence most likely to follow, which is useful for applications such as text completion, conversation modeling, and document coherence assessment.

## 📌 Model Details

- **Model Architecture**: Flan-T5-Base
- **Task**: Next Sentence Prediction
- **Dataset**: OpenWebText-10k (preprocessed into sentence pairs)
- **Fine-tuning Framework**: Hugging Face Transformers
- **Quantization**: FP16 for efficiency

## 🚀 Usage

### Installation

```bash
# sentencepiece is required by T5Tokenizer
pip install transformers torch datasets sentencepiece
```

### Loading the Model

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer
import torch

# Use a GPU when available; the model also runs (more slowly) on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "AventIQ-AI/flan-t5-base-next-line-prediction"
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
tokenizer = T5Tokenizer.from_pretrained(model_name)
```

### Perform Next Sentence Prediction

```python
def predict_next_sentence(model, tokenizer, input_sentence):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # The task prefix must match the one used during fine-tuning.
    formatted_text = f"predict next line: {input_sentence}"
    input_ids = tokenizer(formatted_text, return_tensors="pt").input_ids.to(device)

    with torch.no_grad():
        output_ids = model.generate(input_ids, max_length=50)

    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# 🔹 Test prediction
input_sentence = "The sun was setting behind the mountains."
predicted_sentence = predict_next_sentence(model, tokenizer, input_sentence)

print(f"Input Sentence: {input_sentence}")
print(f"Predicted Next Sentence: {predicted_sentence}")
```
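
By default, `generate` uses greedy decoding, so the same input always yields the same continuation. For more varied completions you can enable sampling. A minimal sketch (the parameter values are illustrative, not tuned for this model):

```python
input_ids = tokenizer(
    "predict next line: The sun was setting behind the mountains.",
    return_tensors="pt",
).input_ids.to(device)

with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=50,
        do_sample=True,   # sample from the distribution instead of greedy decoding
        top_p=0.9,        # nucleus sampling: keep the smallest set covering 90% probability
        temperature=0.8,  # values below 1.0 make sampling slightly more conservative
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```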

## 📊 Evaluation Results

After fine-tuning, the model was evaluated on a held-out test set, with the following results:

| Metric              | Score | Meaning                                                                      |
| ------------------- | ----- | ---------------------------------------------------------------------------- |
| **Perplexity**      | 23    | Exponentiated average cross-entropy; lower means more confident predictions  |
| **Inference Speed** | Fast  | Qualitative; suitable for real-time completion                               |
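
A perplexity figure of this kind is the exponential of the average cross-entropy loss on held-out pairs. The snippet below is a minimal sketch of that computation, not the repository's evaluation script; the example pair is invented:

```python
import math
import torch

# Hypothetical held-out pair; the actual evaluation data is not published here.
source = "predict next line: The sun was setting behind the mountains."
target = "Darkness settled slowly over the valley."

inputs = tokenizer(source, return_tensors="pt").to(device)
labels = tokenizer(target, return_tensors="pt").input_ids.to(device)

with torch.no_grad():
    # With labels given, the model returns the mean cross-entropy over target tokens.
    loss = model(**inputs, labels=labels).loss

print(f"Perplexity: {math.exp(loss.item()):.2f}")
```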

## 🔧 Fine-Tuning Details

### Dataset

The model was trained on the **OpenWebText-10k** dataset, which contains **10,000 documents**. During preprocessing, each text was split into pairs of adjacent sentences, and the model learns to predict the second sentence of each pair from the first.
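
The preprocessing script itself is not included here; a minimal sketch of the pairing step described above could look like the following (the naive sentence splitter and field names are assumptions):

```python
import re

def make_sentence_pairs(document: str) -> list[dict]:
    """Split a document into (input, target) pairs of adjacent sentences."""
    # Naive regex splitter; a production pipeline would use a proper sentence tokenizer.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    return [
        {"input_text": f"predict next line: {first}", "target_text": second}
        for first, second in zip(sentences, sentences[1:])
    ]

pairs = make_sentence_pairs("The sun was setting. The sky turned orange. Night fell quickly.")
# pairs[0] == {'input_text': 'predict next line: The sun was setting.',
#              'target_text': 'The sky turned orange.'}
```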

### Training Configuration

- **Number of epochs**: 3
- **Batch size**: 8
- **Optimizer**: AdamW
- **Learning rate**: 2e-5
- **Evaluation strategy**: Epoch-based
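
The training loop is not published; a minimal sketch of how these hyperparameters map onto Hugging Face `TrainingArguments` (the output path and dataset variables are placeholders):

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./flan-t5-next-line",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    evaluation_strategy="epoch",       # renamed `eval_strategy` in newer transformers releases
)

trainer = Trainer(
    model=model,                       # AdamW is the Trainer's default optimizer
    args=training_args,
    train_dataset=train_dataset,       # assumed: tokenized sentence-pair dataset
    eval_dataset=eval_dataset,
)
trainer.train()
```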

### Quantization

The model weights are stored in **FP16** (half precision), roughly halving memory usage relative to FP32 while maintaining output quality.
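
To get the memory savings at inference time, the model can be loaded directly in half precision. A minimal sketch (FP16 inference is only efficient on GPU):

```python
import torch
from transformers import T5ForConditionalGeneration

# torch_dtype=torch.float16 loads the weights in half precision.
model_fp16 = T5ForConditionalGeneration.from_pretrained(
    "AventIQ-AI/flan-t5-base-next-line-prediction",
    torch_dtype=torch.float16,
).to("cuda")
```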

## 📂 Repository Structure

```bash
.
├── model/             # Fine-tuned model files
├── tokenizer_config/  # Tokenizer configuration
├── quantized_model/   # FP16 quantized model
└── README.md          # Model documentation
```

## ⚠️ Limitations

- The model works best with **well-structured sentences**.
- It may struggle with **long-range dependencies**, since each training example contains only a single sentence of context.
- **Contextual consistency** is limited to sentence pairs; the model does not track document-level state.

## 📬 Contact & Contributions

For improvements or questions, feel free to open an issue or contribute to this repository!