Update README.md
README.md
CHANGED
@@ -1,12 +1,12 @@
|
|
1 |
-
#
|
2 |
|
3 |
-
This repository hosts a quantized version of the
|
4 |
|
5 |
## Model Details
|
6 |
|
7 |
-
- **Model Architecture:**
|
8 |
- **Task:** Text Summarization for Educational Books
|
9 |
-
- **Dataset:**
|
10 |
- **Quantization:** Float16
|
11 |
- **Fine-tuning Framework:** Hugging Face Transformers
|
12 |
|
@@ -18,65 +18,62 @@ This repository hosts a quantized version of the BERT model, fine-tuned for stoc
|
|
18 |
pip install transformers torch
|
19 |
```
|
20 |
|
21 |
-
|
22 |
### Loading the Model
|
23 |
|
24 |
```python
|
25 |
-
|
26 |
-
from transformers import BertForSequenceClassification, BertTokenizer
|
27 |
import torch
|
28 |
|
29 |
-
|
30 |
-
quantized_model_path = "AventIQ-AI/text-summarization-for-educational-books"
|
31 |
-
quantized_model = BertForSequenceClassification.from_pretrained(quantized_model_path)
|
32 |
-
quantized_model.eval() # Set to evaluation mode
|
33 |
-
quantized_model.half() # Convert model to FP16
|
34 |
-
|
35 |
-
# Load tokenizer
|
36 |
-
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
|
37 |
-
|
38 |
-
# Define a test sentence
|
39 |
-
test_sentence = "Photosynthesis is the process by which green plants and some other organisms use sunlight to synthesize foods with the help of chlorophyll pigments. The process primarily occurs in the chloroplasts of plant cells. During photosynthesis, plants take in carbon dioxide (CO₂) from the atmosphere and water (H₂O) from the soil. These are converted into glucose (C₆H₁₂O₆) and oxygen (O₂) under the influence of sunlight. The overall chemical reaction can be summarized as: 6CO₂ + 6H₂O + light energy → C₆H₁₂O₆ + 6O₂. This process is crucial not only because it provides food for the plant itself but also because it produces oxygen, which is essential for the survival of most living organisms on Earth. Additionally, it forms the basis of the food chain in almost all ecosystems."
|
40 |
-
|
41 |
-
# Tokenize input
|
42 |
-
inputs = tokenizer(test_sentence, return_tensors="pt", padding=True, truncation=True, max_length=128)
|
43 |
-
|
44 |
-
# Ensure input tensors are in correct dtype
|
45 |
-
inputs["input_ids"] = inputs["input_ids"].long() # Convert to long type
|
46 |
-
inputs["attention_mask"] = inputs["attention_mask"].long() # Convert to long type
|
47 |
-
|
48 |
-
# Make prediction
|
49 |
-
with torch.no_grad():
|
50 |
-
    outputs = quantized_model(**inputs)
|
51 |
|
52 |
-
|
53 |
-
|
54 |
-
|
55 |
|
56 |
|
57 |
-
|
58 |
|
59 |
-
|
60 |
-
|
61 |
|
|
|
|
|
62 |
```
|
63 |
|
64 |
-
|
|
|
|
|
65 |
|
66 |
-
|
67 |
|
68 |
## Fine-Tuning Details
|
69 |
|
70 |
### Dataset
|
71 |
|
72 |
-
The dataset
|
73 |
|
74 |
### Training
|
75 |
|
76 |
-
- Number of epochs: 3
|
77 |
-
- Batch size:
|
78 |
- Evaluation strategy: epoch
|
79 |
-
- Learning rate:
|
80 |
|
81 |
### Quantization
|
82 |
|
@@ -88,7 +85,7 @@ Post-training quantization was applied using PyTorch's built-in quantization fra
|
|
88 |
.
|
89 |
├── model/ # Contains the quantized model files
|
90 |
├── tokenizer_config/ # Tokenizer configuration and vocabulary files
|
91 |
-
├── model.
|
92 |
└── README.md # Model documentation
|
93 |
```
|
94 |
|
@@ -99,4 +96,4 @@ Post-training quantization was applied using PyTorch's built-in quantization fra
|
|
99 |
|
100 |
## Contributing
|
101 |
|
102 |
-
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
|
|
|
1 |
+
# Quantized T5 (Text-to-Text Transfer Transformer) Model for Summarizing Educational Books
|
2 |
|
3 |
+
This repository hosts a quantized version of the T5 model, fine-tuned for text summarization tasks. The model has been optimized for efficient deployment while maintaining high accuracy, making it suitable for resource-constrained environments.
|
4 |
|
5 |
## Model Details
|
6 |
|
7 |
+
- **Model Architecture:** T5
|
8 |
- **Task:** Text Summarization for Educational Books
|
9 |
+
- **Dataset:** Hugging Face's `cnn_dailymail`
|
10 |
- **Quantization:** Float16
|
11 |
- **Fine-tuning Framework:** Hugging Face Transformers
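
The Float16 entry above matters mostly for memory: each weight takes 2 bytes instead of the 4 used by float32. The back-of-the-envelope sketch below illustrates this. The figure of roughly 60 million parameters is an assumption (the commonly quoted size of T5-Small), not something this README states, and `weight_megabytes` is a hypothetical helper.

```python
# Rough weight-storage estimate for two numeric precisions.
# ASSUMPTION: ~60 million parameters (commonly quoted for T5-Small);
# the README itself does not give a parameter count.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_megabytes(num_params: int, dtype: str) -> float:
    """Return approximate weight storage in megabytes (10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


approx_params = 60_000_000  # assumed, approximate
fp32_mb = weight_megabytes(approx_params, "float32")
fp16_mb = weight_megabytes(approx_params, "float16")
print(f"float32: ~{fp32_mb:.0f} MB, float16: ~{fp16_mb:.0f} MB")
```

This counts weights only; activations, optimizer state, and framework overhead are ignored.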
|
12 |
|
|
|
18 |
pip install transformers torch
|
19 |
```
|
20 |
|
|
|
21 |
### Loading the Model
|
22 |
|
23 |
```python
|
24 |
+
from transformers import T5Tokenizer, T5ForConditionalGeneration
|
|
|
25 |
import torch
|
26 |
|
27 |
+
device = "cuda" if torch.cuda.is_available() else "cpu"
|
28 |
|
29 |
+
model_name = "AventIQ-AI/text-summarization-for-educational-books"
|
30 |
+
tokenizer = T5Tokenizer.from_pretrained(model_name)
|
31 |
+
model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
|
32 |
|
33 |
+
def test_summarization(model, tokenizer):
|
34 |
+
    user_text = input("\nEnter your text for summarization:\n")
|
35 |
+
    input_text = "summarize: " + user_text
|
36 |
+
    inputs = tokenizer(input_text, return_tensors="pt", truncation=True, max_length=512).to(device)
|
37 |
|
38 |
+
    output = model.generate(
|
39 |
+
        **inputs,
|
40 |
+
        max_new_tokens=100,
|
41 |
+
        num_beams=5,
|
42 |
+
        length_penalty=0.8,
|
43 |
+
        early_stopping=True
|
44 |
+
    )
|
45 |
|
46 |
+
    summary = tokenizer.decode(output[0], skip_special_tokens=True)
|
47 |
+
    return summary
|
48 |
|
49 |
+
print("\n**Model Summary:**")
|
50 |
+
print(test_summarization(model, tokenizer))
|
51 |
```
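
The loading snippet above truncates its input at 512 tokens, so a long chapter would lose everything past that point. One possible workaround, sketched here under stated assumptions, is to split the text into chunks and summarize each chunk separately. The helpers `chunk_words` and `build_prompts` are hypothetical, and the 350-word budget is only a rough stand-in for the token limit, since whitespace-separated words do not map one-to-one to tokens.

```python
from typing import List


def chunk_words(text: str, max_words: int = 350) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def build_prompts(text: str, max_words: int = 350) -> List[str]:
    """Prefix each chunk with the 'summarize: ' task prefix used above."""
    return ["summarize: " + chunk for chunk in chunk_words(text, max_words)]


# Toy input: 800 words yields chunks of 350, 350, and 100 words.
sample = "word " * 800
prompts = build_prompts(sample)
print(len(prompts))
```

Each prompt could then be tokenized and passed to `model.generate` in place of `input_text`; how the per-chunk summaries are merged is left open.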
|
52 |
|
53 |
+
# ROUGE Evaluation Results
|
54 |
+
|
55 |
+
After fine-tuning the **T5-Small** model for text summarization, we obtained the following **ROUGE** scores:
|
56 |
|
57 |
+
| **Metric** | **Score** | **Meaning** |
|
58 |
+
|-------------|-----------|-------------|
|
59 |
+
| **ROUGE-1** | **0.3061** (~30%) | Measures overlap of **unigrams (single words)** between the reference and generated summary. |
|
60 |
+
| **ROUGE-2** | **0.1241** (~12%) | Measures overlap of **bigrams (two-word phrases)**, indicating coherence and fluency. |
|
61 |
+
| **ROUGE-L** | **0.2233** (~22%) | Measures the **longest common subsequence** of words shared by the reference and generated summary, reflecting how well word order is preserved. |
|
62 |
+
| **ROUGE-Lsum** | **0.2620** (~26%) | A ROUGE-L variant computed sentence by sentence and aggregated over the whole summary. |
|
63 |
+
|
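
To make the metrics in the table concrete, here is a simplified sketch of ROUGE-N and ROUGE-L as F1 scores. It assumes lowercased whitespace tokenization with no stemming, so it will not reproduce the scores reported above, and it is not claimed to be the evaluation code used for this model. All function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, ref_total, cand_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or ref_total == 0 or cand_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference, candidate, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # Counter & keeps minimum counts
    return f1(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, via dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence of words."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    return f1(lcs_length(ref, cand), len(ref), len(cand))


reference = "plants convert sunlight into glucose and oxygen"
candidate = "plants turn sunlight into glucose"
print(round(rouge_n(reference, candidate, 1), 4))  # 0.6667
print(round(rouge_n(reference, candidate, 2), 4))  # 0.4
print(round(rouge_l(reference, candidate), 4))     # 0.6667
```

In the toy pair, four of the candidate's five words appear in the seven-word reference, giving a ROUGE-1 F1 of 2·4/(5+7) ≈ 0.67, while only two bigrams match, which is why ROUGE-2 is lower.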
64 |
|
65 |
## Fine-Tuning Details
|
66 |
|
67 |
### Dataset
|
68 |
|
69 |
+
The `cnn_dailymail` dataset from Hugging Face was used; it pairs news articles with reference summaries.
|
70 |
|
71 |
### Training
|
72 |
|
73 |
+
- Number of epochs: 3
|
74 |
+
- Batch size: 4
|
75 |
- Evaluation strategy: epoch
|
76 |
+
- Learning rate: 3e-5
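
The hyperparameters above determine how many optimizer updates a run performs. The sketch below works through that arithmetic, assuming a single device and no gradient accumulation. The example count of 1,000 is purely hypothetical, since this README does not say how many examples were used for training.

```python
import math

# Values taken from the list above.
HYPERPARAMS = {
    "epochs": 3,
    "batch_size": 4,
    "learning_rate": 3e-5,
    "evaluation_strategy": "epoch",
}


def total_optimizer_steps(num_examples, batch_size=4, epochs=3):
    """Steps = epochs * ceil(examples / batch size).

    Assumes one device and no gradient accumulation.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


hypothetical_examples = 1_000  # illustrative only, not from the README
steps = total_optimizer_steps(
    hypothetical_examples,
    batch_size=HYPERPARAMS["batch_size"],
    epochs=HYPERPARAMS["epochs"],
)
print(steps)  # 750
```

With evaluation once per epoch, such a run would evaluate three times, once after every 250 steps in this hypothetical case.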
|
77 |
|
78 |
### Quantization
|
79 |
|
|
|
85 |
.
|
86 |
├── model/ # Contains the quantized model files
|
87 |
├── tokenizer_config/ # Tokenizer configuration and vocabulary files
|
88 |
+
├── model.safetensors/ # Quantized Model
|
89 |
└── README.md # Model documentation
|
90 |
```
|
91 |
|
|
|
96 |
|
97 |
## Contributing
|
98 |
|
99 |
+
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
|