Update README.md
README.md
@@ -6,4 +6,52 @@ language:
 - en
 base_model:
 - google/flan-t5-small
-
+tags:
+- summarization
+- research-papers
+- arxiv
+- t5
+---
+
+# Fine-Tuned Summarization Model (`fine-tuned-summarization-arxiv`)
+This model is [`google/flan-t5-small`](https://huggingface.co/google/flan-t5-small) fine-tuned on the arXiv portion of the `armanc/scientific_papers` dataset. It is optimized for **summarizing scientific abstracts**.
+
+## Model Details
+- **Base Model:** `google/flan-t5-small`
+- **Training Data:** arXiv research papers (`article` → `abstract`)
+- **Fine-Tuned Task:** Text summarization
+- **Use Case:** Generate short summaries of long research papers
+- **License:** Apache 2.0
+
+## How to Use
+```python
+from transformers import T5ForConditionalGeneration, T5Tokenizer
+
+model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
+tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")
+
+text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
+inputs = tokenizer(text, return_tensors="pt", truncation=True)  # clip inputs longer than the tokenizer's limit
+summary_ids = model.generate(**inputs, max_new_tokens=64)  # 64 is an illustrative cap on summary length
+summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
+
+print("Generated Summary:", summary)
+```
+
+## Training Details
+- **Training Data:** 100k+ arXiv research papers
+- **Training Framework:** Hugging Face Transformers
+- **Hyperparameters:**
+  - Learning Rate: `5e-5`
+  - Batch Size: `8`
+  - Epochs: `10`
+- **Hardware Used:** TPU & GPU
+
+## Limitations
+- ❌ May struggle with **very technical** papers (e.g., complex math formulas).
+
+## Example Summaries
+| **Original Abstract** | **Generated Summary** |
+|----------------------|----------------------|
+| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
+| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |
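
For a sense of scale, the hyperparameters listed under Training Details in the added README imply roughly the number of optimizer steps computed below. This is a back-of-the-envelope sketch, not something taken from the commit: it assumes exactly 100,000 training examples (the README says only "100k+"), a single device, and no gradient accumulation. `optimizer_steps` is a hypothetical helper name.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Total optimizer steps, assuming one device and no gradient accumulation."""
    # A final partial batch still counts as one step.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Batch size 8 and 10 epochs come from the README.
# 100_000 is an assumed lower bound for "100k+" papers.
total = optimizer_steps(num_examples=100_000, batch_size=8, epochs=10)
print(total)  # 125000
```

If the actual run used several devices or gradient accumulation, the effective batch size would be larger and the step count correspondingly smaller.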