---
license: apache-2.0
datasets:
- armanc/scientific_papers
language:
- en
base_model:
- google/flan-t5-small
tags:
  - summarization
  - research-papers
  - arxiv
  - t5
---

# arxiv-summarization
This model is a fine-tuned version of [`google/flan-t5-small`](https://huggingface.co/google/flan-t5-small), trained on the arXiv portion of the `armanc/scientific_papers` dataset. It is optimized for **summarizing scientific text**.

## Model Details
- **Base Model:** `google/flan-t5-small`
- **Training Data:** arXiv research papers (`article` → `abstract`)
- **Fine-Tuned Task:** Text Summarization
- **Use Case:** Generate shorter summaries of long research papers
- **License:** Apache 2.0
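
The `article` → `abstract` mapping listed above can be pictured as a small preprocessing step. The following is only a sketch: the helper name `to_pair`, the output keys `source` and `target`, and the whitespace normalization are hypothetical, and reusing the `Summarize: ` prefix from the usage example for training data is an assumption, not something this card confirms.

```python
# Hypothetical preprocessing sketch: turn one dataset record into a
# (source, target) text pair for sequence-to-sequence fine-tuning.
# Assumption: the "Summarize: " prefix mirrors the usage example in this
# card; whether it was applied during training is not stated.
PREFIX = "Summarize: "


def to_pair(record: dict) -> dict:
    """Map a record with `article` and `abstract` fields to model text."""
    # Collapse runs of whitespace and newlines into single spaces.
    source = PREFIX + " ".join(record["article"].split())
    target = " ".join(record["abstract"].split())
    return {"source": source, "target": target}


example = {
    "article": "Deep learning  is used\nin cancer detection.",
    "abstract": " A CNN detects cancer. ",
}
pair = to_pair(example)
print(pair["source"])
print(pair["target"])
```

The article body becomes the model input and the paper's own abstract becomes the target summary.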

## How to Use
```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# T5Tokenizer requires the `sentencepiece` package to be installed.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
# Truncate long inputs to the tokenizer's maximum length.
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# Without an explicit limit, generate() may fall back to the library's short default length.
summary_ids = model.generate(**inputs, max_new_tokens=64)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)
```
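
Full papers are usually far longer than the single sentence used above, so one possible approach is to split the text into pieces and summarize each piece separately. The sketch below shows only the splitting step in plain Python. The helper `chunk_words` and its 300-word budget are hypothetical placeholders, not values taken from this model's configuration, and a word count is only a rough stand-in for a token count.

```python
# Hypothetical helper: split a long text into word-bounded chunks.
# Assumption: 300 words per chunk is an arbitrary illustrative budget.
def chunk_words(text: str, max_words: int = 300) -> list:
    """Return consecutive chunks of at most `max_words` words each."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


chunks = chunk_words("one two three four five", max_words=2)
print(chunks)
```

Each chunk could then be prefixed with `Summarize: ` and passed through the tokenizer and `model.generate` as in the snippet above. Whether summaries of separate chunks combine into a good overall summary is untested here.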

## Training Details
- **Training Data:** 100k+ arXiv research papers
- **Training Framework:** Hugging Face Transformers
- **Hyperparameters:**
  - Learning Rate: `5e-5`
  - Batch Size: `8`
  - Epochs: `10`
- **Hardware Used:** TPU & GPU
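
To get a feel for what these hyperparameters imply, the number of optimizer steps can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: exactly 100,000 examples (the card says "100k+", so this is a lower bound), a batch size of 8 applied as the global batch on a single device, and no gradient accumulation. The function `total_steps` is a hypothetical helper, not part of any library.

```python
import math


# Hypothetical helper: estimate optimizer steps for a training run.
# Assumptions: the batch size is the global batch size, and the last
# partial batch of each epoch still counts as one step.
def total_steps(num_examples: int, batch_size: int, epochs: int,
                grad_accum: int = 1) -> int:
    steps_per_epoch = math.ceil(num_examples / (batch_size * grad_accum))
    return steps_per_epoch * epochs


# 100,000 examples / 8 per batch = 12,500 steps per epoch, times 10 epochs.
estimated = total_steps(num_examples=100_000, batch_size=8, epochs=10)
print(estimated)  # 125000
```

The real step count depends on the exact dataset size and on how the batch was distributed across the TPU and GPU hardware, neither of which is specified here.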

## Limitations
- ❌ May struggle with **very technical** papers (e.g., complex math formulas).

## Example Summaries
| **Original Abstract** | **Generated Summary** |
|----------------------|----------------------|
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |