---
license: apache-2.0
library_name: timm
pipeline_tag: image-classification
tags:
  - image-classification
  - ai-detection
  - vit
datasets:
  - your-username/ai-generated-vs-real
metrics:
  - accuracy
  - f1
---

# AI Source Detector (ViT-Base)

Classifies an image as real or AI-generated and, when it is AI-generated, identifies the generator. The model predicts one of **five** classes  
(`stable_diffusion`, `midjourney`, `dalle`, `real`, `other_ai`).

## Model Details
* **Architecture:** ViT-Base, 16 × 16 patches, 224 × 224 input  
* **Parameters:** 86 M  
* **Fine-tuning epochs:** 10  
* **Optimizer:** AdamW (lr = 3e-5, wd = 0.01)  
* **Hardware:** 1× NVIDIA RTX 4090 (24 GB)

## Training Data
| Class | Images |
|-------|-------:|
| Stable Diffusion | 12 000 |
| Midjourney | 10 500 |
| DALL-E 3 | 9 400 |
| Real | 11 800 |
| Other AI | 8 200 |

Total: 51 900 images (≈ 52 k), split 80 % train / 10 % validation / 10 % test.
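
The split arithmetic can be checked with a short sketch. The per-class counts come from the table above; applying the 80/10/10 split separately to each class (a stratified split) is an assumption, since the card does not say how the split was drawn, and `split_sizes` is a hypothetical helper.

```python
# Per-class image counts copied from the Training Data table.
CLASS_COUNTS = {
    "stable_diffusion": 12_000,
    "midjourney": 10_500,
    "dalle": 9_400,
    "real": 11_800,
    "other_ai": 8_200,
}


def split_sizes(n, train_pct=80, val_pct=10):
    """Return (train, val, test) sizes for n images; test takes the remainder."""
    train = n * train_pct // 100
    val = n * val_pct // 100
    return train, val, n - train - val


# ASSUMPTION: the split is stratified, i.e. applied to each class separately.
per_class = {name: split_sizes(n) for name, n in CLASS_COUNTS.items()}
totals = tuple(sum(sizes[i] for sizes in per_class.values()) for i in range(3))

print("total images:", sum(CLASS_COUNTS.values()))
print("train / val / test:", totals)
```

Under that assumption each of the validation and test sets holds about 5.2 k images.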

## Evaluation
| Split | Top-1 accuracy | Macro F1 |
|--------|------:|---------:|
| Validation | 92.8 % | 0.928 |
| Test | 91.6 % | 0.914 |

<details>
<summary>Confusion Matrix (click to open)</summary>
<img src="confusion_matrix.png" width="480">
</details>
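
For readers who want to recompute the two reported metrics from a confusion matrix, here is a minimal sketch in plain Python. It assumes rows are true classes and columns are predicted classes; the 2 × 2 matrix is made-up toy data, not this model's results, and the function names are illustrative.

```python
def top1_accuracy(confusion):
    """Fraction of samples on the diagonal (true class == predicted class)."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def macro_f1(confusion):
    """Unweighted mean of per-class F1 scores.

    confusion[i][j] = number of samples with true class i predicted as class j.
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # true k, predicted as another class
        fp = sum(confusion[i][k] for i in range(n)) - tp  # another class, predicted as k
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Toy matrix for illustration only (NOT the model's confusion matrix).
toy = [
    [8, 2],
    [1, 9],
]
print(f"accuracy = {top1_accuracy(toy):.3f}, macro F1 = {macro_f1(toy):.3f}")
```

Macro F1 weights every class equally, so it can fall below accuracy when a smaller class is harder to recognise.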

## Usage
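```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
classifier = pipeline(
    task="image-classification",
    model="yaya36095/ai-source-detector",
    top_k=1,  # return only the highest-scoring class
)

# The pipeline accepts a local path, a URL, or a PIL image.
result = classifier("demo.jpg")
print(result)
# Example output:
# → [{'label': 'stable_diffusion', 'score': 0.97}]
```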