---
license: apache-2.0
language:
- en
base_model:
- google/siglip2-base-patch16-256
pipeline_tag: image-classification
library_name: transformers
tags:
- siglip2
- '256'
- patch16
- adult-content-detection
- explicit-content-detection
---

# **siglip2-x256-explicit-content**
> **siglip2-x256-explicit-content** is a vision-language encoder model fine-tuned from **siglip2-base-patch16-256** for **multi-class image classification**. Built on the **SiglipForImageClassification** architecture, the model is trained to identify and categorize content types in images, especially for **explicit, suggestive, or safe media filtering**.
> [!NOTE]
> *SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features*: https://arxiv.org/pdf/2502.14786
```
Classification Report:
                      precision    recall  f1-score   support

       Anime Picture     0.8940    0.8718    0.8827      5600
              Hentai     0.8961    0.8935    0.8948      4180
              Normal     0.9100    0.8895    0.8997      5503
         Pornography     0.9496    0.9654    0.9574      5600
 Enticing or Sensual     0.9132    0.9429    0.9278      5600

            accuracy                         0.9137     26483
           macro avg     0.9126    0.9126    0.9125     26483
        weighted avg     0.9135    0.9137    0.9135     26483
```
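The macro averages in the report are the unweighted means of the per-class scores. A quick sanity check against the f1 column of the table:

```python
# Per-class f1-scores from the report, in class order
# (Anime Picture, Hentai, Normal, Pornography, Enticing or Sensual).
f1_scores = [0.8827, 0.8948, 0.8997, 0.9574, 0.9278]

# Macro f1 = unweighted mean of the per-class f1-scores.
macro_f1 = sum(f1_scores) / len(f1_scores)
print(round(macro_f1, 4))  # -> 0.9125, matching the "macro avg" row
```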

---
## **Label Space: 5 Classes**
The model classifies each image into one of the following content categories:
```
Class 0: "Anime Picture"
Class 1: "Hentai"
Class 2: "Normal"
Class 3: "Pornography"
Class 4: "Enticing or Sensual"
```
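The index order matters when working with raw logits or probabilities outside the demo below. A minimal sketch of mapping model output back to a class name; the helper name and the example probabilities are illustrative, not part of the model:

```python
# Label space from the card, as a plain index -> name mapping.
ID2LABEL = {
    0: "Anime Picture",
    1: "Hentai",
    2: "Normal",
    3: "Pornography",
    4: "Enticing or Sensual",
}

def top_label(probs):
    """Return (name, probability) of the highest-scoring class.

    `probs` is a list of 5 probabilities in the class-index order above.
    """
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]

# Made-up probabilities for illustration; index 2 ("Normal") dominates.
print(top_label([0.02, 0.01, 0.90, 0.03, 0.04]))  # -> ('Normal', 0.9)
```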
---
## **Install Dependencies**
```bash
pip install -q transformers torch pillow gradio
```
---
## **Inference Code**
```python
import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/siglip2-x256-explicit-content"  # Replace with your model path if needed
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

# ID-to-label mapping
id2label = {
    "0": "Anime Picture",
    "1": "Hentai",
    "2": "Normal",
    "3": "Pornography",
    "4": "Enticing or Sensual"
}

def classify_explicit_content(image):
    """Classify an input image into one of the five content categories."""
    image = Image.fromarray(image).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()

    prediction = {
        id2label[str(i)]: round(probs[i], 3) for i in range(len(probs))
    }
    return prediction

# Gradio interface
iface = gr.Interface(
    fn=classify_explicit_content,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(num_top_classes=5, label="Predicted Content Type"),
    title="siglip2-x256-explicit-content",
    description="Classifies images into explicit, suggestive, or safe categories (e.g., Hentai, Pornography, Normal)."
)

if __name__ == "__main__":
    iface.launch()
```
---
## **Intended Use**
This model is intended for applications such as:
- **Content Moderation**: Automatically detect NSFW or suggestive content.
- **Parental Controls**: Enable AI-based filtering for safe media browsing.
- **Dataset Preprocessing**: Clean and categorize image datasets for research or deployment.
- **Online Platforms**: Help enforce content guidelines for uploads and user-generated media. |
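For the content-moderation use case, the label-to-probability dict returned by the model can feed a simple policy gate. A hedged sketch: the label groupings, thresholds, and function name below are illustrative choices for a downstream application, not part of the model itself.

```python
# Illustrative policy: which labels count as unsafe vs. suggestive,
# and the probability mass required to trigger each action.
UNSAFE_LABELS = {"Hentai", "Pornography"}
SUGGESTIVE_LABELS = {"Enticing or Sensual"}

def moderation_decision(prediction, unsafe_threshold=0.5, suggestive_threshold=0.5):
    """Return 'block', 'review', or 'allow' from a label -> probability dict."""
    if sum(prediction.get(label, 0.0) for label in UNSAFE_LABELS) >= unsafe_threshold:
        return "block"
    if sum(prediction.get(label, 0.0) for label in SUGGESTIVE_LABELS) >= suggestive_threshold:
        return "review"
    return "allow"

# Example with made-up probabilities in the shape classify_explicit_content returns.
print(moderation_decision({
    "Anime Picture": 0.02, "Hentai": 0.05, "Normal": 0.90,
    "Pornography": 0.02, "Enticing or Sensual": 0.01,
}))  # -> allow
```

Tuning the thresholds trades precision for recall; a parental-control deployment might lower `unsafe_threshold`, while a dataset-cleaning pass might only act on high-confidence predictions.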