πŸ” SigLIP Person Search - Open Set

This model is a fine-tuned version of google/siglip-base-patch16-224 for open-set person retrieval from natural-language descriptions. It supports image-text similarity search in real-world retail and surveillance scenarios.

🧠 Use Case

This model allows you to search for people in crowded environments (like malls or stores) using only a text prompt, for example:

"A man wearing a white t-shirt and carrying a brown shoulder bag"

Given a gallery of person crops, the model ranks them by similarity to the description and returns the best matches.

πŸ’Ύ Training

  • Base: google/siglip-base-patch16-224
  • Loss: Cosine InfoNCE
  • Data: ReID dataset with multimodal attributes (generated via Gemini)
  • Epochs: 10
  • Usage: Retrieval-style search (not classification)

πŸ“ˆ Intended Use

  • Smart surveillance
  • Anonymous retail behavior tracking
  • Human-in-the-loop retrieval
  • Visual search & retrieval systems

πŸ”§ How to Use

from transformers import AutoProcessor, AutoModel
import torch

# Load the fine-tuned SigLIP checkpoint and its processor
processor = AutoProcessor.from_pretrained("adonaivera/siglip-person-search-openset")
model = AutoModel.from_pretrained("adonaivera/siglip-person-search-openset")

# Encode the text query into an embedding for similarity search
text = "A man wearing a white t-shirt and carrying a brown shoulder bag"
inputs = processor(text=text, return_tensors="pt")
with torch.no_grad():
    text_features = model.get_text_features(**inputs)
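The snippet above covers only the text side. To complete the retrieval loop you also need image embeddings for the candidate person crops and a cosine-similarity ranking. The helper below is a sketch assuming crops come from an external person detector; `rank_crops` and its signature are illustrative, not part of the model's API:

```python
import torch
import torch.nn.functional as F

def rank_crops(model, processor, text, crops):
    """Return (crop, score) pairs sorted by cosine similarity to the query.

    `crops` is assumed to be a list of PIL.Image person crops produced
    by a separate detector (this model does not localize people itself).
    """
    with torch.no_grad():
        text_inputs = processor(text=text, return_tensors="pt")
        image_inputs = processor(images=crops, return_tensors="pt")
        # Normalize so the dot product is a cosine similarity in [-1, 1]
        t = F.normalize(model.get_text_features(**text_inputs), dim=-1)
        v = F.normalize(model.get_image_features(**image_inputs), dim=-1)
    scores = (v @ t.t()).squeeze(-1)          # one score per crop
    order = scores.argsort(descending=True)   # best match first
    return [(crops[i], scores[i].item()) for i in order.tolist()]
```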

πŸ“Œ Notes

  • This model is optimized for feature extraction and cosine similarity matching
  • It's not meant for classification or image generation
  • Similarity threshold tuning is required depending on your application
Model size: 203M parameters Β· Tensor type: F32 Β· Format: Safetensors