# DeepFashion2 Dataset Integration

This document describes the integration of the DeepFashion2 dataset with the Vestiq fashion analysis system.

## Overview

DeepFashion2 is a comprehensive fashion dataset that provides:

- 491K diverse images of 13 popular clothing categories
- Bounding box annotations for fashion items
- Dense landmark annotations (for clothing pose estimation) and per-pixel masks
- Commercial-consumer clothes correspondence
- Scale, occlusion, zoom-in, and viewpoint labels

## Integration Features

### 1. Dataset Loading and Processing

- **DeepFashion2Dataset**: PyTorch dataset class for loading images and annotations
- **Category Mapping**: Maps DeepFashion2 categories to yainage90 model categories
- **Data Transforms**: Standard preprocessing for fashion images
- **Batch Processing**: Efficient DataLoader implementation

### 2. Evaluation Framework

- **Detection Accuracy**: Evaluate fashion object detection performance
- **Feature Quality**: Assess feature extraction capabilities
- **Classification Metrics**: Precision, recall, F1-score, confusion matrix
- **Visualization**: Confusion matrix plots and performance charts

### 3. API Endpoints

- `/deepfashion2/status` - Check integration status and dataset availability
- `/deepfashion2/statistics` - Get dataset statistics and category distribution
- `/deepfashion2/evaluate` - Run evaluation using DeepFashion2 as benchmark
- `/deepfashion2/setup-instructions` - Get setup instructions for the dataset

## Category Mapping

DeepFashion2's 13 fine-grained categories are mapped to yainage90's 7 categories. Only four of the target categories are used (top, outer, bottom, dress):

| DeepFashion2 Category | yainage90 Category |
|-----------------------|--------------------|
| short_sleeved_shirt   | top                |
| long_sleeved_shirt    | top                |
| short_sleeved_outwear | outer              |
| long_sleeved_outwear  | outer              |
| vest                  | top                |
| sling                 | top                |
| shorts                | bottom             |
| trousers              | bottom             |
| skirt                 | bottom             |
| short_sleeved_dress   | dress              |
| long_sleeved_dress    | dress              |
| vest_dress            | dress              |
| sling_dress           | dress              |

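
The mapping in the table can be expressed as a plain lookup. The following is a minimal sketch only: the names `DEEPFASHION2_TO_YAINAGE90` and `map_category` are hypothetical and are not taken from `deepfashion2_utils`, which may structure the mapping differently.

```python
from typing import Optional

# Hypothetical constant mirroring the mapping table in this document.
DEEPFASHION2_TO_YAINAGE90 = {
    "short_sleeved_shirt": "top",
    "long_sleeved_shirt": "top",
    "short_sleeved_outwear": "outer",
    "long_sleeved_outwear": "outer",
    "vest": "top",
    "sling": "top",
    "shorts": "bottom",
    "trousers": "bottom",
    "skirt": "bottom",
    "short_sleeved_dress": "dress",
    "long_sleeved_dress": "dress",
    "vest_dress": "dress",
    "sling_dress": "dress",
}


def map_category(deepfashion2_name: str) -> Optional[str]:
    """Return the mapped category, or None for an unknown name."""
    return DEEPFASHION2_TO_YAINAGE90.get(deepfashion2_name)
```

Returning `None` for unknown names (rather than raising) is a design choice of this sketch; it lets a caller skip annotations that have no counterpart in the target label set.
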
## Setup Instructions

### 1. Download the Dataset

The DeepFashion2 dataset requires manual download due to licensing requirements:

1. Visit the official repository: https://github.com/switchablenorms/DeepFashion2
2. Follow the dataset download instructions
3. Register and download the dataset files

### 2. Dataset Structure

Extract the dataset to `./data/deepfashion2/` with the following structure:

```
deepfashion2/
├── train/
│   ├── image/     # Training images
│   └── annos/     # Training annotations (JSON)
├── validation/
│   ├── image/     # Validation images
│   └── annos/     # Validation annotations (JSON)
└── test/
    ├── image/     # Test images
    └── annos/     # Test annotations (JSON)
```

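
Before starting the server, the layout above can be checked with a few lines of Python. This is a sketch under the assumption that only the six directories shown matter; `find_missing_dirs` is a hypothetical helper and is not part of the integration code.

```python
from pathlib import Path

# Splits and subdirectories taken from the directory tree in this document.
EXPECTED_SPLITS = ("train", "validation", "test")
EXPECTED_SUBDIRS = ("image", "annos")


def find_missing_dirs(root):
    """Return the expected '<split>/<subdir>' paths that do not exist under root."""
    root = Path(root)
    missing = []
    for split in EXPECTED_SPLITS:
        for sub in EXPECTED_SUBDIRS:
            if not (root / split / sub).is_dir():
                missing.append(f"{split}/{sub}")
    return missing
```

Calling `find_missing_dirs("./data/deepfashion2")` returns an empty list when every expected directory is present, and otherwise lists the paths that still need to be extracted.
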
### 3. Install Dependencies

Install additional dependencies for evaluation:

```bash
pip install scikit-learn matplotlib seaborn
```

### 4. Verify Setup

Check the integration status:

```bash
curl http://localhost:7861/deepfashion2/status
```

## Usage Examples

### 1. Basic Dataset Loading

```python
from deepfashion2_utils import DeepFashion2Config, DeepFashion2Dataset

config = DeepFashion2Config()
dataset = DeepFashion2Dataset(
    root_dir=config.dataset_root,
    split='validation',
    load_annotations=True
)

# Get a sample
sample = dataset[0]
print(f"Image: {sample['image_path']}")
print(f"Categories: {dataset.get_categories_in_image(sample['annotations'])}")
```

### 2. Running Evaluation

```python
from deepfashion2_evaluation import run_full_evaluation
from fast import analyzer

# Run evaluation with 100 samples
report_path = run_full_evaluation(analyzer, max_samples=100)
print(f"Evaluation report saved to: {report_path}")
```

### 3. API Usage

```bash
# Check status
curl -X GET "http://localhost:7861/deepfashion2/status"

# Get dataset statistics
curl -X GET "http://localhost:7861/deepfashion2/statistics"

# Run evaluation
curl -X POST "http://localhost:7861/deepfashion2/evaluate?max_samples=50"

# Get setup instructions
curl -X GET "http://localhost:7861/deepfashion2/setup-instructions"
```

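
The same requests can be issued from Python using only the standard library. This sketch assumes the server listens on `localhost:7861` as in the curl examples and that the endpoints return JSON, which this document does not state. `build_url` and `call` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:7861"  # host and port from the curl examples


def build_url(path, **params):
    """Join the base URL, an endpoint path, and optional query parameters."""
    query = f"?{urlencode(params)}" if params else ""
    return f"{BASE_URL}{path}{query}"


def call(path, method="GET", **params):
    """Send the request and decode the body, assuming a JSON response."""
    request = Request(build_url(path, **params), method=method)
    with urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


# Example calls (these need the server to be running):
#   call("/deepfashion2/status")
#   call("/deepfashion2/evaluate", method="POST", max_samples=50)
```
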
## Evaluation Metrics

### Detection Accuracy

- **Category-level accuracy**: How well the model detects clothing categories
- **Detection score**: IoU-like metric for category overlap
- **Confusion matrix**: Detailed breakdown of predictions vs ground truth

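
One plausible reading of an "IoU-like metric for category overlap" is the Jaccard index between the set of predicted categories and the set of ground-truth categories for an image. The sketch below illustrates that reading only; the evaluation module may compute its detection score differently, and `category_iou` is a hypothetical name.

```python
def category_iou(predicted, ground_truth):
    """Jaccard index between predicted and ground-truth category sets."""
    pred, truth = set(predicted), set(ground_truth)
    union = pred | truth
    if not union:
        # Both sets empty: treated as perfect agreement (a convention of this sketch).
        return 1.0
    return len(pred & truth) / len(union)
```

For example, predicting `top` and `bottom` when the ground truth is `top` and `dress` shares one category out of three distinct ones, giving a score of 1/3.
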
### Feature Quality

- **Feature dimension**: Dimensionality of extracted features
- **Intra-category similarity**: How similar features are within the same category
- **Inter-category distance**: How well features separate different categories
- **Feature separability**: Overall quality metric for feature discrimination

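
To make these four quantities concrete, here is a small pure-Python sketch. It assumes cosine similarity as the similarity measure, one minus cosine similarity as the distance, and defines separability as mean intra-category similarity minus mean inter-category similarity. None of these definitions are confirmed by this document, and `feature_quality` is a hypothetical name.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def feature_quality(features, labels):
    """Summarize how well feature vectors cluster by category label."""
    intra, inter = [], []
    # Compare every pair once and sort it into same-label or different-label.
    for (fa, la), (fb, lb) in combinations(zip(features, labels), 2):
        similarity = cosine_similarity(fa, fb)
        (intra if la == lb else inter).append(similarity)
    intra_sim = sum(intra) / len(intra) if intra else 0.0
    inter_sim = sum(inter) / len(inter) if inter else 0.0
    return {
        "feature_dimension": len(features[0]) if features else 0,
        "intra_category_similarity": intra_sim,
        "inter_category_distance": 1.0 - inter_sim,
        "feature_separability": intra_sim - inter_sim,
    }
```

The pairwise loop is quadratic in the number of samples, so it is only practical for the small subsets typically used during development.
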
## Configuration Options

### DeepFashion2Config

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class DeepFashion2Config:
    dataset_root: str = "./data/deepfashion2"
    categories: List[str] = None  # Auto-populated with 13 categories
    image_size: Tuple[int, int] = (224, 224)
    batch_size: int = 32
    num_workers: int = 4
```

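
The comment on `categories` says the field is auto-populated, but the mechanism is not shown here. A common way to do this with dataclasses is a `__post_init__` hook, sketched below. The class name `DeepFashion2ConfigSketch` and the constant `DEFAULT_CATEGORIES` are hypothetical, and the real `DeepFashion2Config` may populate the field differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# The 13 category names listed in the mapping table of this document.
DEFAULT_CATEGORIES = [
    "short_sleeved_shirt", "long_sleeved_shirt",
    "short_sleeved_outwear", "long_sleeved_outwear",
    "vest", "sling", "shorts", "trousers", "skirt",
    "short_sleeved_dress", "long_sleeved_dress",
    "vest_dress", "sling_dress",
]


@dataclass
class DeepFashion2ConfigSketch:
    dataset_root: str = "./data/deepfashion2"
    categories: Optional[List[str]] = None
    image_size: Tuple[int, int] = (224, 224)
    batch_size: int = 32
    num_workers: int = 4

    def __post_init__(self):
        # Fill in the default list only when the caller did not supply one.
        # Copying avoids sharing one mutable list between instances.
        if self.categories is None:
            self.categories = list(DEFAULT_CATEGORIES)
```

Using `None` as the default and filling the list afterwards sidesteps the dataclass restriction on mutable default values.
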
### Customization

You can customize the configuration for your specific needs:

```python
config = DeepFashion2Config(
    dataset_root="/path/to/your/deepfashion2",
    image_size=(256, 256),
    batch_size=16
)
```

## Performance Considerations

### Memory Usage

- The dataset is large (~15GB); ensure sufficient disk space
- Choose a batch size that fits the available GPU memory
- Increase `num_workers` to load data in parallel

### CPU Optimization

- The system automatically detects whether it is running on CPU or GPU and optimizes accordingly
- CPU inference uses float32 precision and a limited number of threads
- GPU inference uses float16 precision for better performance

### Evaluation Speed

- Limit `max_samples` for faster evaluation during development
- Full evaluation on the entire validation set may take significant time
- Run evaluations on a subset when you need quick feedback

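
This document does not say how `max_samples` selects its subset; it may simply take the first N samples. If a more representative subset is wanted, one option is a seeded random sample of dataset indices, sketched below with the hypothetical helper `select_eval_indices`.

```python
import random


def select_eval_indices(dataset_size, max_samples=None, seed=0):
    """Pick a reproducible subset of indices for a quick evaluation run."""
    indices = list(range(dataset_size))
    if max_samples is None or max_samples >= dataset_size:
        return indices  # evaluate everything
    # A fixed seed keeps the subset identical across runs, so scores stay comparable.
    rng = random.Random(seed)
    return sorted(rng.sample(indices, max_samples))
```

The returned indices could then be used to index the dataset directly or to build a subset for a DataLoader.
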
## Troubleshooting

### Common Issues

1. **Dataset not found**: Ensure the dataset is extracted to the correct path
2. **Permission errors**: Check file permissions for the dataset directory
3. **Memory errors**: Reduce batch size or number of workers
4. **Import errors**: Install missing dependencies (scikit-learn, matplotlib, seaborn)

### Debug Mode

Enable debug logging to troubleshoot issues:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## Future Enhancements

### Planned Features

- **Training Pipeline**: Fine-tune models on DeepFashion2 data
- **Advanced Metrics**: Add more sophisticated evaluation metrics
- **Visualization Tools**: Enhanced plotting and analysis tools
- **Benchmark Comparisons**: Compare against other fashion datasets

### Contributing

To contribute to the DeepFashion2 integration:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Submit a pull request

## References

- [DeepFashion2 Paper](https://arxiv.org/abs/1901.07973)
- [DeepFashion2 Repository](https://github.com/switchablenorms/DeepFashion2)
- [yainage90 Models](https://huggingface.co/yainage90)

## License

This integration follows the same license as the main Vestiq project. The DeepFashion2 dataset has its own licensing terms that must be respected.