# DeepFashion2 Dataset Integration

This document describes the integration of the DeepFashion2 dataset with the Vestiq fashion analysis system.

## Overview

DeepFashion2 is a comprehensive fashion dataset that provides:
- 491K diverse images of 13 popular clothing categories
- Bounding box annotations for fashion items
- Dense landmarks and per-pixel masks for each clothing item
- Commercial-consumer clothes correspondence
- Scale, occlusion, zoom-in, and viewpoint labels

## Integration Features

### 1. Dataset Loading and Processing
- **DeepFashion2Dataset**: PyTorch dataset class for loading images and annotations
- **Category Mapping**: Maps DeepFashion2 categories to yainage90 model categories
- **Data Transforms**: Standard preprocessing for fashion images
- **Batch Processing**: Efficient DataLoader implementation

### 2. Evaluation Framework
- **Detection Accuracy**: Evaluate fashion object detection performance
- **Feature Quality**: Assess feature extraction capabilities
- **Classification Metrics**: Precision, recall, F1-score, confusion matrix
- **Visualization**: Confusion matrix plots and performance charts
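As a rough illustration of the classification metrics listed above, the sketch below computes per-category precision, recall, and F1 from paired label lists in plain Python. The function name `per_category_prf` is hypothetical, and the integration itself may well rely on scikit-learn for these numbers instead.

```python
from collections import Counter
from typing import Dict, List, Tuple


def per_category_prf(
    y_true: List[str], y_pred: List[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Return {category: (precision, recall, f1)} for paired label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    true_positives: Counter = Counter()  # correct predictions per category
    predicted_count: Counter = Counter(y_pred)
    actual_count: Counter = Counter(y_true)
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            true_positives[truth] += 1

    scores = {}
    for cat in sorted(set(y_true) | set(y_pred)):
        tp = true_positives[cat]
        precision = tp / predicted_count[cat] if predicted_count[cat] else 0.0
        recall = tp / actual_count[cat] if actual_count[cat] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[cat] = (precision, recall, f1)
    return scores
```

Categories that never appear in the predictions (or in the ground truth) get a score of 0.0 rather than raising a division error, which is one of several reasonable conventions.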

### 3. API Endpoints
- `/deepfashion2/status` - Check integration status and dataset availability
- `/deepfashion2/statistics` - Get dataset statistics and category distribution
- `/deepfashion2/evaluate` - Run evaluation using DeepFashion2 as benchmark
- `/deepfashion2/setup-instructions` - Get setup instructions for the dataset

## Category Mapping

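DeepFashion2 uses 13 detailed categories. Each one maps to one of yainage90's 7 categories, although only four of them (`top`, `outer`, `bottom`, `dress`) have a DeepFashion2 counterpart: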

| DeepFashion2 Category | yainage90 Category |
|-----------------------|--------------------|
| short_sleeved_shirt   | top                |
| long_sleeved_shirt    | top                |
| short_sleeved_outwear | outer              |
| long_sleeved_outwear  | outer              |
| vest                  | top                |
| sling                 | top                |
| shorts                | bottom             |
| trousers              | bottom             |
| skirt                 | bottom             |
| short_sleeved_dress   | dress              |
| long_sleeved_dress    | dress              |
| vest_dress            | dress              |
| sling_dress           | dress              |
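The table above can be written as a plain lookup dictionary. The sketch below copies the pairs from the table; the names `DF2_TO_YAINAGE90` and `map_categories` are hypothetical and may not match what `deepfashion2_utils` actually exposes.

```python
from typing import List, Set

# Hypothetical constant name; the pairs mirror the mapping table.
DF2_TO_YAINAGE90 = {
    "short_sleeved_shirt": "top",
    "long_sleeved_shirt": "top",
    "short_sleeved_outwear": "outer",
    "long_sleeved_outwear": "outer",
    "vest": "top",
    "sling": "top",
    "shorts": "bottom",
    "trousers": "bottom",
    "skirt": "bottom",
    "short_sleeved_dress": "dress",
    "long_sleeved_dress": "dress",
    "vest_dress": "dress",
    "sling_dress": "dress",
}


def map_categories(df2_labels: List[str]) -> Set[str]:
    """Collapse DeepFashion2 labels into the coarser target categories."""
    return {DF2_TO_YAINAGE90[label] for label in df2_labels}
```

Because the mapping is many-to-one, an image annotated with both `vest` and `sling` collapses to the single category `top`. An unknown label raises `KeyError` in this sketch, which makes annotation typos visible early.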

## Setup Instructions

### 1. Download the Dataset

The DeepFashion2 dataset requires manual download due to licensing requirements:

1. Visit the official repository: https://github.com/switchablenorms/DeepFashion2
2. Follow the dataset download instructions in the repository's README
3. Complete the required registration, then download the dataset files

### 2. Dataset Structure

Extract the dataset to `./data/deepfashion2/` with the following structure:

```
deepfashion2/
├── train/
│   ├── image/          # Training images
│   └── annos/          # Training annotations (JSON)
├── validation/
│   ├── image/          # Validation images
│   └── annos/          # Validation annotations (JSON)
└── test/
    ├── image/          # Test images
    └── annos/          # Test annotations (JSON)
```
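Before running anything else, it can help to confirm that the extracted layout matches the tree above. The following is a minimal sketch using only the standard library; the function name `missing_dataset_dirs` is hypothetical and not part of the integration.

```python
from pathlib import Path
from typing import List

EXPECTED_SPLITS = ("train", "validation", "test")
EXPECTED_SUBDIRS = ("image", "annos")


def missing_dataset_dirs(root: str = "./data/deepfashion2") -> List[str]:
    """Return the expected sub-directories that do not exist under root."""
    base = Path(root)
    return [
        f"{split}/{sub}"
        for split in EXPECTED_SPLITS
        for sub in EXPECTED_SUBDIRS
        if not (base / split / sub).is_dir()
    ]
```

An empty list means all six directories are present; otherwise the returned entries show which parts of the archive still need extracting.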

### 3. Install Dependencies

Install additional dependencies for evaluation:

```bash
pip install scikit-learn matplotlib seaborn
```

### 4. Verify Setup

Check the integration status:

```bash
curl http://localhost:7861/deepfashion2/status
```

## Usage Examples

### 1. Basic Dataset Loading

```python
from deepfashion2_utils import DeepFashion2Config, DeepFashion2Dataset

config = DeepFashion2Config()
dataset = DeepFashion2Dataset(
    root_dir=config.dataset_root,
    split='validation',
    load_annotations=True
)

# Get a sample
sample = dataset[0]
print(f"Image: {sample['image_path']}")
print(f"Categories: {dataset.get_categories_in_image(sample['annotations'])}")
```

### 2. Running Evaluation

```python
from deepfashion2_evaluation import run_full_evaluation
from fast import analyzer

# Run evaluation with 100 samples
report_path = run_full_evaluation(analyzer, max_samples=100)
print(f"Evaluation report saved to: {report_path}")
```

### 3. API Usage

```bash
# Check status
curl -X GET "http://localhost:7861/deepfashion2/status"

# Get dataset statistics
curl -X GET "http://localhost:7861/deepfashion2/statistics"

# Run evaluation
curl -X POST "http://localhost:7861/deepfashion2/evaluate?max_samples=50"

# Get setup instructions
curl -X GET "http://localhost:7861/deepfashion2/setup-instructions"
```
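The same endpoints can be called from Python without extra dependencies. This sketch assumes the server from the curl examples is running on `localhost:7861`; the helper names `build_url` and `call` are hypothetical, and since the response format is not documented here, the body is returned as raw text.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7861"  # host and port taken from the curl examples


def build_url(endpoint: str, **params) -> str:
    """Join the base URL, an endpoint name, and optional query parameters."""
    url = f"{BASE_URL}/deepfashion2/{endpoint}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def call(endpoint: str, method: str = "GET", timeout: float = 30.0, **params) -> str:
    """Send the request and return the response body as text."""
    request = urllib.request.Request(build_url(endpoint, **params), method=method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `call("status")` mirrors the first curl command and `call("evaluate", method="POST", max_samples=50)` mirrors the evaluation request. A longer `timeout` may be needed for evaluation runs.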

## Evaluation Metrics

### Detection Accuracy
- **Category-level accuracy**: How well the model detects clothing categories
- **Detection score**: IoU-like metric for category overlap
- **Confusion matrix**: Detailed breakdown of predictions vs ground truth
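The detection score is described only as "IoU-like". One plausible reading, shown here as an assumption rather than the project's actual formula, is the Jaccard index between the predicted and ground-truth category sets of an image. The function name `category_overlap` is hypothetical.

```python
from typing import Iterable


def category_overlap(predicted: Iterable[str], ground_truth: Iterable[str]) -> float:
    """Jaccard index: |intersection| / |union| of two category sets."""
    pred_set, true_set = set(predicted), set(ground_truth)
    union = pred_set | true_set
    if not union:
        # Nothing predicted and nothing expected: treat as a perfect match.
        return 1.0
    return len(pred_set & true_set) / len(union)
```

Under this definition, predicting `{top, bottom}` for an image labeled `{top, dress}` scores 1/3: one shared category out of three distinct ones.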

### Feature Quality
- **Feature dimension**: Dimensionality of extracted features
- **Intra-category similarity**: How similar features are within the same category
- **Inter-category distance**: How well features separate different categories
- **Feature separability**: Overall quality metric for feature discrimination
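To make the similarity and distance metrics concrete, the sketch below computes them with cosine similarity over all pairs of feature vectors. The choice of cosine similarity, the use of `1 - similarity` as the distance, and the name `feature_quality` are all assumptions; the source does not specify the exact formulas, and feature separability is left out because its definition is not given.

```python
import math
from itertools import combinations
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def _mean(values: List[float]) -> float:
    return sum(values) / len(values) if values else 0.0


def feature_quality(
    features: Sequence[Sequence[float]], labels: Sequence[str]
) -> Tuple[float, float]:
    """Return (mean intra-category similarity, mean inter-category distance)."""
    intra, inter = [], []
    for (feat_a, label_a), (feat_b, label_b) in combinations(zip(features, labels), 2):
        similarity = cosine(feat_a, feat_b)
        if label_a == label_b:
            intra.append(similarity)        # same category: want this high
        else:
            inter.append(1.0 - similarity)  # different category: want this high
    return _mean(intra), _mean(inter)
```

The pairwise loop is quadratic in the number of samples, so it is only practical on a subset of the dataset.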

## Configuration Options

### DeepFashion2Config

```python
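from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DeepFashion2Config:
    dataset_root: str = "./data/deepfashion2"  # Directory the dataset was extracted to
    categories: Optional[List[str]] = None  # Auto-populated with 13 categories
    image_size: Tuple[int, int] = (224, 224)
    batch_size: int = 32
    num_workers: int = 4  # Worker processes used for data loading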
```
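The comment "Auto-populated with 13 categories" suggests the list is filled in after construction. One common way to do that with a dataclass is `__post_init__`. The sketch below is an assumption about the mechanism, not the project's actual code, which is why the class and constant carry the hypothetical names `DeepFashion2ConfigSketch` and `DEFAULT_CATEGORIES`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical constant; the names follow the category mapping table.
DEFAULT_CATEGORIES = [
    "short_sleeved_shirt", "long_sleeved_shirt",
    "short_sleeved_outwear", "long_sleeved_outwear",
    "vest", "sling", "shorts", "trousers", "skirt",
    "short_sleeved_dress", "long_sleeved_dress",
    "vest_dress", "sling_dress",
]


@dataclass
class DeepFashion2ConfigSketch:
    dataset_root: str = "./data/deepfashion2"
    categories: Optional[List[str]] = None
    image_size: Tuple[int, int] = (224, 224)
    batch_size: int = 32
    num_workers: int = 4

    def __post_init__(self) -> None:
        # Copy the defaults so instances never share one mutable list.
        if self.categories is None:
            self.categories = list(DEFAULT_CATEGORIES)
```

Using `None` as the default and filling it in afterwards avoids the dataclass restriction on mutable default values; `field(default_factory=...)` would be an equivalent alternative.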

### Customization

You can customize the configuration for your specific needs:

```python
config = DeepFashion2Config(
    dataset_root="/path/to/your/deepfashion2",
    image_size=(256, 256),
    batch_size=16
)
```

## Performance Considerations

### Memory Usage
- The dataset is large (~15 GB); make sure enough disk space is available before extracting it
- Choose a batch size that fits the available GPU memory
- Raise `num_workers` to load data in parallel, and lower it again if host memory runs short

### CPU Optimization
- The system automatically detects CPU vs GPU and optimizes accordingly
- CPU inference uses float32 precision and limited threads
- GPU inference uses float16 precision for better performance

### Evaluation Speed
- Limit `max_samples` for faster evaluation during development
- Full evaluation on the entire validation set may take significant time
- Consider running evaluations on a subset for quick feedback
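When evaluating on a subset, a fixed random selection keeps runs comparable from one iteration to the next. The helper below is a hypothetical sketch; how `max_samples` picks samples inside `run_full_evaluation` is not documented here.

```python
import random
from typing import List


def subset_indices(dataset_size: int, max_samples: int, seed: int = 0) -> List[int]:
    """Pick up to max_samples distinct indices, reproducibly for a given seed."""
    if max_samples >= dataset_size:
        return list(range(dataset_size))
    rng = random.Random(seed)  # local RNG, leaves the global random state untouched
    return sorted(rng.sample(range(dataset_size), max_samples))
```

The resulting indices can be passed to `torch.utils.data.Subset(dataset, indices)` to wrap a PyTorch dataset without copying any data.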

## Troubleshooting

### Common Issues

1. **Dataset not found**: Ensure the dataset is extracted to the correct path
2. **Permission errors**: Check file permissions for the dataset directory
3. **Memory errors**: Reduce batch size or number of workers
4. **Import errors**: Install missing dependencies (scikit-learn, matplotlib, seaborn)

### Debug Mode

Enable debug logging to troubleshoot issues:

```python
import logging
logging.basicConfig(level=logging.DEBUG)
```

## Future Enhancements

### Planned Features
- **Training Pipeline**: Fine-tune models on DeepFashion2 data
- **Advanced Metrics**: Add more sophisticated evaluation metrics
- **Visualization Tools**: Enhanced plotting and analysis tools
- **Benchmark Comparisons**: Compare against other fashion datasets

### Contributing

To contribute to the DeepFashion2 integration:

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Submit a pull request

## References

- [DeepFashion2 Paper](https://arxiv.org/abs/1901.07973)
- [DeepFashion2 Repository](https://github.com/switchablenorms/DeepFashion2)
- [yainage90 Models](https://huggingface.co/yainage90)

## License

This integration follows the same license as the main Vestiq project. The DeepFashion2 dataset has its own licensing terms that must be respected.