	# DeepFashion2 Dataset Integration

	This document describes the integration of the DeepFashion2 dataset with the Vestiq fashion analysis system.

	## Overview

	DeepFashion2 is a comprehensive fashion dataset that provides:
	- 491K diverse images of 13 popular clothing categories
	- Bounding box annotations for fashion items
	- Dense landmarks and per-pixel masks for each clothing item
	- Commercial-consumer clothes correspondence
	- Scale, occlusion, zoom-in, and viewpoint labels

	## Integration Features

	### 1. Dataset Loading and Processing
	- `DeepFashion2Dataset`: PyTorch dataset class for loading images and annotations
	- Category Mapping: Maps DeepFashion2 categories to yainage90 model categories
	- Data Transforms: Standard preprocessing for fashion images
	- Batch Processing: Efficient DataLoader implementation

	### 2. Evaluation Framework
	- Detection Accuracy: Evaluate fashion object detection performance
	- Feature Quality: Assess feature extraction capabilities
	- Classification Metrics: Precision, recall, F1-score, confusion matrix
	- Visualization: Confusion matrix plots and performance charts

	### 3. API Endpoints
	- `/deepfashion2/status` - Check integration status and dataset availability
	- `/deepfashion2/statistics` - Get dataset statistics and category distribution
	- `/deepfashion2/evaluate` - Run evaluation using DeepFashion2 as benchmark
	- `/deepfashion2/setup-instructions` - Get setup instructions for the dataset

	## Category Mapping

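	DeepFashion2 defines 13 fine-grained categories. Each one is mapped to a category of the yainage90 model; of that model's 7 categories, the mapping uses four (top, outer, bottom, dress):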

	| DeepFashion2 Category | yainage90 Category |
	|-----------------------|--------------------|
	| short_sleeved_shirt   | top                |
	| long_sleeved_shirt    | top                |
	| short_sleeved_outwear | outer              |
	| long_sleeved_outwear  | outer              |
	| vest                  | top                |
	| sling                 | top                |
	| shorts                | bottom             |
	| trousers              | bottom             |
	| skirt                 | bottom             |
	| short_sleeved_dress   | dress              |
	| long_sleeved_dress    | dress              |
	| vest_dress            | dress              |
	| sling_dress           | dress              |
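
The table above can be expressed as a plain dictionary lookup. This is a minimal sketch rather than the project's actual code: the names `DEEPFASHION2_TO_YAINAGE90` and `map_category` are hypothetical and chosen only for illustration.

```python
from typing import Optional

# Hypothetical constant mirroring the mapping table; the name is an assumption.
DEEPFASHION2_TO_YAINAGE90 = {
    "short_sleeved_shirt": "top",
    "long_sleeved_shirt": "top",
    "short_sleeved_outwear": "outer",
    "long_sleeved_outwear": "outer",
    "vest": "top",
    "sling": "top",
    "shorts": "bottom",
    "trousers": "bottom",
    "skirt": "bottom",
    "short_sleeved_dress": "dress",
    "long_sleeved_dress": "dress",
    "vest_dress": "dress",
    "sling_dress": "dress",
}


def map_category(deepfashion2_name: str) -> Optional[str]:
    """Return the mapped category, or None if the name is not in the table."""
    return DEEPFASHION2_TO_YAINAGE90.get(deepfashion2_name)
```

Returning `None` for unknown names keeps the caller in charge of how to treat labels outside the 13 categories.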

	## Setup Instructions

	### 1. Download the Dataset

	The DeepFashion2 dataset requires manual download due to licensing requirements:

	1. Visit the official repository: https://github.com/switchablenorms/DeepFashion2
	2. Follow the dataset download instructions
	3. Register and download the dataset files

	### 2. Dataset Structure

	Extract the dataset to `./data/deepfashion2/` with the following structure:

	```
	deepfashion2/
	├── train/
	│   ├── image/   # Training images
	│   └── annos/   # Training annotations (JSON)
	├── validation/
	│   ├── image/   # Validation images
	│   └── annos/   # Validation annotations (JSON)
	└── test/
	    ├── image/   # Test images
	    └── annos/   # Test annotations (JSON)
	```
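
Before starting the server, it can help to confirm that the extracted layout matches the tree above. The following is a small sketch using only the standard library; `find_missing_dirs` is a hypothetical helper, not part of the Vestiq code base, and it checks only that the directories exist, not their contents.

```python
from pathlib import Path
from typing import List

# Split and subdirectory names are taken from the directory tree above.
SPLITS = ("train", "validation", "test")
SUBDIRS = ("image", "annos")


def find_missing_dirs(root: str) -> List[str]:
    """Return the expected 'split/subdir' entries that do not exist under root."""
    base = Path(root)
    return [
        f"{split}/{sub}"
        for split in SPLITS
        for sub in SUBDIRS
        if not (base / split / sub).is_dir()
    ]
```

Calling `find_missing_dirs("./data/deepfashion2")` returns an empty list when all six directories are present.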

	### 3. Install Dependencies

	Install additional dependencies for evaluation:

	```bash
	pip install scikit-learn matplotlib seaborn
	```

	### 4. Verify Setup

	Check the integration status:

	```bash
	curl http://localhost:7861/deepfashion2/status
	```

	## Usage Examples

	### 1. Basic Dataset Loading

	```python
	from deepfashion2_utils import DeepFashion2Config, DeepFashion2Dataset

	config = DeepFashion2Config()
	dataset = DeepFashion2Dataset(
	    root_dir=config.dataset_root,
	    split='validation',
	    load_annotations=True
	)

	# Get a sample
	sample = dataset[0]
	print(f"Image: {sample['image_path']}")
	print(f"Categories: {dataset.get_categories_in_image(sample['annotations'])}")
	```

	### 2. Running Evaluation

	```python
	from deepfashion2_evaluation import run_full_evaluation
	from fast import analyzer

	# Run evaluation with 100 samples
	report_path = run_full_evaluation(analyzer, max_samples=100)
	print(f"Evaluation report saved to: {report_path}")
	```

	### 3. API Usage

	```bash
	# Check status
	curl -X GET "http://localhost:7861/deepfashion2/status"

	# Get dataset statistics
	curl -X GET "http://localhost:7861/deepfashion2/statistics"

	# Run evaluation
	curl -X POST "http://localhost:7861/deepfashion2/evaluate?max_samples=50"

	# Get setup instructions
	curl -X GET "http://localhost:7861/deepfashion2/setup-instructions"
	```
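
The same requests can be issued from Python. The sketch below uses only the standard library and assumes the server is reachable at `http://localhost:7861`, as in the curl examples. The helper names `endpoint_url` and `call` are hypothetical, and because the response format is not described here, `call` returns the raw body as text.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7861"  # Host and port taken from the curl examples


def endpoint_url(path: str, **params) -> str:
    """Build the full URL for an endpoint path, with optional query parameters."""
    url = f"{BASE_URL}{path}"
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def call(path: str, method: str = "GET", timeout: float = 30.0, **params) -> str:
    """Send the request and return the raw response body as text."""
    request = urllib.request.Request(endpoint_url(path, **params), method=method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, `call("/deepfashion2/evaluate", method="POST", max_samples=50)` mirrors the evaluation request above.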

	## Evaluation Metrics

	### Detection Accuracy
	- Category-level accuracy: How well the model detects clothing categories
	- Detection score: IoU-like metric for the overlap between predicted and ground-truth categories
	- Confusion matrix: Detailed breakdown of predictions vs ground truth
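
One common way to realize an "IoU-like" score over categories is the Jaccard index of the predicted and ground-truth category sets. The sketch below illustrates that idea only. Whether the evaluation module computes its detection score this way is an assumption, and `category_iou` is a hypothetical name.

```python
from typing import Iterable


def category_iou(predicted: Iterable[str], ground_truth: Iterable[str]) -> float:
    """Jaccard index: |intersection| / |union| of two category sets."""
    pred, truth = set(predicted), set(ground_truth)
    union = pred | truth
    if not union:
        # Both sets empty: treated here as perfect agreement (a convention).
        return 1.0
    return len(pred & truth) / len(union)
```

With predictions `["top", "bottom"]` and ground truth `["top", "dress"]`, one category is shared out of three distinct ones, so the score is 1/3.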

	### Feature Quality
	- Feature dimension: Dimensionality of extracted features
	- Intra-category similarity: How similar features are within the same category
	- Inter-category distance: How well features separate different categories
	- Feature separability: Overall quality metric for feature discrimination
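
To make the two pairwise metrics concrete, here is a pure-Python sketch based on cosine similarity. The choice of cosine similarity, the input format (a dictionary from category name to feature vectors), and the function names are all assumptions for illustration. Feature separability is left out because its formula is not specified here.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def intra_category_similarity(features: Dict[str, List[Vector]]) -> float:
    """Mean cosine similarity over all pairs that share a category."""
    sims = [
        cosine_similarity(a, b)
        for vectors in features.values()
        for a, b in combinations(vectors, 2)
    ]
    return sum(sims) / len(sims) if sims else 0.0


def inter_category_distance(features: Dict[str, List[Vector]]) -> float:
    """Mean cosine distance (1 - similarity) over pairs from different categories."""
    dists = [
        1.0 - cosine_similarity(a, b)
        for (_, va), (_, vb) in combinations(features.items(), 2)
        for a in va
        for b in vb
    ]
    return sum(dists) / len(dists) if dists else 0.0
```

Higher intra-category similarity together with higher inter-category distance indicates features that cluster by category.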

	## Configuration Options

	### DeepFashion2Config

	```python
	from dataclasses import dataclass
	from typing import List, Optional, Tuple

	@dataclass
	class DeepFashion2Config:
	    dataset_root: str = "./data/deepfashion2"
	    categories: Optional[List[str]] = None  # Auto-populated with 13 categories
	    image_size: Tuple[int, int] = (224, 224)
	    batch_size: int = 32
	    num_workers: int = 4
	```
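
The "auto-populated" default for `categories` is typically implemented with `__post_init__`, since a dataclass field cannot use a mutable list as its default value. The following sketch shows one way this could work. The class name `DeepFashion2ConfigSketch` and the constant `DEFAULT_CATEGORIES` are hypothetical, and the actual implementation may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# The 13 category names as listed in the category mapping table.
DEFAULT_CATEGORIES = [
    "short_sleeved_shirt", "long_sleeved_shirt",
    "short_sleeved_outwear", "long_sleeved_outwear",
    "vest", "sling", "shorts", "trousers", "skirt",
    "short_sleeved_dress", "long_sleeved_dress",
    "vest_dress", "sling_dress",
]


@dataclass
class DeepFashion2ConfigSketch:
    dataset_root: str = "./data/deepfashion2"
    categories: Optional[List[str]] = None
    image_size: Tuple[int, int] = (224, 224)
    batch_size: int = 32
    num_workers: int = 4

    def __post_init__(self) -> None:
        # Fill in the default list only when the caller did not pass one.
        # Copying avoids sharing one list object between instances.
        if self.categories is None:
            self.categories = list(DEFAULT_CATEGORIES)
```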

	### Customization

	You can customize the configuration for your specific needs:

	```python
	config = DeepFashion2Config(
	    dataset_root="/path/to/your/deepfashion2",
	    image_size=(256, 256),
	    batch_size=16
	)
	```

	## Performance Considerations

	### Memory Usage
	- The dataset is large (~15 GB); ensure sufficient disk space
	- Use appropriate batch sizes based on available GPU memory
	- Tune `num_workers` (default 4) to load data in parallel worker processes

	### CPU Optimization
	- The system automatically detects CPU vs GPU and optimizes accordingly
	- CPU inference uses float32 precision and limited threads
	- GPU inference uses float16 precision for better performance

	### Evaluation Speed
	- Limit `max_samples` for faster evaluation during development
	- Full evaluation on the entire validation set may take significant time
	- Consider running evaluations on a subset for quick feedback
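
When evaluating on a subset, a fixed-seed random sample tends to be more representative than the first N items and stays comparable across runs. The sketch below shows one way to pick such a subset. How `max_samples` selects samples internally is not specified here, so this is an illustration only, and `sample_indices` is a hypothetical helper.

```python
import random
from typing import List


def sample_indices(dataset_size: int, max_samples: int, seed: int = 0) -> List[int]:
    """Pick up to max_samples distinct indices, reproducibly for a given seed."""
    count = min(max_samples, dataset_size)
    # A dedicated Random instance leaves the global random state untouched.
    return sorted(random.Random(seed).sample(range(dataset_size), count))
```

The returned indices can then be used to index the dataset or to build a subset for a data loader.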

	## Troubleshooting

	### Common Issues

	1. Dataset not found: Ensure the dataset is extracted to the configured path (`./data/deepfashion2/` by default)
	2. Permission errors: Check file permissions for the dataset directory
	3. Memory errors: Reduce batch size or number of workers
	4. Import errors: Install missing dependencies (scikit-learn, matplotlib, seaborn)
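
For import errors, a quick check can report which of the evaluation dependencies are missing without importing them. This is a minimal sketch using the standard library; `find_missing_modules` is a hypothetical helper. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
from typing import Iterable, List

# Import names for the packages from the install step
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ("sklearn", "matplotlib", "seaborn")


def find_missing_modules(modules: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]
```

An empty result from `find_missing_modules()` means all three packages are installed.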

	### Debug Mode

	Enable debug logging to troubleshoot issues:

	```python
	import logging
	logging.basicConfig(level=logging.DEBUG)
	```

	## Future Enhancements

	### Planned Features
	- Training Pipeline: Fine-tune models on DeepFashion2 data
	- Advanced Metrics: Add more sophisticated evaluation metrics
	- Visualization Tools: Enhanced plotting and analysis tools
	- Benchmark Comparisons: Compare against other fashion datasets

	### Contributing

	To contribute to the DeepFashion2 integration:

	1. Fork the repository
	2. Create a feature branch
	3. Add tests for new functionality
	4. Submit a pull request

	## References

	- [DeepFashion2 Paper](https://arxiv.org/abs/1901.07973)
	- [DeepFashion2 Repository](https://github.com/switchablenorms/DeepFashion2)
	- [yainage90 Models](https://huggingface.co/yainage90)

	## License

	This integration follows the same license as the main Vestiq project. The DeepFashion2 dataset has its own licensing terms that must be respected.