# Video Person Detection & Tracking with ReID
A computer vision application that combines YOLOv8, ByteTrack, InsightFace, and TorchReID (OSNet) for person detection, tracking, and re-identification in videos. A Gradio web interface handles video upload and processing.
## Technology Stack
- **YOLOv8**: Real-time person detection
- **ByteTrack**: Multi-object tracking algorithm
- **InsightFace**: Facial feature extraction for person identification
- **OSNet**: Full-body re-identification features
- **Gradio**: Web-based user interface
## Features
- Real-time person detection and tracking
- Consistent person re-identification across frames
- Face and body feature extraction
- Interactive web interface
- JSON export of tracking data
- Support for multiple video formats
## Quick Start
### Prerequisites
**System Requirements:**
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for better performance)
- At least 4GB RAM
- 2GB free disk space
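The Python version and disk space requirements can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper name `check_environment` and its parameters are illustrative and not part of the project. RAM and GPU checks are left out because they need third-party packages.

```python
import shutil
import sys


def check_environment(min_python=(3, 8), min_free_gb=2.0, path="."):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: the default thresholds mirror the requirements above.
    """
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # disk_usage reports bytes for the filesystem that contains `path`
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(
            f"{min_free_gb:.1f} GB free disk space required, "
            f"found {free_gb:.1f} GB"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per unmet requirement and nothing when both checks pass.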
**Platform-Specific Dependencies:**
**Linux:**
```bash
# Install g++ compiler (required for InsightFace)
sudo apt-get update
sudo apt-get install g++ build-essential
```
**Windows:**
- Install the latest [Microsoft Visual C++ Redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe)
- Ensure you have Visual Studio Build Tools or Visual Studio Community installed
**macOS:**
```bash
# Install Xcode command line tools
xcode-select --install
```
### Installation
1. **Clone the repository:**
   ```bash
   # Clone into video-person-tracking/ so the directory matches the cd below
   git clone [email protected]:zebshah7851/object-detection-and-tracking.git video-person-tracking
   cd video-person-tracking
   ```
2. **Create a virtual environment:**
   ```bash
   python -m venv venv
   # Activate virtual environment
   # On Windows:
   venv\Scripts\activate
   # On Linux/macOS:
   source venv/bin/activate
   ```
3. **Install dependencies:**
   ```bash
   pip install --upgrade pip
   pip install -r requirements.txt
   ```
**Note:** Installation may take 10-15 minutes because of large package downloads (PyTorch, CUDA libraries, etc.).
### Model Setup
The application requires several pre-trained models:
1. **YOLOv8 Detection Model:**
   - Place your trained `detection.pt` model file in the project root directory
   - If no `detection.pt` is found, the app downloads a default YOLOv8 model on first run
2. **InsightFace Model:**
   - The `buffalo_l` model is downloaded automatically on first run
   - Requires ~2GB of storage space
3. **TorchReID Model:**
   - The `osnet_x0_25` model is downloaded automatically
   - Pre-trained on the Market1501 dataset
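The choice between a custom `detection.pt` and a default model can be pictured as a simple file check. This is a sketch of the idea only, not the application's actual code: the helper name `resolve_detection_weights` is hypothetical, and `yolov8n.pt` is an assumed placeholder because this README does not name the default model.

```python
from pathlib import Path

# Assumption: the fallback model is not named in this README, so
# "yolov8n.pt" is used here purely as a placeholder.
DEFAULT_WEIGHTS = "yolov8n.pt"


def resolve_detection_weights(project_root=".", custom_name="detection.pt"):
    """Return the custom weights path if the file exists, else the default name."""
    candidate = Path(project_root) / custom_name
    if candidate.is_file():
        return str(candidate)
    return DEFAULT_WEIGHTS


print(resolve_detection_weights())
```

The returned string could then be handed to whatever loader the application uses, for example the `YOLO(...)` constructor from the `ultralytics` package, which downloads known model names when they are not present locally.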
### Running the Application
1. **Start the Gradio interface:**
   ```bash
   python app.py
   ```
2. **Access the web interface:**
   - Open your browser and navigate to: `http://127.0.0.1:7860`
   - The interface will load automatically
3. **Process videos:**
   - Upload a video file (MP4, AVI, MOV, WEBM)
   - Click "Process Video"
   - Download the processed video and tracking data
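Before uploading, a file can be checked against the formats listed in the upload step. The sketch below only inspects the file extension and says nothing about the codec inside the container; `is_supported_video` is an illustrative name, not a function from `app.py`.

```python
from pathlib import Path

# Extensions taken from the upload step above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm"}


def is_supported_video(filename):
    """Return True if the file name ends in one of the listed video extensions."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_video("clip.MP4"))   # extension matches, case-insensitive
print(is_supported_video("notes.txt"))  # not a listed video format
```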
## Project Structure
```
video-person-tracking/
├── app.py              # Gradio web interface
├── detection.py        # Core detection script
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── outputs/            # Generated output files
├── detection.pt        # YOLOv8 person detection model
└── logs/               # Application logs
```
## Configuration
### Model Parameters
You can adjust the following parameters in `app.py`:
```python
DETECTION_THRESHOLD = 0.75   # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6   # Person re-identification threshold
```
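To illustrate how a re-identification threshold of this kind typically works, the sketch below compares a new feature vector against stored reference vectors and keeps the best match only if it reaches `SIMILARITY_THRESHOLD`. Cosine similarity, the `gallery` dictionary, and the function names are assumptions made for the example; the metric and data structures used in `app.py` may differ.

```python
import math

SIMILARITY_THRESHOLD = 0.6  # same value as in the configuration above


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_identity(embedding, gallery, threshold=SIMILARITY_THRESHOLD):
    """Return the best-matching known ID, or None if nothing reaches the threshold.

    `gallery` maps person_id -> reference embedding (hypothetical structure).
    """
    best_id, best_score = None, threshold
    for person_id, reference in gallery.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id


gallery = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0]}
print(match_identity([0.9, 0.1, 0.0], gallery))  # close to the vector for ID 1
print(match_identity([0.5, 0.5, 0.7], gallery))  # below the threshold for both IDs
```

A higher threshold makes the matcher more reluctant to reuse an existing ID, which tends to create more new IDs; a lower one merges more aggressively and risks confusing different people.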
### Performance Optimization
**For GPU acceleration:**
- Ensure CUDA is properly installed
- The application automatically detects and uses GPU if available
- Monitor GPU memory usage for large videos
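To confirm whether a GPU would be picked up, PyTorch's `torch.cuda.is_available()` can be queried directly. The helper below is a sketch of that check (the name `select_device` is illustrative); how the application itself performs detection is not shown in this README.

```python
def select_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Running on: {select_device()}")
```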
**For CPU-only systems:**
- Reduce video resolution before processing
- Process shorter video segments
- Expect longer processing times
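Processing in shorter segments amounts to splitting the frame range into fixed-size chunks. The following sketch computes those chunk boundaries; `split_into_segments` is a hypothetical helper, and cutting the video file itself would still require a tool such as OpenCV or ffmpeg.

```python
def split_into_segments(total_frames, segment_length):
    """Return (start, end) frame ranges covering 0..total_frames, end exclusive."""
    if total_frames < 0 or segment_length <= 0:
        raise ValueError("total_frames must be >= 0 and segment_length > 0")
    return [
        (start, min(start + segment_length, total_frames))
        for start in range(0, total_frames, segment_length)
    ]


print(split_into_segments(1500, 600))
# [(0, 600), (600, 1200), (1200, 1500)]
```

Note that if segments are processed as separate runs, person IDs assigned in one segment would not automatically carry over to the next.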
## Output Format
### Processed Video
- Annotated video with bounding boxes
- Consistent person IDs across frames
- Real-time tracking visualization
### JSON Tracking Data
```json
{
  "metadata": {
    "total_frames": 1500,
    "total_people": 5,
    "id_mapping": {...}
  },
  "frames": [
    {
      "frame": 1,
      "people": [
        {
          "person_id": 1,
          "center_x": 150.0,
          "center_y": 175.0,
          "confidence": 0.85,
          "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}
        }
      ]
    }
  ]
}
```
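The exported structure is straightforward to post-process. As a sketch, the code below groups the per-frame records into one trajectory per person. The inline sample follows the format above, with `id_mapping` left empty because its contents are not specified here; the function name `trajectories` is illustrative.

```python
import json
from collections import defaultdict

# Small sample in the format shown above; "id_mapping" is left empty here
# because its contents are not specified in this README.
sample = """
{
  "metadata": {"total_frames": 2, "total_people": 1, "id_mapping": {}},
  "frames": [
    {"frame": 1, "people": [{"person_id": 1, "center_x": 150.0, "center_y": 175.0,
      "confidence": 0.85, "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}}]},
    {"frame": 2, "people": [{"person_id": 1, "center_x": 154.0, "center_y": 175.0,
      "confidence": 0.9, "bbox": {"x1": 104, "y1": 50, "x2": 204, "y2": 300}}]}
  ]
}
"""


def trajectories(tracking):
    """Map each person_id to a list of (frame, center_x, center_y) tuples."""
    paths = defaultdict(list)
    for frame in tracking["frames"]:
        for person in frame["people"]:
            paths[person["person_id"]].append(
                (frame["frame"], person["center_x"], person["center_y"])
            )
    return dict(paths)


data = json.loads(sample)
print(trajectories(data))
# {1: [(1, 150.0, 175.0), (2, 154.0, 175.0)]}
```

For a real export, replace `json.loads(sample)` with `json.load(...)` on the downloaded tracking file.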
## Troubleshooting
### Common Issues
**Installation Problems:**
1. **InsightFace installation fails:**
   ```bash
   # Try installing specific versions
   pip install insightface==0.7.3
   pip install onnxruntime-gpu==1.14.1
   ```
   On Linux, make sure g++ is installed. On Windows, install the latest Microsoft Visual C++ Redistributable.
2. **Model download issues:**
   - Check your internet connection
   - Manually download models if automatic download fails
   - Ensure sufficient disk space
**Runtime Issues:**
1. **Video won't load in browser:**
   - Try downloading the output video manually
   - Check browser compatibility
   - Clear browser cache
2. **Slow processing:**
   - Use GPU acceleration if available
   - Raise the detection threshold so that fewer low-confidence detections reach the feature extraction and re-identification stages
   - Process shorter video segments
3. **High memory usage:**
   - Monitor system resources
   - Close unnecessary applications
   - Use smaller input videos
## Recommended System Requirements
- **CPU:** Intel i5 or AMD Ryzen 5 (4 cores)
- **RAM:** 8GB
- **Storage:** 5GB free space
- **GPU:** Optional, but recommended for faster processing