# Video Person Detection & Tracking with ReID
A computer vision application that combines YOLOv8 detection, ByteTrack tracking, InsightFace face features, and TorchReID (OSNet) body features to detect, track, and re-identify people in videos. A Gradio web interface handles video upload, processing, and download of the results.
## Technology Stack
- **YOLOv8**: Real-time person detection
- **ByteTrack**: Multi-object tracking algorithm
- **InsightFace**: Facial feature extraction for person identification
- **OSNet**: Full-body re-identification features
- **Gradio**: Web-based user interface
## Features
- Real-time person detection and tracking
- Consistent person re-identification across frames
- Face and body feature extraction
- Interactive web interface
- JSON export of tracking data
- Support for multiple video formats
## Quick Start
### Prerequisites
**System Requirements:**
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for better performance)
- At least 4GB RAM
- 2GB free disk space
**Platform-Specific Dependencies:**
**Linux:**
```bash
# Install g++ compiler (required for InsightFace)
sudo apt-get update
sudo apt-get install g++ build-essential
```
**Windows:**
- Install [Microsoft Visual C++ Redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe) (latest version)
- Ensure you have Visual Studio Build Tools or Visual Studio Community installed
**macOS:**
```bash
# Install Xcode command line tools
xcode-select --install
```
### Installation
1. **Clone the repository:**
```bash
git clone [email protected]:zebshah7851/object-detection-and-tracking.git
cd object-detection-and-tracking
```
2. **Create a virtual environment:**
```bash
python -m venv venv
# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate
```
3. **Install dependencies:**
```bash
pip install --upgrade pip
pip install -r requirements.txt
```
**Note:** Installation may take 10-15 minutes because several large packages (PyTorch, CUDA libraries, etc.) have to be downloaded.
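Once `pip install` finishes, a quick check can confirm that the main packages are visible to the active virtual environment. The sketch below uses only the standard library; the module names in `REQUIRED_MODULES` are assumptions based on the technology stack listed above and may not match `requirements.txt` exactly.

```python
import importlib.util

# Assumed import names; adjust them to match requirements.txt.
REQUIRED_MODULES = ["torch", "ultralytics", "insightface", "torchreid", "gradio", "cv2"]


def find_missing(modules):
    """Return the module names that cannot be located in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

If anything is reported missing, re-running `pip install -r requirements.txt` inside the activated environment is the first thing to try.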
### Model Setup
The application requires several pre-trained models:
1. **YOLOv8 Detection Model:**
- Place your trained `detection.pt` model file in the project root directory
- Alternatively, the app will download a default YOLOv8 model on first run
2. **InsightFace Model:**
- The `buffalo_l` model will be automatically downloaded on first run
- Requires ~2GB of storage space
3. **TorchReID Model:**
- The `osnet_x0_25` model will be automatically downloaded
- Pre-trained on Market1501 dataset
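The fallback described for the detection model can be sketched as follows. This is an illustration only: `resolve_detection_weights` is a hypothetical helper, and `yolov8n.pt` is an assumed default, since the model the app actually downloads is not stated here.

```python
from pathlib import Path

# Assumption: "yolov8n.pt" stands in for whatever default model the app uses.
DEFAULT_WEIGHTS = "yolov8n.pt"


def resolve_detection_weights(project_root, custom_name="detection.pt"):
    """Return the custom weights path if the file exists, else the default model name."""
    candidate = Path(project_root) / custom_name
    if candidate.is_file():
        return str(candidate)
    return DEFAULT_WEIGHTS


if __name__ == "__main__":
    print("Detection weights:", resolve_detection_weights("."))
```

The returned string could then be passed to the Ultralytics `YOLO(...)` constructor, which accepts either a local weights path or a known model name.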
### Running the Application
1. **Start the Gradio interface:**
```bash
python app.py
```
2. **Access the web interface:**
- Open your browser and navigate to: `http://127.0.0.1:7860`
- The interface will load automatically
3. **Process videos:**
- Upload a video file (MP4, AVI, MOV, WEBM)
- Click "Process Video"
- Download the processed video and tracking data
## Project Structure
```
video-person-tracking/
├── app.py              # Gradio web interface
├── detection.py        # Core detection script
├── requirements.txt    # Python dependencies
├── README.md           # This file
├── outputs/            # Generated output files
├── detection.pt        # YOLOv8 model to detect persons
└── logs/               # Application logs
```
## Configuration
### Model Parameters
You can adjust the following parameters in `app.py`:
```python
DETECTION_THRESHOLD = 0.75 # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6 # Person re-identification threshold
```
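To make the role of the two thresholds concrete, here is a minimal sketch of how they might be applied. It assumes detections are dictionaries with a `confidence` key and that re-identification compares feature vectors by cosine similarity; the helper names are hypothetical and the actual logic in `app.py` may differ.

```python
import math

DETECTION_THRESHOLD = 0.75   # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6   # Person re-identification threshold


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keep_confident(detections, threshold=DETECTION_THRESHOLD):
    """Drop detections whose confidence is below the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


def match_identity(feature, gallery, threshold=SIMILARITY_THRESHOLD):
    """Return the best-matching known ID, or None if no score reaches the threshold."""
    best_id, best_score = None, -1.0
    for person_id, known in gallery.items():
        score = cosine_similarity(feature, known)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id if best_score >= threshold else None
```

Under these assumptions, raising `DETECTION_THRESHOLD` discards more low-confidence boxes, while raising `SIMILARITY_THRESHOLD` makes the matcher more likely to treat a person as new than to reuse an existing ID.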
### Performance Optimization
**For GPU acceleration:**
- Ensure CUDA is properly installed
- The application automatically detects and uses GPU if available
- Monitor GPU memory usage for large videos
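Automatic GPU detection of this kind is commonly done through PyTorch. The sketch below shows one way it could look; `pick_device` is a hypothetical helper and not necessarily what the application does internally.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Using device: {pick_device()}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is probably a CPU-only one, or the CUDA driver is not visible to it.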
**For CPU-only systems:**
- Reduce video resolution before processing
- Process shorter video segments
- Expect longer processing times
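The two CPU-side suggestions above, lower resolution and shorter segments, can be planned with a little arithmetic before any video is touched. This is a sketch with hypothetical helpers; the 640-pixel limit is an assumed value, not a setting of the application.

```python
def scaled_size(width, height, max_side=640):
    """Shrink (width, height) so the longer side is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height
    # Integer arithmetic avoids floating-point rounding surprises.
    new_w = width * max_side // longest
    new_h = height * max_side // longest
    # Round down to even values, which common codecs such as H.264 expect.
    return max(2, new_w // 2 * 2), max(2, new_h // 2 * 2)


def frame_segments(total_frames, segment_length):
    """Split a frame count into (start, end) ranges, with end exclusive."""
    return [
        (start, min(start + segment_length, total_frames))
        for start in range(0, total_frames, segment_length)
    ]


if __name__ == "__main__":
    print(scaled_size(1920, 1080))
    print(frame_segments(1500, 600))
```

The resulting size and frame ranges can then be handed to whichever tool is used to resize or cut the video before uploading it.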
## Output Format
### Processed Video
- Annotated video with bounding boxes
- Consistent person IDs across frames
- Real-time tracking visualization
### JSON Tracking Data
```json
{
"metadata": {
"total_frames": 1500,
"total_people": 5,
"id_mapping": {...}
},
"frames": [
{
"frame": 1,
"people": [
{
"person_id": 1,
"center_x": 320.5,
"center_y": 240.0,
"confidence": 0.85,
"bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}
}
]
}
]
}
```
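As an illustration of how the exported data could be consumed, the following sketch groups the records by `person_id`. It relies only on the fields shown in the structure above; `summarize_tracks` is a hypothetical helper, and the inline sample uses an empty `id_mapping` because its contents are not specified here.

```python
import json
from collections import defaultdict


def summarize_tracks(tracking):
    """Map each person_id to the frames it appears in and its center positions."""
    tracks = defaultdict(lambda: {"frames": [], "centers": []})
    for frame in tracking["frames"]:
        for person in frame["people"]:
            entry = tracks[person["person_id"]]
            entry["frames"].append(frame["frame"])
            entry["centers"].append((person["center_x"], person["center_y"]))
    return dict(tracks)


# Small inline sample that follows the structure shown above.
sample = json.loads("""
{
  "metadata": {"total_frames": 2, "total_people": 1, "id_mapping": {}},
  "frames": [
    {"frame": 1, "people": [{"person_id": 1, "center_x": 320.5, "center_y": 240.0,
                             "confidence": 0.85,
                             "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}}]},
    {"frame": 2, "people": []}
  ]
}
""")

summary = summarize_tracks(sample)
for person_id, track in summary.items():
    print(person_id, "seen in", len(track["frames"]), "frame(s)")
```

For a real export, the same function can be applied to the result of `json.load` on the downloaded tracking file.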
## Troubleshooting
### Common Issues
**Installation Problems:**
1. **InsightFace installation fails:**
```bash
# Try installing with specific version
pip install insightface==0.7.3
pip install onnxruntime-gpu==1.14.1
```
If you are running Linux, you need to install g++. If you are running Windows, you need to install the latest Microsoft Visual C++ Redistributable.
2. **Model download issues:**
- Check internet connection
- Manually download models if automatic download fails
- Ensure sufficient disk space
**Runtime Issues:**
1. **Video won't load in browser:**
- Try downloading the output video manually
- Check browser compatibility
- Clear browser cache
2. **Slow processing:**
- Use GPU acceleration if available
- Raise the detection threshold so that fewer low-confidence detections have to be tracked and re-identified
- Process shorter video segments
3. **High memory usage:**
- Monitor system resources
- Close unnecessary applications
- Use smaller input videos
## System Requirements
- **CPU:** Intel i5 or AMD Ryzen 5 (4 cores)
- **RAM:** 8GB
- **Storage:** 5GB free space
- **GPU:** Optional, but recommended for faster processing