# 🎥 Video Person Detection & Tracking with ReID

A computer vision application that combines YOLOv8, ByteTrack, InsightFace, and TorchReID (OSNet) for person detection, tracking, and re-identification in videos. A Gradio web interface handles video upload and processing.

## 🔧 Technology Stack

- **YOLOv8**: Real-time person detection
- **ByteTrack**: Multi-object tracking algorithm  
- **InsightFace**: Facial feature extraction for person identification
- **OSNet**: Full-body re-identification features
- **Gradio**: Web-based user interface

## 📋 Features

- Real-time person detection and tracking
- Consistent person re-identification across frames
- Face and body feature extraction
- Interactive web interface
- JSON export of tracking data
- Support for multiple video formats

## 🚀 Quick Start

### Prerequisites

**System Requirements:**
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for better performance)
- At least 4GB RAM
- 2GB free disk space
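
The Python version and disk space requirements above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name `check_environment` is hypothetical and is not part of this project.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)      # minimum version listed above
MIN_FREE_DISK_GB = 2     # free disk space listed above


def check_environment(path=".", min_python=MIN_PYTHON, min_free_gb=MIN_FREE_DISK_GB):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    # Compare only (major, minor) against the required version.
    found = tuple(sys.version_info[:2])
    if found < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {found[0]}.{found[1]}"
        )

    # Free space on the filesystem that contains `path`, in GiB.
    free_gb = shutil.disk_usage(path).free / 1024 ** 3
    if free_gb < min_free_gb:
        problems.append(
            f"Need {min_free_gb} GB free disk space, found {free_gb:.1f} GB"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

RAM and GPU availability are not covered here because the standard library has no portable way to query them.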

**Platform-Specific Dependencies:**

**Linux:**
```bash
# Install g++ compiler (required for InsightFace)
sudo apt-get update
sudo apt-get install g++ build-essential
```

**Windows:**
- Install [Microsoft Visual C++ Redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe) (latest version)
- Ensure you have Visual Studio Build Tools or Visual Studio Community installed

**macOS:**
```bash
# Install Xcode command line tools
xcode-select --install
```

### Installation

1. **Clone the repository:**
```bash
git clone [email protected]:zebshah7851/object-detection-and-tracking.git
cd object-detection-and-tracking
```

2. **Create a virtual environment:**
```bash
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate
```

3. **Install dependencies:**
```bash
pip install --upgrade pip
pip install -r requirements.txt
```

**Note:** Installation may take 10-15 minutes because of large package downloads (PyTorch, CUDA libraries, etc.).

### Model Setup

The application requires several pre-trained models:

1. **YOLOv8 Detection Model:**
   - Place your trained `detection.pt` model file in the project root directory
   - Alternatively, the app will download a default YOLOv8 model on first run

2. **InsightFace Model:**
   - The `buffalo_l` model will be automatically downloaded on first run
   - Requires ~2GB of storage space

3. **TorchReID Model:**
   - The `osnet_x0_25` model will be automatically downloaded
   - Pre-trained on Market1501 dataset
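
The choice between a custom `detection.pt` and a default model could be sketched as follows. This is an illustration, not the logic in `app.py`: the helper `resolve_detection_weights` is hypothetical, and `yolov8n.pt` is an assumed default because the README does not say which YOLOv8 variant is downloaded.

```python
from pathlib import Path

CUSTOM_WEIGHTS = "detection.pt"   # file name used in this README
DEFAULT_WEIGHTS = "yolov8n.pt"    # assumption: the actual default is not stated


def resolve_detection_weights(project_root="."):
    """Prefer a custom detection.pt in the project root, else a default name."""
    candidate = Path(project_root) / CUSTOM_WEIGHTS
    if candidate.is_file():
        return str(candidate)
    # No custom model found: return a standard weight name instead.
    return DEFAULT_WEIGHTS
```

The returned string could then be passed to `ultralytics.YOLO(...)`, which fetches official weight files by name when they are not present locally.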

### Running the Application

1. **Start the Gradio interface:**
```bash
python app.py
```

2. **Access the web interface:**
   - Open your browser and navigate to: `http://127.0.0.1:7860`
   - The interface will load automatically

3. **Process videos:**
   - Upload a video file (MP4, AVI, MOV, WEBM)
   - Click "🚀 Process Video"
   - Download the processed video and tracking data
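
When processing files in bulk, it can help to filter out unsupported files before uploading. A minimal sketch, assuming the four formats listed above are identified by file extension alone (the helper name `is_supported_video` is hypothetical):

```python
from pathlib import Path

# Extensions for the formats listed above: MP4, AVI, MOV, WEBM.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm"}


def is_supported_video(filename):
    """Return True when the file extension matches a supported format."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

This checks only the name, not the container or codec, so a file that passes may still fail to decode.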

## 📁 Project Structure

```
video-person-tracking/
├── app.py                 # Gradio web interface
├── detection.py           # Core detection script
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── outputs/               # Generated output files
├── detection.pt           # YOLOv8 model to detect persons
└── logs/                  # Application logs
```

## 🔧 Configuration

### Model Parameters

You can adjust the following parameters in `app.py`:

```python
DETECTION_THRESHOLD = 0.75  # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6  # Person re-identification threshold
```
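
To illustrate what `SIMILARITY_THRESHOLD` controls, here is a minimal sketch of threshold-based identity matching. It assumes cosine similarity between feature vectors, which this README does not confirm; `cosine_similarity`, `match_identity`, and the `gallery` dictionary are hypothetical names, not functions from `app.py`.

```python
import math

SIMILARITY_THRESHOLD = 0.6  # same value as the setting above


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_identity(embedding, gallery, threshold=SIMILARITY_THRESHOLD):
    """Return the best-matching person ID, or None if nothing reaches the threshold.

    `gallery` maps person IDs to reference embeddings (assumed structure).
    """
    best_id, best_score = None, -1.0
    for person_id, reference in gallery.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_id, best_score = person_id, score
    return best_id if best_score >= threshold else None
```

Under this scheme, a higher threshold makes the matcher stricter: fewer detections are merged into existing identities and more new IDs are created. A lower threshold does the opposite and raises the risk of merging different people.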

### Performance Optimization

**For GPU acceleration:**
- Ensure CUDA is properly installed
- The application automatically detects and uses GPU if available
- Monitor GPU memory usage for large videos
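
Automatic GPU detection is commonly done through PyTorch. The sketch below shows one way to do it; it is not necessarily how `app.py` does it, and the helper name `select_device` is hypothetical.

```python
def select_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        # Imported lazily so this sketch also runs where PyTorch is missing.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Using device:", select_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build or the CUDA driver is the usual place to look.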

**For CPU-only systems:**
- Reduce video resolution before processing
- Process shorter video segments
- Expect longer processing times
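
Splitting a long video into shorter segments starts with deciding the frame ranges. A minimal sketch, where `plan_segments` and the 30-second default are illustrative assumptions rather than project settings:

```python
def plan_segments(total_frames, fps, segment_seconds=30):
    """Split a video into (start_frame, end_frame) ranges; end is exclusive."""
    if total_frames < 0 or fps <= 0 or segment_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and segment_seconds > 0")

    # Number of frames in one full segment, at least one frame.
    frames_per_segment = max(1, int(round(fps * segment_seconds)))
    return [
        (start, min(start + frames_per_segment, total_frames))
        for start in range(0, total_frames, frames_per_segment)
    ]
```

For example, `plan_segments(1500, 25)` yields two ranges of 750 frames each. Person IDs assigned in one segment may not carry over to the next if each segment is processed independently.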

## 📊 Output Format

### Processed Video
- Annotated video with bounding boxes
- Consistent person IDs across frames
- Real-time tracking visualization

### JSON Tracking Data
```json
{
  "metadata": {
    "total_frames": 1500,
    "total_people": 5,
    "id_mapping": {...}
  },
  "frames": [
    {
      "frame": 1,
      "people": [
        {
          "person_id": 1,
          "center_x": 320.5,
          "center_y": 240.0,
          "confidence": 0.85,
          "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}
        }
      ]
    }
  ]
}
```
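
The exported JSON can be post-processed with the standard library. The sketch below assumes the structure shown above and summarizes when each person appears; the function names and the file path in the usage note are hypothetical.

```python
import json


def load_tracking(path):
    """Read an exported tracking file into a dictionary."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def summarize_tracks(tracking):
    """Map each person_id to its first frame, last frame, and detection count."""
    summary = {}
    for frame in tracking.get("frames", []):
        number = frame["frame"]
        for person in frame.get("people", []):
            entry = summary.setdefault(
                person["person_id"],
                {"first_frame": number, "last_frame": number, "detections": 0},
            )
            # min/max keep the result correct even if frames are unordered.
            entry["first_frame"] = min(entry["first_frame"], number)
            entry["last_frame"] = max(entry["last_frame"], number)
            entry["detections"] += 1
    return summary
```

Usage would look like `summarize_tracks(load_tracking("outputs/tracking.json"))`, where the path is a placeholder for wherever the exported file was saved.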

## 🐛 Troubleshooting

### Common Issues

**Installation Problems:**

1. **InsightFace installation fails:**
   ```bash
   # Try installing with specific version
   pip install insightface==0.7.3
   pip install onnxruntime-gpu==1.14.1
   ```
   
   On Linux, make sure g++ is installed. On Windows, install the latest Visual C++ Redistributable.


2. **Model download issues:**
   - Check internet connection
   - Manually download models if automatic download fails
   - Ensure sufficient disk space

**Runtime Issues:**

1. **Video won't load in browser:**
   - Try downloading the output video manually
   - Check browser compatibility
   - Clear browser cache

2. **Slow processing:**
   - Use GPU acceleration if available
   - Raise the detection threshold (fewer low-confidence detections means fewer feature extractions)
   - Process shorter video segments

3. **High memory usage:**
   - Monitor system resources
   - Close unnecessary applications
   - Use smaller input videos

## 📝 System Requirements

- **CPU:** Intel i5 or AMD Ryzen 5 (4 cores)
- **RAM:** 8GB
- **Storage:** 5GB free space
- **GPU:** Optional, but recommended for faster processing