# 🎥 Video Person Detection & Tracking with ReID

A computer vision application that combines YOLOv8, ByteTrack, InsightFace, and TorchReID (OSNet) to detect, track, and re-identify people in videos. A Gradio web interface handles video upload and processing.

## 🔧 Technology Stack

- **YOLOv8**: Real-time person detection
- **ByteTrack**: Multi-object tracking algorithm
- **InsightFace**: Facial feature extraction for person identification
- **OSNet**: Full-body re-identification features
- **Gradio**: Web-based user interface

## 📋 Features

- Real-time person detection and tracking
- Consistent person re-identification across frames
- Face and body feature extraction
- Interactive web interface
- JSON export of tracking data
- Support for multiple video formats

## 🚀 Quick Start

### Prerequisites

**Minimum System Requirements:**
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for better performance)
- At least 4GB RAM
- 2GB free disk space
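
The Python version and free disk space from the list above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function name `check_prerequisites` is hypothetical and not part of this project, and RAM and GPU are not checked here.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)       # Python 3.8 or higher, from the list above
MIN_FREE_DISK_GB = 2      # 2GB free disk space, from the list above


def check_prerequisites(path="."):
    """Return a dict of requirement name -> bool for the current machine."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        "disk": free_gb >= MIN_FREE_DISK_GB,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'NOT met'}")
```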

**Platform-Specific Dependencies:**

**Linux:**
```bash
# Install g++ compiler (required for InsightFace)
sudo apt-get update
sudo apt-get install g++ build-essential
```

**Windows:**
- Install [Microsoft Visual C++ Redistributable](https://aka.ms/vs/17/release/vc_redist.x64.exe) (latest version)
- Ensure you have Visual Studio Build Tools or Visual Studio Community installed

**macOS:**
```bash
# Install Xcode command line tools
xcode-select --install
```

### Installation

1. **Clone the repository:**
```bash
git clone [email protected]:zebshah7851/object-detection-and-tracking.git
cd object-detection-and-tracking
```

2. **Create a virtual environment:**
```bash
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On Linux/macOS:
source venv/bin/activate
```

3. **Install dependencies:**
```bash
pip install --upgrade pip
pip install -r requirements.txt
```

**Note:** Installation may take 10-15 minutes because of large package downloads (PyTorch, CUDA libraries, etc.).

### Model Setup

The application requires several pre-trained models:

1. **YOLOv8 Detection Model:**
   - Place your trained `detection.pt` model file in the project root directory
   - Alternatively, the app will download a default YOLOv8 model on first run

2. **InsightFace Model:**
   - The `buffalo_l` model will be automatically downloaded on first run
   - Requires ~2GB of storage space

3. **TorchReID Model:**
   - The `osnet_x0_25` model will be automatically downloaded
   - Pre-trained on Market1501 dataset
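
The fallback described for the detection model (use `detection.pt` if present, otherwise a default YOLOv8 model) could look roughly like the sketch below. The helper name `resolve_detection_weights` is hypothetical, and `yolov8n.pt` is an assumed default; this README does not say which stock model the app actually downloads.

```python
from pathlib import Path

CUSTOM_WEIGHTS = "detection.pt"   # file name from the detection model step above
DEFAULT_WEIGHTS = "yolov8n.pt"    # assumption: stock YOLOv8 nano weights


def resolve_detection_weights(project_root="."):
    """Prefer a custom detection.pt in the project root, else the default name."""
    custom = Path(project_root) / CUSTOM_WEIGHTS
    return str(custom) if custom.is_file() else DEFAULT_WEIGHTS


if __name__ == "__main__":
    print(f"Detection weights: {resolve_detection_weights()}")
```

The returned string would then be passed to the model loader. With the Ultralytics package, a stock weight name such as `yolov8n.pt` is fetched automatically when it is not found locally.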

### Running the Application

1. **Start the Gradio interface:**
```bash
python app.py
```

2. **Access the web interface:**
   - Open your browser and navigate to: `http://127.0.0.1:7860`
   - The interface will load automatically

3. **Process videos:**
   - Upload a video file (MP4, AVI, MOV, WEBM)
   - Click "🚀 Process Video"
   - Download the processed video and tracking data
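
When preparing files in bulk, it can help to filter out unsupported formats before uploading. This is a small sketch based only on the extensions listed in the upload step; `is_supported_video` is a hypothetical helper, and it checks the file extension only, not the container or codec.

```python
from pathlib import Path

# Extensions taken from the upload step above.
SUPPORTED_EXTENSIONS = {".mp4", ".avi", ".mov", ".webm"}


def is_supported_video(filename):
    """Return True if the file extension is one the interface lists as supported."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```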

## 📁 Project Structure

```
object-detection-and-tracking/
├── app.py                 # Gradio web interface
├── detection.py           # Core detection script
├── requirements.txt       # Python dependencies
├── README.md              # This file
├── outputs/               # Generated output files
├── detection.pt           # YOLOv8 model to detect persons
└── logs/                  # Application logs
```

## 🔧 Configuration

### Model Parameters

You can adjust the following parameters in `app.py`:

```python
DETECTION_THRESHOLD = 0.75  # Person detection confidence threshold
SIMILARITY_THRESHOLD = 0.6  # Person re-identification threshold
```
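
To illustrate what these two thresholds control, here is a minimal sketch of one way they might be applied. It assumes detections are dicts with a `confidence` key and that re-identification compares feature vectors by cosine similarity; both are assumptions, as are the helper names and the `gallery` structure. The actual logic in `app.py` may differ.

```python
import math

DETECTION_THRESHOLD = 0.75   # same values as in the configuration above
SIMILARITY_THRESHOLD = 0.6


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keep_confident(detections):
    """Drop detections whose confidence is below DETECTION_THRESHOLD."""
    return [d for d in detections if d["confidence"] >= DETECTION_THRESHOLD]


def match_identity(embedding, gallery):
    """Return the best-matching known ID, or None if nothing clears the threshold.

    `gallery` is a hypothetical dict of person_id -> stored feature vector.
    """
    best_id, best_score = None, SIMILARITY_THRESHOLD
    for person_id, known in gallery.items():
        score = cosine_similarity(embedding, known)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

Under these assumptions, raising `DETECTION_THRESHOLD` discards more low-confidence boxes, and raising `SIMILARITY_THRESHOLD` makes the matcher more likely to treat a person as new rather than reuse an existing ID.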

### Performance Optimization

**For GPU acceleration:**
- Ensure CUDA is properly installed
- The application automatically detects and uses GPU if available
- Monitor GPU memory usage for large videos
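
To confirm whether a GPU would be picked up, a check like the following can be run in the project's virtual environment. It is a sketch that relies on PyTorch's `torch.cuda.is_available()`; the function name `pick_device` is hypothetical, and the app's own detection logic is not shown in this README.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"   # PyTorch is not installed in this environment
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Processing would run on: {pick_device()}")
```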

**For CPU-only systems:**
- Reduce video resolution before processing
- Process shorter video segments
- Expect longer processing times

## 📊 Output Format

### Processed Video
- Annotated video with bounding boxes
- Consistent person IDs across frames
- Real-time tracking visualization

### JSON Tracking Data
```json
{
  "metadata": {
    "total_frames": 1500,
    "total_people": 5,
    "id_mapping": {...}
  },
  "frames": [
    {
      "frame": 1,
      "people": [
        {
          "person_id": 1,
          "center_x": 320.5,
          "center_y": 240.0,
          "confidence": 0.85,
          "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}
        }
      ]
    }
  ]
}
```
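
The exported file can be post-processed with the standard `json` module. The sketch below groups center points by `person_id` to build a per-person trajectory. The inline sample follows the structure shown above, but its second frame is made up for illustration, `id_mapping` is left empty because its contents are not documented here, and `trajectories` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Sample shaped like the documented output; values in frame 2 are invented.
SAMPLE = """
{
  "metadata": {"total_frames": 2, "total_people": 1, "id_mapping": {}},
  "frames": [
    {"frame": 1, "people": [{"person_id": 1, "center_x": 320.5, "center_y": 240.0,
      "confidence": 0.85, "bbox": {"x1": 100, "y1": 50, "x2": 200, "y2": 300}}]},
    {"frame": 2, "people": [{"person_id": 1, "center_x": 322.0, "center_y": 241.5,
      "confidence": 0.88, "bbox": {"x1": 102, "y1": 51, "x2": 202, "y2": 301}}]}
  ]
}
"""


def trajectories(tracking):
    """Group (frame, center_x, center_y) tuples by person_id."""
    paths = defaultdict(list)
    for frame in tracking["frames"]:
        for person in frame["people"]:
            paths[person["person_id"]].append(
                (frame["frame"], person["center_x"], person["center_y"])
            )
    return dict(paths)


if __name__ == "__main__":
    data = json.loads(SAMPLE)
    for person_id, path in trajectories(data).items():
        print(person_id, path)
```

For a real export, replace `json.loads(SAMPLE)` with `json.load(f)` on the downloaded tracking file.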

## 🐛 Troubleshooting

### Common Issues

**Installation Problems:**

1. **InsightFace installation fails:**
   ```bash
   # Try installing with specific version
   pip install insightface==0.7.3
   pip install onnxruntime-gpu==1.14.1
   ```
   
   If you are running Linux, install g++ first. On Windows, install the latest Microsoft Visual C++ Redistributable.

2. **Model download issues:**
   - Check internet connection
   - Manually download models if automatic download fails
   - Ensure sufficient disk space

**Runtime Issues:**

1. **Video won't load in browser:**
   - Try downloading the output video manually
   - Check browser compatibility
   - Clear browser cache

2. **Slow processing:**
   - Use GPU acceleration if available
   - Raise the detection threshold so fewer low-confidence detections reach tracking and re-identification
   - Process shorter video segments

3. **High memory usage:**
   - Monitor system resources
   - Close unnecessary applications
   - Use smaller input videos
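
Processing shorter segments, as suggested above for both slow processing and high memory usage, starts with deciding where to cut. The sketch below splits a frame count into inclusive 1-based ranges, matching the frame numbering in the JSON output. `segment_ranges` is a hypothetical helper; cutting the video itself is left to a tool of your choice.

```python
def segment_ranges(total_frames, segment_length):
    """Split frames 1..total_frames into inclusive (start, end) ranges."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    return [
        (start, min(start + segment_length - 1, total_frames))
        for start in range(1, total_frames + 1, segment_length)
    ]


if __name__ == "__main__":
    # A 1500-frame video cut into segments of at most 600 frames.
    print(segment_ranges(1500, 600))
```

Note that person IDs are probably not consistent between segments that are processed separately, since each run would start with no known identities.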

## 📝 Recommended System Requirements

- **CPU:** Intel i5 or AMD Ryzen 5 (4 cores)
- **RAM:** 8GB
- **Storage:** 5GB free space
- **GPU:** Optional, but recommended for faster processing