drive-paddy / README.md
Testys's picture
Update README.md
28bd3c6 verified

A newer version of the Gradio SDK is available: 5.35.0

Upgrade
metadata
title: Drive Paddy - Drowsiness Detection
emoji: πŸš—
colorFrom: green
colorTo: blue
sdk: gradio
app_file: app.py
pinned: false
license: mit
sdk_version: 5.34.0
Car Emoji

Drive Paddy

Your AI-Powered Drowsiness Detection Assistant

License Python Streamlit

A real-time system to enhance driver safety by detecting signs of drowsiness using advanced computer vision and deep learning techniques.


🌟 Features

Drive Paddy employs a sophisticated, multi-faceted approach to ensure robust and reliable drowsiness detection.

  • Hybrid Detection Strategy: Combines traditional computer vision techniques with deep learning for superior accuracy.
  • Multi-Signal Analysis:
    • πŸ‘€ Eye Closure Detection: Measures Eye Aspect Ratio (EAR) to detect prolonged blinks and microsleeps.
    • πŸ₯± Yawn Detection: Measures Mouth Aspect Ratio (MAR) to identify driver fatigue.
    • 😴 Head Pose Estimation: Tracks head pitch and yaw to detect nodding off or inattentiveness.
    • 🧠 Deep Learning Inference: Utilizes a pre-trained EfficientNet-B7 model for an additional layer of analysis.
  • Dynamic AI-Powered Alerts: Leverages the Gemini API and gTTS for clear, context-aware voice alerts that are played directly in the user's browser.
  • Low-Light Warning: Automatically detects poor lighting conditions that could affect detection accuracy and notifies the user.
  • Web-Based Interface: Built with Streamlit for a user-friendly and accessible experience.
  • Configurable: All detection thresholds and model weights can be easily tuned via a central config.yaml file.

πŸ› οΈ How It Works

The system processes a live video feed from the user's webcam and calculates a weighted "drowsiness score" based on the configured detection strategy.

  1. Video Processing: The streamlit-webrtc component captures the camera feed.
  2. Concurrent Detection: The HybridProcessor runs two pipelines in parallel for maximum efficiency:
    • Geometric Analysis (geometric.py): Uses MediaPipe to detect facial landmarks and calculate EAR, MAR, and head position in real-time.
    • Deep Learning Inference (cnn_model.py): Uses a dlib face detector and a PyTorch model to classify the driver's state. This is run on a set interval to optimize performance.
  3. Scoring & Alerting: The results are weighted and combined. If the score exceeds a set threshold, an alert is triggered.
  4. AI Voice Generation: The GeminiAlertSystem sends a prompt to the Gemini API, generates a unique voice message using gTTS, and sends the audio data to the browser for playback.

πŸš€ Setup and Installation

Follow these steps to set up and run Drive Paddy on your local machine.

1. Clone the Repository

git clone https://github.com/<dev-tyta>/drive-paddy.git
cd drive-paddy

2. Set Up a Virtual Environment (Recommended)

python -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`

3. Install Dependencies

Install all required Python packages from the requirements.txt file.

pip install -r requirements.txt

4. Download the CNN Model

Run the provided script to download the pre-trained model from Hugging Face Hub.

python download_model.py

5. Configure Environment Variables

Create a .env file by copying the example file.

cp .env.example .env

Now, open the .env file and add your Gemini API key:

GEMINI_API_KEY="YOUR_GEMINI_API_KEY_GOES_HERE"

βš™οΈ Configuration

The application's behavior can be fine-tuned via the config.yaml file. You can adjust detection thresholds, change the detection strategy (geometric, cnn_model, or hybrid), and modify the weights for the hybrid scoring system without touching the source code.


▢️ Usage

To run the application, execute the following command from the root directory of the project:

streamlit run drive_paddy/main.py

Open your web browser and navigate to the local URL provided by Streamlit (usually http://localhost:8501).