---
title: YoloV3 PascalVOC Dataset
emoji: 🐠
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 3.41.1
app_file: app.py
pinned: false
license: mit
---
# Object Detection App for a YOLOv3 Model Trained on the Pascal VOC Dataset
## How to Use the App
The app has a single tab:
- Examples: This tab displays a gallery of images from the Pascal VOC dataset. Select any one of the images from the list shown below and load it into the app widget. Choose the IoU threshold and confidence threshold using the sliders, then click Submit to run inference and display the detected objects together with the top classes and their confidence levels. Class activation maps are visualized with GradCAM; the overlay can be shown or hidden and its transparency adjusted, and the second-to-last layer of the YOLOv3 network is used to compute the GradCAM heatmap (a minimal sketch of this computation is shown below).
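For readers curious how such an overlay can be produced, here is a minimal, generic GradCAM sketch in PyTorch. It is not the app's actual code: `model`, `layer` (e.g. the second-to-last layer mentioned above), and `score_fn` (which reduces the model output to a scalar score for the class of interest) are assumed to be supplied by the caller, and `alpha` plays the role of the Transparency slider.

```python
import torch
import torch.nn.functional as F

def gradcam_overlay(model, layer, image_tensor, score_fn, alpha=0.6):
    """Compute a GradCAM heatmap from `layer` and blend it with the image.

    image_tensor: (3, H, W) tensor with values in [0, 1].
    model, layer, score_fn: placeholders for the app's actual objects.
    alpha: overlay transparency (the Transparency slider).
    """
    activations, gradients = [], []

    # Capture the layer's forward activations and the gradients flowing back into it.
    fwd = layer.register_forward_hook(lambda m, i, o: activations.append(o))
    bwd = layer.register_full_backward_hook(lambda m, gi, go: gradients.append(go[0]))

    try:
        output = model(image_tensor.unsqueeze(0))   # add batch dimension
        score = score_fn(output)                    # scalar score to explain
        model.zero_grad()
        score.backward()
    finally:
        fwd.remove()
        bwd.remove()

    act, grad = activations[0], gradients[0]        # each (1, K, h, w)
    weights = grad.mean(dim=(2, 3), keepdim=True)   # global-average-pool the gradients
    cam = F.relu((weights * act).sum(dim=1, keepdim=True))  # weighted channel sum
    cam = F.interpolate(cam, size=image_tensor.shape[1:],
                        mode="bilinear", align_corners=False)[0, 0]
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]

    # Blend the heatmap with the original image; alpha controls the overlay strength.
    heatmap = cam.unsqueeze(0).repeat(3, 1, 1)
    blended = (1 - alpha) * image_tensor + alpha * heatmap
    return blended.clamp(0, 1).permute(1, 2, 0).detach().numpy()
```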
### Examples Tab - Options
- Input Image: Select one of the example images from the given list.
- IoU threshold: Move the slider to a float value between 0 and 1 to control how aggressively overlapping boxes are suppressed, so that a single bounding box best covers each object. The default value differs for each image in the gallery and is chosen so that redundant bounding boxes do not clutter the image.
- Threshold: Move the slider to a float value between 0 and 1 to set the confidence threshold below which detections are discarded. The default value differs for each image in the gallery.
- Enable GradCAM: Check this box to display the GradCAM overlay on the input image. Uncheck it to view only the original image.
- Transparency: Control the transparency of the GradCAM overlay. The default value is 0.6.
After adjusting the settings, click the "Submit" button to see the results.
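For orientation, the sketch below shows how controls like these can be wired together with Gradio. It is not the app's actual `app.py`: `run_yolo_v3` is a hypothetical placeholder for the real inference code, and the default slider values are illustrative.

```python
import gradio as gr

def run_yolo_v3(image, iou_threshold, conf_threshold, enable_gradcam, transparency):
    """Placeholder for the app's real inference: YOLOv3 forward pass,
    confidence filtering, NMS with iou_threshold, and an optional GradCAM
    blend using transparency as the overlay alpha."""
    return image, {"no model loaded": 1.0}

def detect(image, iou_threshold, conf_threshold, enable_gradcam, transparency):
    annotated, top_classes = run_yolo_v3(
        image, iou_threshold, conf_threshold, enable_gradcam, transparency
    )
    return annotated, top_classes

demo = gr.Interface(
    fn=detect,
    inputs=[
        gr.Image(label="Input Image"),
        gr.Slider(0, 1, value=0.5, label="IoU threshold"),  # illustrative default
        gr.Slider(0, 1, value=0.4, label="Threshold"),      # illustrative default
        gr.Checkbox(value=True, label="Enable GradCAM"),
        gr.Slider(0, 1, value=0.6, label="Transparency"),
    ],
    outputs=[gr.Image(label="Detections"), gr.Label(label="Top classes")],
)

if __name__ == "__main__":
    demo.launch()
```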
## Source Code for Training the Model
The main code used to train the model can be viewed at:
https://github.com/mHemaAP/S13
## Credits
- This app is built using the Gradio library (https://www.gradio.app/) for interactive model interfaces.
- The PyTorch library (https://pytorch.org/) is used for the deep learning model and GradCAM visualization.
- The PASCAL VOC dataset (https://www.kaggle.com/datasets/aladdinpersson/pascal-voc-dataset-used-in-yolov3-video) is used for training and evaluation.
## License
This project is licensed under the MIT License - see the LICENSE file for details.
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference