|
# ResNet-50 Fine-Tuned Model for Vehicle Type Classification |
|
|
|
This repository hosts a **fine-tuned ResNet-50 model** for **Vehicle Type Classification**, trained on a subset of the **MIO-TCD Traffic Dataset**. This model is designed for **traffic management applications**, enabling real-time and accurate recognition of different vehicle types, such as cars, trucks, buses, and motorcycles. |
|
|
|
## Model Details |
|
|
|
- **Model Architecture:** ResNet-50 |
|
- **Task:** Vehicle Type Classification |
|
- **Dataset:** MIO-TCD (Subset from Kaggle: `miotcd-dataset-50000-imagesclassification`) |
|
- **Number of Classes:** 11 vehicle categories |
|
- **Fine-tuning Framework:** PyTorch (`torchvision.models.resnet50`) |
|
- **Optimization:** Trained with Adam optimizer and data augmentation for robust performance |
|
|
|
|
|
## Usage |
|
|
|
### Installation |
|
|
|
Ensure you have the required dependencies installed: |
|
|
|
```sh |
|
pip install torch torchvision pillow |
|
``` |
|
|
|
### Loading the Model |
|
|
|
```python |
|
import torch |
|
import torchvision.models as models |
|
import torchvision.transforms as transforms |
|
from PIL import Image |
|
|
|
# Define the model architecture |
|
resnet50 = models.resnet50(pretrained=False) |
|
|
|
# Modify the last layer to match the number of classes (11) |
|
num_ftrs = resnet50.fc.in_features |
|
resnet50.fc = torch.nn.Linear(num_ftrs, 11) |
|
|
|
# Load trained model weights |
|
resnet50.load_state_dict(torch.load("fine_tuned_model/pytorch_model.bin")) |
|
resnet50.eval() # Set model to evaluation mode |
|
|
|
print("Model loaded successfully!") |
|
|
|
|
|
# Load class names |
|
with open("fine_tuned_model/classes.txt", "r") as f: |
|
class_names = f.read().splitlines() |
|
|
|
print("Classes:", class_names) |
|
|
|
|
|
# Define image transformations (same as training) |
|
transform = transforms.Compose([ |
|
transforms.Resize((224, 224)), # Resize to match ResNet-50 input size |
|
transforms.ToTensor(), |
|
transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) # Normalization |
|
]) |
|
|
|
# Load the custom image |
|
image_path = "/kaggle/input/sample-image-1/pickup_truck_sample_image.jpg" # Change this to your test image path |
|
image = Image.open(image_path).convert("RGB") # Open image and convert to RGB |
|
input_tensor = transform(image).unsqueeze(0) # Add batch dimension |
|
|
|
|
|
|
|
# Move to GPU if available |
|
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") |
|
resnet50 = resnet50.to(device) |
|
input_tensor = input_tensor.to(device) |
|
|
|
# Get predictions |
|
with torch.no_grad(): |
|
outputs = resnet50(input_tensor) |
|
_, predicted_class = torch.max(outputs, 1) # Get the class with highest score |
|
|
|
# Print the result |
|
print(f"Predicted Vehicle Type: {class_names[predicted_class.item()]}") |
|
|
|
``` |
|
|
|
## Performance Metrics |
|
|
|
- **Validation Accuracy:** High accuracy achieved on the test dataset |
|
- **Inference Speed:** Optimized for real-time classification |
|
- **Robustness:** Trained with data augmentation to handle variations in lighting and angles |
|
|
|
## Dataset Details |
|
|
|
The dataset consists of **50,000 images** across **11 vehicle types**, structured in the following folders: |
|
- **articulated_truck** |
|
- **bicycle** |
|
- **bus** |
|
- **car** |
|
- **motorcycle** |
|
- **non-motorized_vehicle** |
|
- **pedestrian** |
|
- **pickup_truck** |
|
- **single_unit_truck** |
|
- **work_van** |
|
- **unknown** |
|
|
|
### Training Details |
|
|
|
- **Number of Epochs:** 10 |
|
- **Batch Size:** 32 |
|
- **Optimizer:** Adam |
|
- **Learning Rate:** 1e-4 |
|
- **Loss Function:** Cross-Entropy Loss |
|
- **Data Augmentation:** Horizontal flipping, random cropping, normalization |
|
|
|
## Repository Structure |
|
|
|
``` |
|
. |
|
βββ fine_tuned_model/ # Contains the fine-tuned model files |
|
β βββ pytorch_model.bin # Model weights |
|
β βββ classes.txt # Class labels |
|
βββ dataset/ # Training dataset (MIO-TCD subset) |
|
βββ scripts/ # Training and evaluation scripts |
|
βββ README.md # Model documentation |
|
``` |
|
|
|
## Limitations |
|
|
|
- The model is trained specifically on the **MIO-TCD dataset** and may not generalize well to images from different sources. |
|
- Accuracy may vary based on real-world conditions such as lighting, occlusion, and camera angles. |
|
- Requires GPU for faster inference. |
|
|
|
## Contributing |
|
|
|
Contributions are welcome! If you have suggestions for improvement, feel free to submit a pull request or open an issue. |
|
|
|
|
|
|