# ResNet-50 Fine-Tuned Model for Vehicle Type Classification
This repository hosts a **fine-tuned ResNet-50 model** for **Vehicle Type Classification**, trained on a subset of the **MIO-TCD Traffic Dataset**. This model is designed for **traffic management applications**, enabling real-time and accurate recognition of different vehicle types, such as cars, trucks, buses, and motorcycles.
## Model Details
- **Model Architecture:** ResNet-50
- **Task:** Vehicle Type Classification
- **Dataset:** MIO-TCD (Subset from Kaggle: `miotcd-dataset-50000-imagesclassification`)
- **Number of Classes:** 11 vehicle categories
- **Fine-tuning Framework:** PyTorch (`torchvision.models.resnet50`)
- **Optimization:** Trained with Adam optimizer and data augmentation for robust performance
## Usage
### Installation
Ensure you have the required dependencies installed:
```sh
pip install torch torchvision pillow
```
### Loading the Model
```python
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
# Define the model architecture
resnet50 = models.resnet50(weights=None)  # `pretrained=` is deprecated in torchvision >= 0.13
# Modify the last layer to match the number of classes (11)
num_ftrs = resnet50.fc.in_features
resnet50.fc = torch.nn.Linear(num_ftrs, 11)
# Load trained model weights
# map_location="cpu" lets the checkpoint load on machines without a GPU
resnet50.load_state_dict(torch.load("fine_tuned_model/pytorch_model.bin", map_location="cpu"))
resnet50.eval() # Set model to evaluation mode
print("Model loaded successfully!")
# Load class names
with open("fine_tuned_model/classes.txt", "r") as f:
    class_names = f.read().splitlines()
print("Classes:", class_names)
# Define image transformations (same as training)
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize to match ResNet-50 input size
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])  # ImageNet mean and std
])
# Load the custom image
image_path = "/kaggle/input/sample-image-1/pickup_truck_sample_image.jpg" # Change this to your test image path
image = Image.open(image_path).convert("RGB") # Open image and convert to RGB
input_tensor = transform(image).unsqueeze(0) # Add batch dimension
# Move to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
resnet50 = resnet50.to(device)
input_tensor = input_tensor.to(device)
# Get predictions
with torch.no_grad():
    outputs = resnet50(input_tensor)
    _, predicted_class = torch.max(outputs, 1)  # Get the class with highest score
# Print the result
print(f"Predicted Vehicle Type: {class_names[predicted_class.item()]}")
```
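The snippet above reports only the single highest-scoring class. To see how confident the model is, the raw scores can be turned into probabilities with a softmax and ranked. The following is a minimal sketch in plain Python that illustrates the arithmetic; the helper names and the toy logits are hypothetical, and in a PyTorch pipeline `torch.softmax(outputs, dim=1)` followed by `torch.topk` would do the same job.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k(logits, class_names, k=3):
    """Return the k most likely (class_name, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(zip(class_names, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical scores for a three-class toy example (not real model output)
example_names = ["bus", "car", "pickup_truck"]
example_logits = [0.5, 2.0, 3.0]

for name, prob in top_k(example_logits, example_names, k=2):
    print(f"{name}: {prob:.3f}")
```

A low top-1 probability, or two classes with similar probabilities, is a hint that the prediction should be treated with caution.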
## Performance Metrics
- **Validation Accuracy:** High accuracy on held-out data (exact figures are not reported here)
- **Inference Speed:** Intended for real-time classification; actual throughput depends on hardware
- **Robustness:** Trained with data augmentation to handle variations in lighting and angles
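Since no numeric figures are listed here, it can be useful to compute accuracy on your own labelled images. The sketch below assumes you have already collected true and predicted class names as two lists; the function name and the sample labels are made up for illustration.

```python
from collections import Counter

def accuracy_report(true_labels, predicted_labels):
    """Return (overall accuracy, per-class accuracy) for two label lists."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    total = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    overall = sum(correct.values()) / len(true_labels)
    per_class = {label: correct[label] / count for label, count in total.items()}
    return overall, per_class

# Made-up labels, only to show the call
y_true = ["car", "car", "bus", "pickup_truck"]
y_pred = ["car", "pickup_truck", "bus", "pickup_truck"]
overall, per_class = accuracy_report(y_true, y_pred)
print(f"overall: {overall:.2f}")
for label, acc in sorted(per_class.items()):
    print(f"{label}: {acc:.2f}")
```

Per-class accuracy is worth checking alongside the overall number, because a class with few examples can perform poorly without moving the overall figure much.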
## Dataset Details
The dataset consists of **50,000 images** across **11 vehicle types**, structured in the following folders:
- **articulated_truck**
- **bicycle**
- **bus**
- **car**
- **motorcycle**
- **non-motorized_vehicle**
- **pedestrian**
- **pickup_truck**
- **single_unit_truck**
- **work_van**
- **unknown**
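The order of this list is not necessarily the order of the model's output indices. If the training script used `torchvision.datasets.ImageFolder` (an assumption; the README does not say), class indices follow the sorted folder names, which puts `unknown` before `work_van`. The sketch below shows the difference; in practice, `fine_tuned_model/classes.txt` should be treated as the authoritative mapping.

```python
# Folder names in the order they are listed in this README
readme_order = [
    "articulated_truck", "bicycle", "bus", "car", "motorcycle",
    "non-motorized_vehicle", "pedestrian", "pickup_truck",
    "single_unit_truck", "work_van", "unknown",
]

# ImageFolder assigns indices to class folders in sorted order
# (assumption: the training script relied on this behaviour)
sorted_order = sorted(readme_order)
class_to_idx = {name: idx for idx, name in enumerate(sorted_order)}

for name, idx in class_to_idx.items():
    print(idx, name)

# Classes whose index differs between the two orderings
moved = [n for n in readme_order if readme_order.index(n) != class_to_idx[n]]
print("index differs for:", moved)
```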
### Training Details
- **Number of Epochs:** 10
- **Batch Size:** 32
- **Optimizer:** Adam
- **Learning Rate:** 1e-4
- **Loss Function:** Cross-Entropy Loss
- **Data Augmentation:** Horizontal flipping, random cropping, normalization
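These settings determine how many optimizer updates the run performs. The arithmetic below is a rough sketch: the batch size and epoch count come from the list above, while the 80/20 train/validation split is a hypothetical assumption, since the actual split is not stated.

```python
import math

TOTAL_IMAGES = 50_000   # dataset size stated in this README
BATCH_SIZE = 32         # from the training details
EPOCHS = 10             # from the training details
TRAIN_FRACTION = 0.8    # hypothetical split, not stated in the README

train_images = int(TOTAL_IMAGES * TRAIN_FRACTION)
# The last batch may be smaller than BATCH_SIZE, so round up
steps_per_epoch = math.ceil(train_images / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(f"training images: {train_images}")
print(f"steps per epoch: {steps_per_epoch}")
print(f"total optimizer steps: {total_steps}")
```

With a different split, only `TRAIN_FRACTION` needs to change; using all 50,000 images for training would give `math.ceil(50_000 / 32)` steps per epoch instead.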
## Repository Structure
```
.
├── fine_tuned_model/        # Contains the fine-tuned model files
│   ├── pytorch_model.bin    # Model weights
│   └── classes.txt          # Class labels
├── dataset/                 # Training dataset (MIO-TCD subset)
├── scripts/                 # Training and evaluation scripts
└── README.md                # Model documentation
```
## Limitations
- The model is trained specifically on the **MIO-TCD dataset** and may not generalize well to images from different sources.
- Accuracy may vary based on real-world conditions such as lighting, occlusion, and camera angles.
- A GPU is recommended for fast inference; the model also runs on CPU, but more slowly.
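Whether inference is fast enough for a given deployment is best checked by measuring it on the target hardware. The helper below is a minimal, framework-agnostic sketch: `measure_latency` and `dummy_predict` are hypothetical names, and the dummy function only stands in for a real call such as `resnet50(input_tensor)`.

```python
import statistics
import time

def measure_latency(predict_fn, sample, warmup=3, runs=20):
    """Return (median, worst) latency in milliseconds for predict_fn(sample)."""
    for _ in range(warmup):  # warm-up calls are not timed
        predict_fn(sample)
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings_ms), max(timings_ms)

def dummy_predict(values):
    """Stand-in workload; replace with the real model call."""
    return sum(v * v for v in values)

median_ms, worst_ms = measure_latency(dummy_predict, list(range(1000)))
print(f"median: {median_ms:.3f} ms, worst: {worst_ms:.3f} ms")
```

When timing a model on a CUDA device, call `torch.cuda.synchronize()` inside `predict_fn` before returning, because GPU kernels run asynchronously and the timer would otherwise stop too early.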
## Contributing
Contributions are welcome! If you have suggestions for improvement, feel free to submit a pull request or open an issue.