# Gender Classification Quantized Model
This repository hosts a quantized version of a small feedforward neural network trained for gender classification. The model is optimized for efficient deployment while maintaining high accuracy, making it suitable for resource-constrained environments.
---
## Model Details
- **Model Name:** Gender Classifier
- **Model Architecture:** 2-layer MLP (Multi-Layer Perceptron)
- **Task:** Gender Classification
- **Dataset:** Gender Classification Dataset v7
- **Quantization:** QInt8 (Dynamic Quantization)
- **Framework:** PyTorch
---
## Usage
### Installation
```bash
pip install torch pandas scikit-learn numpy
```
### Loading the Quantized Model
```python
import json

import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler

# Define the model architecture (must match the trained model exactly)
class GenderClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(7, 32),
            nn.ReLU(),
            nn.Linear(32, 2),
        )

    def forward(self, x):
        return self.fc(x)

# Quantize the freshly built architecture, then load the quantized weights
model = GenderClassifier()
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
quantized_model.load_state_dict(torch.load("quantized_model/pytorch_model.bin"))
quantized_model.eval()

# Load configuration (includes the label classes)
with open("quantized_model/config.json", "r") as f:
    config = json.load(f)

# Prepare your input data: one row of 7 numerical features
# (feature1 ... feature7 are placeholders for your actual values)
input_data = np.array([[feature1, feature2, feature3, feature4,
                        feature5, feature6, feature7]])

# Normalize with a StandardScaler fitted on the training data
scaler = StandardScaler()
# scaler.fit(training_features)  # must be fitted on the training data first
input_normalized = scaler.transform(input_data)

# Convert to tensor
input_tensor = torch.tensor(input_normalized, dtype=torch.float32)

# Inference
with torch.no_grad():
    outputs = quantized_model(input_tensor)

# Map the predicted class index back to a label via the config
predicted_class = outputs.argmax(dim=1).item()
label_mapping = {0: config["label_classes"][0], 1: config["label_classes"][1]}
print(f"Predicted Gender: {label_mapping[predicted_class]}")
```
---
## Performance Metrics
- **Model Size:** Reduced via QInt8 dynamic quantization (Linear weights stored as 8-bit integers instead of 32-bit floats)
- **Input Features:** 7 numerical features
- **Output Classes:** 2 (Binary gender classification)
- **Training Split:** 80% train, 20% validation
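If you want to verify the size reduction yourself, here is a quick sketch that serializes both state dicts in memory and compares sizes (it reuses `model` and `quantized_model` from the loading snippet above). For a network this small, serialization overhead dominates, so the on-disk saving is modest relative to the 4× weight compression.
```python
import io
import torch

def state_dict_size_mb(m):
    """Serialize a model's state_dict to an in-memory buffer and report its size."""
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes / 1e6

print(f"float32: {state_dict_size_mb(model):.3f} MB")
print(f"qint8:   {state_dict_size_mb(quantized_model):.3f} MB")
```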
---
## Training Details
### Dataset
The model was trained on the Gender Classification Dataset v7, featuring:
- 7 numerical input features
- Binary gender classification labels
- Preprocessed and normalized data
### Training Configuration
- **Epochs:** 10
- **Batch Size:** 32
- **Learning Rate:** 0.001
- **Optimizer:** Adam
- **Loss Function:** CrossEntropyLoss
- **Normalization:** StandardScaler
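Putting the configuration above together, here is a minimal sketch of the training loop. The file and column names (`gender_classification_v7.csv`, `gender`) are assumptions, and `GenderClassifier` is the class from the loading snippet above.
```python
import pandas as pd
import torch
import torch.nn as nn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical file/column names -- substitute your own
df = pd.read_csv("gender_classification_v7.csv")
X = df.drop(columns=["gender"]).values
y = LabelEncoder().fit_transform(df["gender"])

# 80/20 split and StandardScaler normalization, as configured above
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler().fit(X_train)

train_ds = TensorDataset(
    torch.tensor(scaler.transform(X_train), dtype=torch.float32),
    torch.tensor(y_train, dtype=torch.long),
)
loader = DataLoader(train_ds, batch_size=32, shuffle=True)

model = GenderClassifier()  # defined in the loading snippet above
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

# 10 epochs of standard minibatch training
for epoch in range(10):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```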
### Model Architecture
- **Input Layer:** 7 features
- **Hidden Layer:** 32 neurons with ReLU activation
- **Output Layer:** 2 neurons (binary classification)
- **Total Parameters:** 322 (7×32 + 32 = 256 in the hidden layer, 32×2 + 2 = 66 in the output layer)
### Quantization
Post-training dynamic quantization was applied using PyTorch's built-in quantization framework to reduce the model size and improve inference efficiency with QInt8 precision.
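For reference, the quantization step amounts to a single call on the trained float32 model, mirroring the loading snippet above (`model` here is the trained instance from the training sketch):
```python
import torch
import torch.nn as nn

# Convert Linear layer weights to qint8; activations remain float32 and are
# quantized dynamically at inference time
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
torch.save(quantized_model.state_dict(), "quantized_model/pytorch_model.bin")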
---
## Repository Structure
```
.
├── quantized_model/
│   ├── config.json              # Model configuration
│   ├── pytorch_model.bin        # Quantized model weights
│   ├── model.safetensors        # Alternative model format
│   ├── vocab.txt                # Feature names
│   ├── tokenizer_config.json    # Scaler configuration
│   └── special_tokens_map.json  # Label encoder metadata
├── gender-classification.ipynb  # Training notebook
└── README.md                    # Model documentation
```
---
## Input Features
The model expects 7 numerical features as input. The exact feature names and preprocessing requirements are stored in the configuration files.
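The repository's `vocab.txt` is repurposed to hold the feature names. A minimal sketch for inspecting them, assuming one name per line (the exact file format is an assumption, so check the file contents first):
```python
# Read the feature names shipped with the model (format assumed: one per line)
with open("quantized_model/vocab.txt") as f:
    feature_names = [line.strip() for line in f if line.strip()]

print(feature_names)  # should list 7 names, in the order the model expects
```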
---
## Limitations
- The model performs binary gender classification only
- Performance depends on how closely the inference data matches the training data distribution
- Quantization may introduce minor accuracy differences relative to the full-precision model
- Inputs must be scaled with a StandardScaler fitted on the training data (see the sketch below)
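Because the scaling requirement is easy to get wrong, here is a minimal sketch of persisting the fitted scaler so inference reuses the exact training statistics. The use of `joblib` and the file name `scaler.joblib` are choices of this example, not part of the repository.
```python
import joblib
from sklearn.preprocessing import StandardScaler

# At training time: fit on the raw training features and save the scaler.
# X_train is a placeholder for your (n_samples, 7) training feature matrix.
scaler = StandardScaler().fit(X_train)
joblib.dump(scaler, "scaler.joblib")

# At inference time: reload and apply the same statistics to new inputs
scaler = joblib.load("scaler.joblib")
input_normalized = scaler.transform(input_data)  # input_data: shape (1, 7)
```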
---
## Contributing
Contributions are welcome! Feel free to open an issue or PR for improvements, fixes, or feature extensions.
---