Upload 7 files
- README.md +160 -0
- config (1).json +1 -0
- model (2).safetensors +3 -0
- pytorch_model.bin +3 -0
- special_tokens_map (1).json +1 -0
- tokenizer_config (1).json +1 -0
- vocab (1).txt +7 -0
README.md
ADDED
@@ -0,0 +1,160 @@
# Gender Classification Quantized Model

This repository hosts a quantized feedforward neural network trained for binary gender classification. Dynamic QInt8 quantization reduces the model's size for deployment in resource-constrained environments, with the aim of preserving the accuracy of the full-precision model.

---

## Model Details

- **Model Name:** Gender Classifier
- **Model Architecture:** 2-layer MLP (Multi-Layer Perceptron)
- **Task:** Gender Classification
- **Dataset:** Gender Classification Dataset v7
- **Quantization:** QInt8 (Dynamic Quantization)
- **Framework:** PyTorch

---

## Usage

### Installation

```bash
pip install torch pandas scikit-learn numpy
```

### Loading the Quantized Model

```python
import json

import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler


# Define the model architecture (it must match the network that was quantized)
class GenderClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(7, 32),
            nn.ReLU(),
            nn.Linear(32, 2),
        )

    def forward(self, x):
        return self.fc(x)


# Rebuild the quantized module structure, then load the quantized weights into it
model = GenderClassifier()
quantized_model = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
quantized_model.load_state_dict(torch.load("quantized_model/pytorch_model.bin"))
quantized_model.eval()

# Load configuration (feature names and label classes)
with open("quantized_model/config.json", "r") as f:
    config = json.load(f)

# Example usage
# Prepare your input data: 7 features, in the order given by config["input_features"].
# feature1 ... feature7 are placeholders for your own values.
input_data = np.array([[feature1, feature2, feature3, feature4, feature5, feature6, feature7]])

# Normalize using StandardScaler. The scaler must be fitted before transform() is called;
# your_training_data is a placeholder for the training feature matrix.
scaler = StandardScaler()
scaler.fit(your_training_data)
input_normalized = scaler.transform(input_data)

# Convert to tensor
input_tensor = torch.tensor(input_normalized, dtype=torch.float32)

# Inference
with torch.no_grad():
    outputs = quantized_model(input_tensor)

# Get predicted label (index of the largest logit)
predicted_class = outputs.argmax(dim=1).item()

# Map the class index to its name using the label encoder classes
label_mapping = {0: config["label_classes"][0], 1: config["label_classes"][1]}
print(f"Predicted Gender: {label_mapping[predicted_class]}")
```

---

## Performance Metrics

- **Model Size:** Reduced through QInt8 dynamic quantization (`pytorch_model.bin` in this upload is 3,912 bytes)
- **Input Features:** 7 numerical features
- **Output Classes:** 2 (binary gender classification)
- **Training Split:** 80% train, 20% validation

---

## Training Details

### Dataset

The model was trained on the Gender Classification Dataset v7, which provides:

- 7 numerical input features
- Binary gender classification labels
- Preprocessed and normalized data

### Training Configuration

- **Epochs:** 10
- **Batch Size:** 32
- **Learning Rate:** 0.001
- **Optimizer:** Adam
- **Loss Function:** CrossEntropyLoss
- **Normalization:** StandardScaler

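
The configuration above can be sketched as a short PyTorch training loop. This is a minimal sketch, not the author's training notebook: the data is synthetic (random features with a made-up labeling rule), and all variable names are illustrative. Because the synthetic inputs are drawn from a standard normal distribution, the StandardScaler step is skipped here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Synthetic stand-in data (NOT the real dataset): 500 samples, 7 features, 2 classes.
X = torch.randn(500, 7)
y = (X[:, 0] + X[:, 3] > 0).long()  # made-up labeling rule, for illustration only

# 80% train / 20% validation split.
dataset = TensorDataset(X, y)
n_train = int(0.8 * len(dataset))
train_set, val_set = random_split(dataset, [n_train, len(dataset) - n_train])

# Same layer sizes as the documented architecture: 7 -> 32 -> 2.
model = nn.Sequential(nn.Linear(7, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

for epoch in range(10):
    model.train()
    for features, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(features), labels)
        loss.backward()
        optimizer.step()

# Evaluate on the held-out 20% in a single batch.
model.eval()
val_loader = DataLoader(val_set, batch_size=len(val_set))
with torch.no_grad():
    val_features, val_labels = next(iter(val_loader))
    predictions = model(val_features).argmax(dim=1)
    val_accuracy = (predictions == val_labels).float().mean().item()

print(f"validation accuracy on synthetic data: {val_accuracy:.2f}")
```

The accuracy printed here says nothing about the published model; it only confirms that the loop runs end to end with the listed hyperparameters.
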
### Model Architecture

- **Input Layer:** 7 features
- **Hidden Layer:** 32 neurons with ReLU activation
- **Output Layer:** 2 neurons (binary classification)
- **Total Parameters:** 322 (288 weights and 34 biases)

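
The parameter count follows directly from the layer sizes. As a quick check, the sketch below counts weights and biases for a fully connected 7 → 32 → 2 network; the helper name `count_parameters` is illustrative. The figure 288 corresponds to the weights alone, and including the biases gives 322.

```python
# Layer sizes taken from the architecture listed above: 7 -> 32 -> 2.
layer_sizes = [7, 32, 2]


def count_parameters(sizes):
    """Return (weights, biases) for a fully connected network with these layer sizes."""
    weights = sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))
    biases = sum(sizes[1:])
    return weights, biases


weights, biases = count_parameters(layer_sizes)
print(f"weights={weights}, biases={biases}, total={weights + biases}")
# weights=288, biases=34, total=322
```
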
### Quantization

Post-training dynamic quantization was applied with PyTorch's built-in quantization framework (`torch.quantization.quantize_dynamic`), storing the `nn.Linear` weights in QInt8 precision to reduce model size and improve inference efficiency.

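
To give a feel for what 8-bit weight quantization does, here is a simplified, pure-Python illustration of symmetric per-tensor quantization. It is an assumption-laden teaching sketch, not PyTorch's implementation: PyTorch uses its own observers and backend kernels, and the weight values below are made up.

```python
def quantize_symmetric_int8(weights):
    """Map floats to integers in [-128, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-128, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]


# Made-up weight values, for illustration only.
weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_symmetric_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("largest round-trip error:", max_error)
```

Each weight is stored in one byte instead of four, and the round-trip error is bounded by half the scale, which is why small accuracy changes are possible after quantization.
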

---

## Repository Structure

```
.
├── quantized_model/
│   ├── config.json               # Model configuration
│   ├── pytorch_model.bin         # Quantized model weights
│   ├── model.safetensors         # Alternative model format
│   ├── vocab.txt                 # Feature names
│   ├── tokenizer_config.json     # Scaler configuration
│   └── special_tokens_map.json   # Label encoder metadata
├── gender-classification.ipynb   # Training notebook
└── README.md                     # Model documentation
```

---

## Input Features
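
The model expects 7 numerical features as input, in the order listed in `config.json` and `vocab.txt`: `long_hair`, `forehead_width_cm`, `forehead_height_cm`, `nose_wide`, `nose_long`, `lips_thin`, and `distance_nose_to_lip_long`. The configuration files name the scaler type (StandardScaler) but do not contain its fitted parameters.
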
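
Since the model sees only a positional vector, the features have to arrive in a fixed order. The sketch below assumes that order is the one in the `input_features` entry of `config.json` from this upload; the helper `build_feature_row` and the sample measurements are hypothetical.

```python
import json

# config.json as it appears in this upload.
CONFIG_JSON = (
    '{"input_features": ["long_hair", "forehead_width_cm", "forehead_height_cm", '
    '"nose_wide", "nose_long", "lips_thin", "distance_nose_to_lip_long"], '
    '"label_classes": ["Female", "Male"], "scaling": "StandardScaler", '
    '"model_architecture": "2-layer MLP"}'
)
config = json.loads(CONFIG_JSON)


def build_feature_row(values, feature_names):
    """Order a {name: value} mapping into a list following feature_names."""
    missing = [name for name in feature_names if name not in values]
    unknown = [name for name in values if name not in feature_names]
    if missing or unknown:
        raise ValueError(f"missing features: {missing}, unknown features: {unknown}")
    return [float(values[name]) for name in feature_names]


# Placeholder measurements, for illustration only.
sample = {
    "forehead_width_cm": 12.5,
    "forehead_height_cm": 5.8,
    "long_hair": 1,
    "nose_wide": 0,
    "nose_long": 0,
    "lips_thin": 1,
    "distance_nose_to_lip_long": 0,
}
row = build_feature_row(sample, config["input_features"])
print(row)
# [1.0, 12.5, 5.8, 0.0, 0.0, 1.0, 0.0]
```

The resulting row still needs to be standardized before it is passed to the model.
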

---

## Limitations

- The model is designed for binary gender classification only
- Performance depends on the similarity between inference data and training data distribution
- Quantization may result in minor accuracy changes compared to the full-precision model
- Requires proper feature scaling using StandardScaler fitted on training data

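
The scaling requirement means the fitted statistics have to be available at inference time. One possible approach, sketched here in plain Python, is to store the per-feature mean and scale as JSON and reapply them later. The functions `fit_scaler` and `transform` are hypothetical helpers that use the same statistics as StandardScaler (mean and population standard deviation), and the data is a toy example rather than the training set.

```python
import json
from statistics import fmean, pstdev


def fit_scaler(rows):
    """Compute per-feature mean and population standard deviation."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Constant features get scale 1.0 to avoid division by zero,
    # which mirrors how StandardScaler treats zero-variance columns.
    scales = [pstdev(col) or 1.0 for col in columns]
    return {"mean": means, "scale": scales}


def transform(row, params):
    """Standardize one row: z = (x - mean) / scale."""
    return [(x - m) / s for x, m, s in zip(row, params["mean"], params["scale"])]


# Toy two-feature data, for illustration only.
train_rows = [[1.0, 10.0], [0.0, 12.0], [1.0, 14.0], [0.0, 16.0]]

# Round-trip through JSON to show the parameters survive serialization.
params = json.loads(json.dumps(fit_scaler(train_rows)))
print(transform([1.0, 13.0], params))
# [1.0, 0.0]
```
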

---

## Contributing

Contributions are welcome! Feel free to open an issue or PR for improvements, fixes, or feature extensions.

---
config (1).json
ADDED
@@ -0,0 +1 @@
{"input_features": ["long_hair", "forehead_width_cm", "forehead_height_cm", "nose_wide", "nose_long", "lips_thin", "distance_nose_to_lip_long"], "label_classes": ["Female", "Male"], "scaling": "StandardScaler", "model_architecture": "2-layer MLP"}
model (2).safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bb4973410ad93309f22dd9cda9cc407d625865c62295ca22521a4d5bd0994620
size 296
pytorch_model.bin
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:59b193ba017a4a9e58cbb4b97855d7cd34cd381dea45f04b1e5a237a2be9de32
size 3912
special_tokens_map (1).json
ADDED
@@ -0,0 +1 @@
{"label_encoder": "sklearn.LabelEncoder"}
tokenizer_config (1).json
ADDED
@@ -0,0 +1 @@
{"scaler": "StandardScaler"}
vocab (1).txt
ADDED
@@ -0,0 +1,7 @@
long_hair
forehead_width_cm
forehead_height_cm
nose_wide
nose_long
lips_thin
distance_nose_to_lip_long