# Gender Classification Quantized Model

This repository hosts a quantized version of a feedforward neural network trained for gender classification. The model has been optimized for efficient deployment while maintaining accuracy, making it suitable for resource-constrained environments.

---

## Model Details

- **Model Name:** Gender Classifier   
- **Model Architecture:** 2-layer MLP (Multi-Layer Perceptron)  
- **Task:** Gender Classification  
- **Dataset:** Gender Classification Dataset v7  
- **Quantization:** QInt8 (Dynamic Quantization)  
- **Framework:** PyTorch

---

## Usage

### Installation

```bash
pip install torch pandas scikit-learn numpy
```

### Loading the Quantized Model

```python
import json

import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler

# Define the model architecture (must match the saved checkpoint)
class GenderClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(7, 32),
            nn.ReLU(),
            nn.Linear(32, 2)
        )

    def forward(self, x):
        return self.fc(x)

# Recreate the dynamically quantized model, then load the quantized weights
model = GenderClassifier()
quantized_model = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
quantized_model.load_state_dict(torch.load("quantized_model/pytorch_model.bin"))
quantized_model.eval()

# Load configuration (contains the label encoder classes)
with open("quantized_model/config.json", "r") as f:
    config = json.load(f)

# Prepare your input data (7 numerical features; the zeros below are placeholders)
input_data = np.array([[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])

# Normalize with a StandardScaler fitted on the training data;
# transform() will raise an error until the scaler has been fitted
scaler = StandardScaler()
# scaler.fit(your_training_data)  # Fit on your training data first
input_normalized = scaler.transform(input_data)

# Convert to tensor
input_tensor = torch.tensor(input_normalized, dtype=torch.float32)

# Inference
with torch.no_grad():
    outputs = quantized_model(input_tensor)

# Get the predicted class index
predicted_class = outputs.argmax(dim=1).item()

# Map the index back to a label using the label encoder classes
label_mapping = {0: config["label_classes"][0], 1: config["label_classes"][1]}
print(f"Predicted Gender: {label_mapping[predicted_class]}")
```

---

## Performance Metrics

- **Model Size:** Linear-layer weights stored as INT8, about 4× smaller than FP32 for those weights (see the measurement sketch after this list)
- **Input Features:** 7 numerical features
- **Output Classes:** 2 (Binary gender classification)
- **Training Split:** 80% train, 20% validation
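
You can check the serialized sizes yourself with a sketch like the one below. `state_dict_size_bytes` is a hypothetical helper, not part of this repository, and note that for a model this small, fixed serialization overhead can dominate the roughly 4× savings on the weights themselves.

```python
import io

import torch
import torch.nn as nn

def state_dict_size_bytes(model: nn.Module) -> int:
    """Serialize the model's state_dict in memory and return its byte size."""
    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)
    return buffer.getbuffer().nbytes

fp32_model = GenderClassifier()  # class defined in the Usage section
int8_model = torch.quantization.quantize_dynamic(fp32_model, {nn.Linear}, dtype=torch.qint8)

print(f"FP32: {state_dict_size_bytes(fp32_model)} bytes")
print(f"INT8: {state_dict_size_bytes(int8_model)} bytes")
```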

---

## Training Details

### Dataset

The model was trained on the Gender Classification Dataset v7, featuring:
- 7 numerical input features
- Binary gender classification labels
- Preprocessed and normalized data
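
For reference, a minimal loading-and-splitting sketch consistent with the preprocessing described above. The CSV filename and the `gender` column name are assumptions, not confirmed by this repository.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical filename and label column for the v7 dataset
df = pd.read_csv("gender_classification_v7.csv")

X = df.drop(columns=["gender"]).values          # 7 numerical features (assumed column name)
y = LabelEncoder().fit_transform(df["gender"])  # encode labels as 0/1

# 80/20 split, as listed under Performance Metrics
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the scaler on the training split only, then apply it to both splits
scaler = StandardScaler().fit(X_train)
X_train, X_val = scaler.transform(X_train), scaler.transform(X_val)
```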

### Training Configuration

- **Epochs:** 10  
- **Batch Size:** 32  
- **Learning Rate:** 0.001  
- **Optimizer:** Adam  
- **Loss Function:** CrossEntropyLoss
- **Normalization:** StandardScaler
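
The training notebook is not reproduced here, but a minimal loop matching this configuration might look like the following sketch. The synthetic `X_train`/`y_train` tensors stand in for the real dataset, and `GenderClassifier` is the class defined in the Usage section.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-ins for the real training split (N samples x 7 features)
X_train = torch.randn(800, 7)
y_train = torch.randint(0, 2, (800,))

train_loader = DataLoader(TensorDataset(X_train, y_train), batch_size=32, shuffle=True)

model = GenderClassifier()  # class defined in the Usage section
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(10):
    running_loss = 0.0
    for features, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(features), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch + 1}/10 - loss: {running_loss / len(train_loader):.4f}")
```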

### Model Architecture

- **Input Layer:** 7 features
- **Hidden Layer:** 32 neurons with ReLU activation
- **Output Layer:** 2 neurons (binary classification)
- **Total Parameters:** 322 (first layer: 7×32 weights + 32 biases = 256; output layer: 32×2 weights + 2 biases = 66)
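
The count can be verified programmatically with the `GenderClassifier` class from the Usage section:

```python
model = GenderClassifier()  # class defined in the Usage section
total_params = sum(p.numel() for p in model.parameters())
print(total_params)  # 322
```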

### Quantization

Post-training dynamic quantization was applied using PyTorch's built-in quantization framework to reduce the model size and improve inference efficiency with QInt8 precision.
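
The export step likely resembled the following sketch, where `model` is assumed to be the trained FP32 `GenderClassifier` from the Usage section:

```python
import torch
import torch.nn as nn

# Convert Linear-layer weights to int8; activations are quantized dynamically at runtime
quantized_model = torch.quantization.quantize_dynamic(
    model,            # trained FP32 module to quantize
    {nn.Linear},      # layer types whose weights are stored as int8
    dtype=torch.qint8,
)

# Save in the location expected by the loading example above
torch.save(quantized_model.state_dict(), "quantized_model/pytorch_model.bin")
```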

---

## Repository Structure

```
.
β”œβ”€β”€ quantized_model/
β”‚   β”œβ”€β”€ config.json              # Model configuration
β”‚   β”œβ”€β”€ pytorch_model.bin        # Quantized model weights
β”‚   β”œβ”€β”€ model.safetensors        # Alternative model format
β”‚   β”œβ”€β”€ vocab.txt                # Feature names
β”‚   β”œβ”€β”€ tokenizer_config.json    # Scaler configuration
β”‚   └── special_tokens_map.json  # Label encoder metadata
β”œβ”€β”€ gender-classification.ipynb  # Training notebook
└── README.md                    # Model documentation
```

---

## Input Features

The model expects 7 numerical features as input. The exact feature names and preprocessing requirements are stored in the configuration files.
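
If you need the feature names at runtime, they can presumably be read from `vocab.txt` (see the repository structure above). The one-name-per-line layout assumed below is a guess about the file format, not something this repository documents:

```python
# Assumes vocab.txt lists one feature name per line, in model input order
with open("quantized_model/vocab.txt", "r") as f:
    feature_names = [line.strip() for line in f if line.strip()]

print(feature_names)  # expected: 7 names matching the model's input features
```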

---

## Limitations

- The model is designed for binary gender classification only
- Performance depends on how closely the inference data matches the training data distribution
- Quantization may introduce minor accuracy changes relative to the full-precision model
- Requires a StandardScaler fitted on the training data for proper feature scaling (see the persistence sketch below)
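
To address the last point, one common approach is to persist the fitted scaler alongside the model so that inference applies exactly the statistics learned at training time. This sketch uses `joblib` (installed with scikit-learn) and a hypothetical `X_train` array:

```python
import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the real training features (N samples x 7 features)
X_train = np.random.rand(800, 7)

# Fit once on the training data and persist the fitted scaler...
scaler = StandardScaler().fit(X_train)
joblib.dump(scaler, "scaler.joblib")

# ...then reload it at inference time so the exact same statistics are applied
scaler = joblib.load("scaler.joblib")
sample = np.random.rand(1, 7)  # placeholder for a real input row
input_normalized = scaler.transform(sample)
```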

---

## Contributing

Contributions are welcome! Feel free to open an issue or PR for improvements, fixes, or feature extensions.

---