shukdevdatta123 committed on
Commit 47a0996 · verified · 1 Parent(s): e48b962

Update README.md

Files changed (1)
  1. README.md +93 -91
README.md CHANGED
@@ -1,34 +1,31 @@
  ---
  base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
  library_name: peft
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-

  ## Model Details

  ### Model Description

- <!-- Provide a longer summary of what this model is. -->
-
-

  - **Developed by:** [More Information Needed]
  - **Funded by [optional]:** [More Information Needed]
  - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
  - **License:** [More Information Needed]
  - **Finetuned from model [optional]:** [More Information Needed]

  ### Model Sources [optional]

- <!-- Provide the basic links for the model. -->
-
  - **Repository:** [More Information Needed]
  - **Paper [optional]:** [More Information Needed]
  - **Demo [optional]:** [More Information Needed]
@@ -88,146 +85,151 @@ This model should not be used for malicious purposes, such as testing vulnerabil

  ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->

- [More Information Needed]

  ### Recommendations

- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  ## How to Get Started with the Model

- Use the code below to get started with the model.

- [More Information Needed]

- ## Training Details

- ### Training Data

- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

- [More Information Needed]

- ### Training Procedure

- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

- #### Preprocessing [optional]

- [More Information Needed]

  #### Training Hyperparameters

- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]

  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
  ### Testing Data, Factors & Metrics

  #### Testing Data

- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]
-
- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]

  #### Metrics

- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]

  #### Summary

-
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- ## Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]

  ## Technical Specifications [optional]

  ### Model Architecture and Objective

- [More Information Needed]

  ### Compute Infrastructure

- [More Information Needed]

  #### Hardware

- [More Information Needed]

  #### Software

- [More Information Needed]

- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]

  ## Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]

  ## Model Card Authors [optional]

- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]

  ### Framework versions

  - PEFT 0.14.0
 
  ---
  base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
  library_name: peft
+ license: mit
+ language:
+ - en
  ---

+ # Model Card for SQL Injection Classifier

+ This model classifies SQL queries as either normal (0) or potential SQL injection attacks (1).

  ## Model Details

  ### Model Description

+ This model identifies SQL injection attacks, a code injection technique in which an attacker executes arbitrary SQL code in a database query. By analyzing the structure of a SQL query, the model predicts whether it is a normal query or contains malicious code indicative of an injection attack.
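
+ For illustration, here is a hypothetical labeled pair in the format the classifier targets (these examples are not drawn from the actual training set):

+ ```python
+ # Hypothetical labeled examples (not from the actual training set)
+ examples = [
+     ("SELECT name FROM users WHERE id = 42;", 0),             # normal query
+     ("SELECT * FROM users WHERE id = '1' OR '1'='1' --", 1),  # injection attack
+ ]
+ ```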
 
 
  - **Developed by:** [More Information Needed]
  - **Funded by [optional]:** [More Information Needed]
  - **Shared by [optional]:** [More Information Needed]
+ - **Model type:** Fine-tuned DeepSeek-R1-distilled Llama 8B (PEFT adapter)
+ - **Language(s) (NLP):** English
  - **License:** [More Information Needed]
  - **Finetuned from model [optional]:** [More Information Needed]

  ### Model Sources [optional]

  - **Repository:** [More Information Needed]
  - **Paper [optional]:** [More Information Needed]
  - **Demo [optional]:** [More Information Needed]

  ## Bias, Risks, and Limitations

+ This model was trained on a dataset of SQL queries and has several known limitations:

+ - **Bias**: Generalization may be limited for SQL injection styles or database dialects not represented in the training set.
+ - **Risks**: False negatives can let injection attacks through, while false positives can flag legitimate queries as attacks.
+ - **Limitations**: The model may not perform well on heavily obfuscated attacks or queries exploiting novel vulnerabilities absent from the training data.

  ### Recommendations

+ Users (both direct and downstream) should be aware of the risks of relying on this model in security-sensitive applications. Additional domain-specific testing and validation are recommended before deployment.

  ## How to Get Started with the Model

+ ```python
+ from unsloth import FastLanguageModel
+
+ # Load the fine-tuned model and tokenizer (4-bit, so it fits on a small GPU)
+ model_name = "shukdevdatta123/sql_injection_classifier_DeepSeek_R1_fine_tuned_model"
+ hf_token = "your_hf_token"  # replace with your Hugging Face access token
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name=model_name,
+     load_in_4bit=True,
+     token=hf_token,
+ )
+
+ # Switch the model into inference mode once, then reuse it for every query
+ inference_model = FastLanguageModel.for_inference(model)
+
+ def predict_sql_injection(query):
+     """Classify a SQL query as normal (0) or an injection attack (1)."""
+     prompt = f"### Instruction:\nClassify the following SQL query as normal (0) or an injection attack (1).\n\n### Query:\n{query}\n\n### Classification:\n"
+     inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
+
+     outputs = inference_model.generate(
+         input_ids=inputs.input_ids,
+         attention_mask=inputs.attention_mask,
+         max_new_tokens=1000,  # generous cap; the label itself is short
+         use_cache=True,
+     )
+     prediction = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
+     # Keep only the text generated after the "### Classification:" marker
+     return prediction.split("### Classification:\n")[-1].strip()
+
+ # Example usage
+ test_query = "SELECT * FROM users WHERE id = '1' OR '1'='1' --"
+ result = predict_sql_injection(test_query)
+ print(f"Query: {test_query}\nPrediction: {result}")
+ ```
+ ## Training Details

+ ### Training Data

+ The model was trained on a dataset of SQL queries covering both SQL injection examples and normal queries, with each query labeled as normal (0) or an injection (1).

+ ### Training Procedure

+ The model was fine-tuned with PEFT (Parameter-Efficient Fine-Tuning), adapting the pre-trained Llama 8B base model to the task of SQL injection detection.
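
+ The card does not state the adapter configuration; the sketch below shows how a LoRA adapter is typically attached with Unsloth before training. The rank, scaling factor, and target modules are illustrative assumptions, not the values actually used.

+ ```python
+ from unsloth import FastLanguageModel
+
+ # Attach a LoRA adapter to the 4-bit base model.
+ # All hyperparameters here are assumed placeholders, not taken from the card.
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,                          # assumed LoRA rank
+     lora_alpha=16,                 # assumed scaling factor
+     lora_dropout=0,
+     target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
+                     "gate_proj", "up_proj", "down_proj"],
+     bias="none",
+     use_gradient_checkpointing=True,
+ )
+ ```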

  #### Training Hyperparameters

+ - **Training regime:** mixed precision (fp16)
+ - **Learning rate:** 2e-4
+ - **Batch size:** 2 per device, with 4 gradient-accumulation steps (effective batch size 8)
+ - **Max steps:** 200 (see the trainer sketch after this list)
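
+ A minimal sketch of how these hyperparameters map onto a TRL `SFTTrainer` run, in the style of typical Unsloth fine-tuning notebooks (newer TRL versions move some of these arguments into `SFTConfig`). Only the numeric settings above come from the card; the output directory, dataset, text field, and sequence length are assumptions.

+ ```python
+ from transformers import TrainingArguments
+ from trl import SFTTrainer
+
+ training_args = TrainingArguments(
+     output_dir="sql_injection_classifier",  # assumed
+     per_device_train_batch_size=2,          # batch size 2 per device
+     gradient_accumulation_steps=4,          # effective batch size 8
+     learning_rate=2e-4,
+     max_steps=200,
+     fp16=True,                              # mixed-precision regime
+     logging_steps=10,                       # matches the 10-step cadence of the loss table
+ )
+
+ trainer = SFTTrainer(
+     model=model,
+     tokenizer=tokenizer,
+     train_dataset=train_dataset,            # assumed: labeled queries rendered as prompts
+     dataset_text_field="text",              # assumed field name
+     max_seq_length=2048,                    # assumed
+     args=training_args,
+ )
+ trainer.train()
+ ```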
 
 
 

  ## Evaluation

  ### Testing Data, Factors & Metrics

  #### Testing Data

+ Evaluation used a separate set of labeled SQL queries designed to test the model's ability to distinguish normal queries from SQL injection attacks.

  #### Metrics

+ - **Accuracy:** the proportion of queries classified correctly.
+ - **Precision and recall:** precision measures how many flagged queries are real attacks; recall measures how many real attacks are flagged (a computation sketch follows this list).
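
+ The card does not report metric values; below is a minimal sketch of how these metrics could be computed from the classifier's outputs on a held-out set. The scikit-learn usage, the `labeled_queries` variable, and the output parsing are assumptions, not part of the original pipeline.

+ ```python
+ from sklearn.metrics import accuracy_score, precision_score, recall_score
+
+ # labeled_queries: (query, true_label) pairs from a held-out test set (assumed)
+ y_true, y_pred = [], []
+ for query, label in labeled_queries:
+     output = predict_sql_injection(query)  # inference helper defined above
+     y_true.append(label)
+     y_pred.append(1 if output.strip().startswith("1") else 0)  # assumed parsing
+
+ print(f"Accuracy:  {accuracy_score(y_true, y_pred):.3f}")
+ print(f"Precision: {precision_score(y_true, y_pred):.3f}")  # flagged queries that are real attacks
+ print(f"Recall:    {recall_score(y_true, y_pred):.3f}")     # real attacks that were flagged
+ ```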
 

  ### Results

+ Evaluation so far is reported through training loss across the 200 fine-tuning steps. The loss progression:
+
+ | Step | Training Loss |
+ |------|---------------|
+ | 10   | 2.951600      |
+ | 20   | 1.572900      |
+ | 30   | 1.370200      |
+ | 40   | 1.081900      |
+ | 50   | 0.946200      |
+ | 60   | 1.028700      |
+ | 70   | 0.873700      |
+ | 80   | 0.793300      |
+ | 90   | 0.892700      |
+ | 100  | 0.863000      |
+ | 110  | 0.694700      |
+ | 120  | 0.685900      |
+ | 130  | 0.778400      |
+ | 140  | 0.748500      |
+ | 150  | 0.721600      |
+ | 160  | 0.714400      |
+ | 170  | 0.764900      |
+ | 180  | 0.750800      |
+ | 190  | 0.664200      |
+ | 200  | 0.700600      |

  #### Summary

+ The model handles common forms of SQL injection well but may miss edge cases and complex attack patterns. Training loss drops sharply over the first 100 steps, indicating good convergence, then stabilizes with slight fluctuation. The low final loss suggests the model adapted effectively to the classification task.

  ## Technical Specifications [optional]

  ### Model Architecture and Objective

+ The model is a fine-tuned Llama 8B architecture; PEFT keeps the number of trainable parameters small while maintaining good performance on the classification objective.

  ### Compute Infrastructure

+ Training ran on a single Google Colab GPU, using mixed precision and gradient accumulation to fit the fine-tune within limited memory.

  #### Hardware

+ NVIDIA T4 GPU (Google Colab)

  #### Software

+ - **Libraries:** Hugging Face Transformers, Unsloth, TRL, PyTorch
+ - **Training framework:** PEFT

  ## Glossary [optional]

+ - **SQL injection**: an attack in which malicious SQL statements are inserted into an application's database queries and executed.
+ - **PEFT**: Parameter-Efficient Fine-Tuning, a technique for adapting large models by updating only a small fraction of their parameters.

  ## Model Card Authors [optional]

+ Shukdev Datta

  ## Model Card Contact

+ - **Email**: [email protected]
+ - **GitHub**: [Click here to visit the GitHub profile](https://github.com/shukdevtroy)
+ - **WhatsApp**: [Click here to chat](https://wa.me/+8801719296601)
  ### Framework versions

  - PEFT 0.14.0