selfconstruct3d committed on
Commit a967b13 · verified · 1 Parent(s): 1fe6cc1

Update README.md

Files changed (1)
  1. README.md +6 -1
README.md CHANGED
@@ -9,6 +9,11 @@ language:
 
 This repository hosts a lightweight `scikit-learn`-based MLP classifier trained to distinguish cybersecurity-related content from other text, using sentence-transformer embeddings. It supports English and German input texts.
 
+## 📊 Training Data
+
+The model was trained on a multilingual dataset of cybersecurity and non-cybersecurity news articles. The dataset is publicly available on Zenodo:
+🔗 [https://zenodo.org/records/16417939](https://zenodo.org/records/16417939)
+
 ## 📦 Model Details
 
 - **Architecture**: `MLPClassifier` with hidden layers `(128, 64)`
@@ -52,7 +57,7 @@ X_train_emb = embedder.encode(X_train.tolist(), convert_to_numpy=True, show_prog
 X_test_emb = embedder.encode(X_test.tolist(), convert_to_numpy=True, show_progress_bar=True)
 
 # Load the trained classifier
-model_path = hf_hub_download(repo_id="selfconstruct3d/cybersec-classifier", filename="cybersec_classifier.pkl")
+model_path = hf_hub_download(repo_id="selfconstruct3d/cybersec_classifier", filename="cybersec_classifier.pkl")
 model = joblib.load(model_path)
 
 # Predict
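
The second hunk only changes the `repo_id` passed to `hf_hub_download` (hyphen to underscore). As a minimal sketch of that load step, the repo id and filename can be kept in one place so a rename like this touches a single line. The helper name `load_classifier` and its `download` and `load` parameters are hypothetical and not part of the README; they exist only so the step can be exercised without network access.

```python
from typing import Any, Callable, Optional

# Values taken from the "+" line of the diff above.
REPO_ID = "selfconstruct3d/cybersec_classifier"
FILENAME = "cybersec_classifier.pkl"


def load_classifier(
    repo_id: str = REPO_ID,
    filename: str = FILENAME,
    download: Optional[Callable[..., str]] = None,  # hypothetical hook, defaults to hf_hub_download
    load: Optional[Callable[[str], Any]] = None,    # hypothetical hook, defaults to joblib.load
) -> Any:
    """Fetch the pickled classifier from the Hub and deserialize it."""
    if download is None:
        # Imported lazily so the module can be imported without huggingface_hub installed.
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    if load is None:
        import joblib
        load = joblib.load
    model_path = download(repo_id=repo_id, filename=filename)
    return load(model_path)
```

Calling `load_classifier()` with no arguments would follow the same two calls as the README snippet; passing stand-in callables lets the wiring be checked offline.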