Update README.md
README.md
CHANGED
@@ -1,131 +1,86 @@
-
-
-
-
-
-
-
-
-
-license: apache-2.0
-short_description: UDSM AI-powered tool for real-time phishing email detection.
----
-Phishing Email Detection Space
-Welcome to the Phishing Email Detection Hugging Face Space! This project provides an interactive web interface to classify emails as legitimate or phishing using a fine-tuned DistilBERT model (cybersectony/phishing-email-detection-distilbert_v2.4.1). Built with Gradio, this Space allows users to input email text and receive predictions with confidence scores and probability distributions.
-Table of Contents
-
-Overview
-Features
-How It Works
-Usage
-Installation (For Local Development)
-Model Details
-Contributing
-License
-Contact
-
-Overview
-This Space deploys a DistilBERT-based model to detect phishing emails by classifying input text into one of four categories: Legitimate Email, Phishing URL, Legitimate URL, or Phishing URL (Alt). The model is hosted on Hugging Face and integrated with a Gradio interface for easy interaction. Users can input email text and instantly view the predicted classification along with confidence scores.
-Features
-
-Interactive Interface: Input email text via a user-friendly Gradio web interface.
-Real-Time Predictions: Get immediate classification results with confidence scores.
-Detailed Output: View probabilities for all classes (Legitimate Email, Phishing URL, Legitimate URL, Phishing URL Alt).
-Lightweight Model: Uses DistilBERT for efficient inference, suitable for CPU-based environments.
-Open Source: Code and model are accessible for further customization.
-
-How It Works
-
-The user inputs email text into the Gradio interface.
-The text is tokenized using the DistilBERT tokenizer.
-The fine-tuned DistilBERT model processes the input and outputs probabilities for each class.
-The interface displays the most likely classification, confidence score, and all class probabilities.
-
-Usage
-
-Access the Space: Visit the Hugging Face Space URL (e.g., https://<your-username>-<space-name>.hf.space).
-Enter Email Text: Type or paste the email content into the provided text box.
-Get Prediction: Click the "Submit" button to view the classification results.
-Interpret Results: The output includes:
-Prediction: The most likely class (e.g., "Phishing URL").
-Confidence: The probability score for the predicted class.
-All Probabilities: Probability scores for all four classes.
-
-
-
-Example Input:
-Subject: Urgent: Verify Your Account Now
-Dear Customer, your account has been flagged. Click here to verify: [suspicious-link.com].
-
-Example Output:
-Prediction: Phishing URL
-Confidence: 0.9278
-All Probabilities:
-- Legitimate Email: 0.0123
-- Phishing URL: 0.9278
-- Legitimate URL: 0.0345
-- Phishing URL (Alt): 0.0254
-
-Installation (For Local Development)
-If you want to run this project locally or contribute to its development, follow these steps:
-
-Clone the Repository:
-git clone https://huggingface.co/spaces/<your-username>/<your-space-name>
-cd <your-space-name>
-
-
-Install Dependencies:Create a virtual environment and install the required packages:
-python -m venv venv
-source venv/bin/activate # On Windows: venv\Scripts\activate
-pip install -r requirements.txt
+# PhishGuardian AI 🛡️
+
+AI-powered phishing email detection using DistilBERT for real-time security analysis.
+
+## Overview
+
+PhishGuardian AI is an intelligent email security tool that classifies emails as legitimate or phishing using a fine-tuned DistilBERT model. Built for the University of Dar es Salaam (UDSM) community, it provides instant threat assessment through an intuitive web interface.
+
+## Features
 
+- **Real-time Detection**: Instant email classification with confidence scoring
+- **Advanced AI Model**: Fine-tuned DistilBERT (`cybersectony/phishing-email-detection-distilbert_v2.4.1`)
+- **User-friendly Interface**: Clean Gradio web interface with visual risk indicators
+- **Comprehensive Analysis**: Detailed probability breakdown for all threat categories
+- **Educational Tool**: Built-in examples and security recommendations
 
-
+## Quick Start
+
+### Online Access
+Visit the deployed Space: `https://huggingface.co/spaces/MUFASA25/phishguardian-ai`
+
+### Local Development
+```bash
+git clone https://huggingface.co/spaces/MUFASA25/phishguardian-ai
+cd phishguardian-ai
+pip install -r requirements.txt
 python app.py
+```
 
+## Usage
 
-
+1. **Input**: Paste email content into the text area
+2. **Analyze**: Click "Analyze Email" for instant results
+3. **Review**: Examine risk level, confidence score, and detailed analysis
+4. **Act**: Follow provided security recommendations
 
+### Example Analysis
 
-
-transformers
-torch
-gradio
+**Input**: Suspicious email with urgent account verification request
 
-
+**Output**:
+```
+🚨 HIGH RISK
+Primary Classification: Phishing Email
+Confidence: 92.8%
+Recommendation: Do not click any links or provide personal information.
+```
 
-
-Architecture: DistilBERT (fine-tuned for sequence classification)
-Classes:
-Legitimate Email
-Phishing URL
-Legitimate URL
-Phishing URL (Alt)
+## Technical Specifications
 
+- **Model**: DistilBERT-base fine-tuned for sequence classification
+- **Input Limit**: 512 tokens
+- **Classes**: Legitimate Email, Phishing Email, Suspicious Content, Other
+- **Framework**: Transformers, PyTorch, Gradio
+- **Deployment**: Hugging Face Spaces
 
-
-Output: Probabilities for each class, with the highest probability determining the predicted class.
+## Requirements
 
-
-
-
+```
+gradio>=4.0.0
+transformers>=4.21.0
+torch>=1.12.0
+```
 
-
-Create a new branch for your changes (git checkout -b feature/your-feature).
-Commit your changes (git commit -m "Add your feature").
-Push to your fork (git push origin feature/your-feature).
-Open a pull request on the Space’s repository.
+## Contributing
 
-
+1. Fork the repository
+2. Create feature branch (`git checkout -b feature/enhancement`)
+3. Commit changes (`git commit -m 'Add enhancement'`)
+4. Push to branch (`git push origin feature/enhancement`)
+5. Open Pull Request
 
-License
-This project is licensed under the APACHE 2.0. See the LICENSE file for details.
+## License
 
-
-For questions or feedback, please reach out via:
+Licensed under Apache 2.0. See [LICENSE](LICENSE) for details.
 
-
-
-
+## Contact
+
+**Developer**: MUFASA25
+**Email**: [email protected]
+**Institution**: University of Dar es Salaam (UDSM)
+**Profile**: [https://huggingface.co/MUFASA25](https://huggingface.co/MUFASA25)
+
+---
 
-
+⚠️ **Disclaimer**: This tool is for educational and awareness purposes. Always follow your organization's security protocols and use professional judgment when handling suspicious emails.
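
Note on the classification step: both versions of the README describe the model producing one probability per class, with the highest probability giving the prediction and the confidence score. The sketch below illustrates only that final step in plain Python. It is not taken from `app.py`: the label order in `LABELS`, the helper names `softmax` and `summarize`, and the sample logits are all assumptions made for illustration, and the real index-to-label mapping should be read from the model's configuration.

```python
import math

# Assumed label order: the README lists these four classes but does not
# state which output index corresponds to which label.
LABELS = ["Legitimate Email", "Phishing Email", "Suspicious Content", "Other"]


def softmax(logits):
    """Convert raw model logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits, labels=LABELS):
    """Return the top label, its confidence, and the full probability table."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "prediction": labels[best],
        "confidence": probs[best],
        "probabilities": dict(zip(labels, probs)),
    }


if __name__ == "__main__":
    # Hypothetical logits standing in for the model's output on one email.
    result = summarize([0.2, 3.1, 0.5, -0.4])
    print(f"Primary Classification: {result['prediction']}")
    print(f"Confidence: {result['confidence']:.1%}")
    for label, prob in result["probabilities"].items():
        print(f"- {label}: {prob:.4f}")
```

In a real deployment the logits would come from the fine-tuned DistilBERT model after tokenizing the email text (truncated to the 512-token input limit listed under Technical Specifications); this sketch starts from the logits so that it runs without downloading the model.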