What is it, who are we, what to do
README.md
CHANGED
@@ -1,14 +1,119 @@
---
-short_description: 'The only Secure and Rational Email Phishing Detector '
---

# EmailGuard: AI-Powered Phishing Detection System

The only secure and rational email phishing detector, built on a DistilBERT architecture for multilabel classification of emails and URLs.

## Model Architecture

- **Base Model:** DistilBERT (Distilled Bidirectional Encoder Representations from Transformers)
- **Task Type:** Multilabel sequence classification
- **Framework:** Hugging Face Transformers
- **Fine-tuning:** 3 epochs using the Trainer API
- **Input Length:** Maximum of 512 tokens, with truncation
- **Output Classes:** 4-class multilabel classification

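As a rough illustration of the 512-token input limit with truncation listed above, the sketch below pads or truncates a list of token IDs to a fixed length and builds the matching attention mask. The function name `pad_or_truncate`, the pad ID of 0, and the toy token IDs are assumptions made for illustration. In practice the Hugging Face tokenizer performs this step and also handles special tokens, which this sketch ignores.

```python
from typing import Dict, List

MAX_LENGTH = 512  # maximum input length stated in the list above
PAD_ID = 0        # assumed padding token ID (hypothetical; depends on the tokenizer)


def pad_or_truncate(token_ids: List[int], max_length: int = MAX_LENGTH,
                    pad_id: int = PAD_ID) -> Dict[str, List[int]]:
    """Return fixed-length input IDs plus an attention mask (1 = real token)."""
    kept = token_ids[:max_length]                   # truncation: drop tokens past the limit
    padding = [pad_id] * (max_length - len(kept))   # padding: fill up to the limit
    return {
        "input_ids": kept + padding,
        "attention_mask": [1] * len(kept) + [0] * len(padding),
    }


# A short input is padded; a long input is cut to the limit.
short = pad_or_truncate([101, 7592, 102], max_length=8)
long = pad_or_truncate(list(range(1, 601)))
```

Every input ends up with the same length, so a batch of emails and URLs can be stacked into one tensor, and the attention mask tells the model which positions are padding.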
## Performance Metrics

- **Accuracy:** 99.58%
- **F1-Score:** 99.579%
- **Precision:** 99.583%
- **Recall:** 99.58%

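The README does not state how the per-class scores were averaged. The sketch below assumes support-weighted averaging, under which recall equals accuracy, which would be consistent with the matching accuracy and recall figures above. The function name `weighted_metrics` and the toy labels are made up for illustration and are not the project's evaluation data.

```python
from collections import Counter
from typing import Dict, List


def weighted_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Compute accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        class_precision = tp / predicted if predicted else 0.0
        class_recall = tp / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0
        weight = count / total  # weight each class by its share of the data
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels (hypothetical): four classes, one mistake in eight predictions.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 1, 2, 2, 3, 1]
scores = weighted_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```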
## Dataset

Trained on the custom dataset `cybersectony/PhishingEmailDetectionv2.0`, which contains emails and URLs labeled as legitimate or phishing.

## Classification Categories

1. **Legitimate Email** - Normal email communications
2. **Phishing URL** - Malicious web links
3. **Legitimate URL** - Safe web links
4. **Phishing Email** - Fraudulent email attempts

## Technical Implementation

The model applies a softmax activation to produce a probability distribution across the four classes; the class with the highest probability is the primary classification. Input preprocessing consists of tokenization with padding and truncation, so that all inputs have consistent dimensions.

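The softmax step described above can be sketched in plain Python. This is a minimal illustration, not the app's actual code: the label order is assumed to follow the Classification Categories list, and the logits are hypothetical values rather than real model output.

```python
import math
from typing import Dict, List, Tuple

# Label order follows the Classification Categories list; the index mapping is an assumption.
LABELS = ["Legitimate Email", "Phishing URL", "Legitimate URL", "Phishing Email"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits: List[float]) -> Tuple[str, Dict[str, float]]:
    """Return the top label and the per-class probability breakdown."""
    probs = softmax(logits)
    breakdown = dict(zip(LABELS, probs))
    top_label = max(breakdown, key=breakdown.get)  # highest probability wins
    return top_label, breakdown


# Hypothetical logits for one input, not real model output.
label, breakdown = classify([0.2, -1.0, 0.1, 3.5])
print(label)
for name, prob in breakdown.items():
    print(f"{name}: {prob:.2%}")
```

Because softmax normalizes across all four classes, the probabilities always sum to 1 and exactly one class is reported as the primary classification.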
## Getting Started

### Option 1: Use Online (Recommended)
**Try EmailGuard instantly - no installation required!**
1. Visit our live demo on Hugging Face Spaces
2. Paste your email content or suspicious URL
3. Click "Analyze for Phishing"
4. Get instant results with confidence scores

### Option 2: Local Installation
```bash
# Clone the repository
git clone https://huggingface.co/spaces/[your-username]/EmailGuard
cd EmailGuard

# Install dependencies
pip install gradio==5.0.1 transformers torch

# Run locally
python app.py
```

## 💡 How to Use EmailGuard

1. **Input:** Paste suspicious email content, URLs, or text messages
2. **Analyze:** Click the analyze button or press Enter
3. **Review:** Check the risk assessment and confidence breakdown
4. **Verify:** Always cross-check results through official channels

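To make the review step more concrete, here is one way a per-class confidence breakdown could be turned into a single risk level. This is only a sketch: the function name `assess_risk`, the 0.5 and 0.8 thresholds, and the example probabilities are all hypothetical and are not taken from the app.

```python
from typing import Dict


def assess_risk(breakdown: Dict[str, float]) -> Dict[str, object]:
    """Sum the phishing-class probabilities and map the total to a risk level."""
    phishing = sum(p for name, p in breakdown.items() if name.startswith("Phishing"))
    if phishing >= 0.8:        # hypothetical threshold
        level = "High"
    elif phishing >= 0.5:      # hypothetical threshold
        level = "Medium"
    else:
        level = "Low"
    return {"phishing_probability": phishing, "risk_level": level}


# Example breakdown (made-up probabilities for illustration).
example = {
    "Legitimate Email": 0.05,
    "Phishing URL": 0.10,
    "Legitimate URL": 0.05,
    "Phishing Email": 0.80,
}
report = assess_risk(example)
print(f"{report['risk_level']} risk ({report['phishing_probability']:.0%} phishing)")
```

Whatever the reported level, step 4 still applies: treat the output as a first screen and verify through official channels.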
### Example Inputs to Test:
- Suspicious payment verification emails
- Unknown links from social media
- Urgent account security messages
- Prize/lottery notification emails

## Suggestions & Best Practices

**✅ Good Use Cases:**
- Educational cybersecurity training
- Academic research projects
- Initial screening of suspicious content
- Learning about phishing patterns

**⚠️ Important Limitations:**
- This is a prototype for academic purposes
- Not intended for production security systems
- Always verify through official channels
- Combine with human judgment and expertise

## 🤝 Contact & Support

**Questions? Feedback? Collaboration?**

📧 **Email:** [email protected]

We welcome:
- Academic collaboration inquiries
- Technical feedback and suggestions
- Bug reports and improvement ideas
- Research partnership opportunities

## 🎯 Take Action Now!

**Ready to test EmailGuard?**
1. **[Try the Live Demo →]** Start analyzing suspicious emails instantly
2. **[Fork on GitHub →]** Contribute to the open-source project
3. **[Share with Friends →]** Help others stay safe from phishing

**Stay Safe Online!** 🛡️

---

### Academic Disclaimer

**Date:** May 30, 2025

This application was developed as an academic project by University of Dar es Salaam students: _**Byabato, Emmaculata, Regina, Sandy, Gladness, Alvin, Dorcas, and Albert**_.

**Important Notice:** This tool is intended solely for educational and research purposes. The developers hold no rights, benefits, or responsibilities regarding its use. Users are strongly advised to exercise caution and not to rely on this system as a direct security solution. It is a prototype for academic evaluation and should not replace professional cybersecurity tools or expert judgment. Always verify suspicious content through official channels and established security protocols.

---

*Built with ❤️ by University of Dar es Salaam's Computer Science and Engineering (CSE) students*