MUFASA25 committed on
Commit
369574e
·
verified ·
1 Parent(s): 837ff05

multimodal

Files changed (1)
  1. README.md +93 -90
README.md CHANGED
@@ -1,128 +1,131 @@
1
  ---
2
  license: apache-2.0
3
- title: EmailGuard
4
  sdk: gradio
5
- emoji:
6
- colorFrom: yellow
7
- colorTo: purple
8
  short_description: The only secure and rational email phishing detector
9
  ---
 
10
 
11
- # EmailGuard: AI-Powered Phishing Detection System
12
 
13
- The only secure and rational email phishing detector using advanced DistilBERT architecture for multilabel classification of emails and URLs.
14
 
15
- ## Model Architecture
 
 
 
 
 
16
 
17
- **Base Model:** DistilBERT (Distilled Bidirectional Encoder Representations from Transformers)
18
- - **Task Type:** Multilabel sequence classification
19
- - **Framework:** Hugging Face Transformers
20
- - **Fine-tuning:** 3 epochs using Trainer API
21
- - **Input Length:** Maximum 512 tokens with truncation
22
- - **Output Classes:** 4-class multilabel classification
23
 
24
- ## Performance Metrics
25
-
26
- - **Accuracy:** 99.58%
27
- **F1-Score:** 99.579%
28
- **Precision:** 99.583%
29
- - **Recall:** 99.58%
30
 
31
- ## Dataset
 
 
 
32
 
33
- Trained on custom dataset `cybersectony/PhishingEmailDetectionv2.0` containing labeled emails and URLs classified as legitimate or phishing attempts.
 
 
 
34
 
35
- ## Classification Categories
36
 
37
- 1. **Legitimate Email** - Normal email communications
38
- 2. **Phishing URL** - Malicious web links
39
- 3. **Legitimate URL** - Safe web links
40
- 4. **Phishing Email** - Fraudulent email attempts
41
 
42
- ## Technical Implementation
 
 
43
 
44
- The model uses softmax activation for probability distribution across classes, with the highest probability determining the primary classification. Input preprocessing includes tokenization with padding and truncation to maintain consistent input dimensions.
45
 
46
- ## 🚀 Getting Started
 
 
47
 
48
- ### Option 1: Use Online (Recommended)
49
- **Try EmailGuard instantly - no installation required!**
50
- 1. Visit our live demo on Hugging Face Spaces
51
- 2. Paste your email content or suspicious URL
52
- 3. Click "Analyze for Phishing"
53
- 4. Get instant results with confidence scores
54
 
55
- ### Option 2: Local Installation
56
- ```bash
57
- # Clone the repository
58
- git clone https://huggingface.co/spaces/[your-username]/EmailGuard
59
- cd EmailGuard
60
 
61
- # Install dependencies
62
- pip install gradio==5.0.1 transformers torch
 
 
 
 
63
 
64
- # Run locally
65
- python app.py
66
- ```
 
67
 
68
- ## 💡 How to Use EmailGuard
69
 
70
- 1. **Input:** Paste suspicious email content, URLs, or text messages
71
- 2. **Analyze:** Click the analyze button or press Enter
72
- 3. **Review:** Check the risk assessment and confidence breakdown
73
- 4. **Verify:** Always cross-check results through official channels
74
 
75
- ### Example Inputs to Test:
76
- - Suspicious payment verification emails
77
- - Unknown links from social media
78
- - Urgent account security messages
79
- - Prize/lottery notification emails
80
 
81
- ## 📋 Suggestions & Best Practices
 
 
 
 
 
82
 
83
- **✅ Good Use Cases:**
84
- - Educational cybersecurity training
85
- - Academic research projects
86
- - Initial screening of suspicious content
87
- - Learning about phishing patterns
88
 
89
- **⚠️ Important Limitations:**
90
- - This is a prototype for academic purposes
91
- - Not intended for production security systems
92
- - Always verify through official channels
93
- - Combine with human judgment and expertise
94
-
95
- ## 🤝 Contact & Support
96
-
97
- **Questions? Feedback? Collaboration?**
98
-
99
- 📧 **Email:** [email protected]
100
 
101
- We welcome:
102
- - Academic collaboration inquiries
103
- - Technical feedback and suggestions
104
- - Bug reports and improvement ideas
105
- - Research partnership opportunities
106
 
107
- ## 🎯 Take Action Now!
108
 
109
- **Ready to test EmailGuard?**
110
- 1. **[Try the Live Demo →]** Start analyzing suspicious emails instantly
111
- 2. **[Fork on GitHub →]** Contribute to the open-source project
112
- 3. **[Share with Friends →]** Help others stay safe from phishing
 
 
 
113
 
114
- **Stay Safe Online!** 🛡️
115
 
116
- ---
 
 
 
117
 
118
- ### Academic Disclaimer
119
 
120
- **Date:** May 30, 2025
121
 
122
- This application is developed as an academic project by University of Dar es Salaam students: _**Byabato, Emmaculata, Regina, Sandy, Gladness, Alvin, Dorcas, and Albert**_.
123
 
124
- **Important Notice:** This tool is intended solely for educational and research purposes. The developers hold no rights, benefits, or responsibilities regarding its use. Users are strongly advised to exercise caution and not rely on this system as a direct security solution. This is a prototype for academic evaluation and should not replace professional cybersecurity tools or expert judgment. Always verify suspicious content through official channels and established security protocols.
 
 
 
125
 
126
- ---
127
 
128
- *Built with ❤️ by University of Dar es Salaam's Computer Science and Engineering (CSE) students*
 
1
  ---
2
  license: apache-2.0
3
+ title: EmailGuard2
4
  sdk: gradio
5
+ emoji: 🌍
6
+ colorFrom: blue
7
+ colorTo: pink
8
  short_description: The only secure and rational email phishing detector
9
  ---
10
+ # EmailGuard2: Advanced Phishing Detection System
11
 
12
+ A multi-model ensemble system for detecting phishing attempts in emails, URLs, and text messages using AI and feature engineering.
13
 
14
+ ## Features
15
 
16
+ - Multi-model ensemble prediction
17
+ - Advanced feature extraction and analysis
18
+ - Real-time phishing detection
19
+ - Web-based user interface
20
+ - Risk scoring and confidence reporting
21
+ - URL and email content analysis
22
 
23
+ ## Installation
 
 
 
 
 
24
 
25
+ 1. Clone the repository:
26
+ ```bash
27
+ git clone <repository-url>
28
+ cd emailguard-phishing-detection
29
+ ```
 
30
 
31
+ 2. Install dependencies:
32
+ ```bash
33
+ pip install -r requirements.txt
34
+ ```
35
 
36
+ 3. Run the application:
37
+ ```bash
38
+ python app.py
39
+ ```
40
 
41
+ 4. Open your browser and go to `http://localhost:7860`
42
 
43
+ ## Usage
 
 
 
44
 
45
+ 1. Enter email content, URL, or suspicious text in the input field
46
+ 2. Click "Advanced Analysis" to process the input
47
+ 3. Review the results including risk level and confidence scores
48
 
49
+ ## Models Used
50
 
51
+ - Primary: `cybersectony/phishing-email-detection-distilbert_v2.4.1`
52
+ - URL Specialist: URL analysis model (configured in `app.py` with the same checkpoint as the primary model)
53
+ - Feature Engine: Hand-crafted pattern detection rules
54
 
55
+ ## Detection Features
 
 
 
 
 
56
 
57
+ ### URL Analysis
58
+ - Suspicious domain detection
59
+ - Shortened URL identification
60
+ - Malicious link patterns
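
The URL checks listed above could look roughly like the following. This is a minimal sketch, not the code in `app.py`: the shortener list, the keyword list, and the function name `url_features` are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse

# Hypothetical lists; the README does not say which domains or words are checked.
SHORTENERS = {"bit.ly", "tinyurl.com", "t.co", "goo.gl"}
SUSPICIOUS_WORDS = ("verify", "login", "secure", "security", "update")

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")
IPV4_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def _host(url):
    """Return the lowercase hostname of a URL, or "" if it cannot be parsed."""
    try:
        host = (urlparse(url).hostname or "").lower()
    except ValueError:
        return ""
    return host[4:] if host.startswith("www.") else host


def url_features(text):
    """Extract simple URL-based phishing indicators from free text."""
    # Drop trailing sentence punctuation that the regex may have captured.
    urls = [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(text)]
    hosts = [_host(u) for u in urls]
    return {
        "url_count": len(urls),
        "shortened": any(h in SHORTENERS for h in hosts),
        "ip_host": any(IPV4_PATTERN.fullmatch(h) for h in hosts),
        "suspicious_keyword": any(
            word in u.lower() for u in urls for word in SUSPICIOUS_WORDS
        ),
    }
```

Calling `url_features` on the phishing example from this README flags the URL through `suspicious_keyword`, while a `bit.ly` link would set `shortened`.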
 
61
 
62
+ ### Content Analysis
63
+ - Urgency keyword detection
64
+ - Money-related terms
65
+ - Personal information requests
66
+ - Spelling error patterns
67
+ - Excessive capitalization
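
A minimal sketch of how some of these content signals might be computed. The keyword sets and the name `content_features` are hypothetical, and spelling-error detection is left out because the README does not describe how it is done.

```python
import re

# Hypothetical keyword sets; the actual rules in app.py may differ.
URGENCY_WORDS = {"urgent", "immediately", "verify", "suspended", "limited"}
MONEY_WORDS = {"payment", "prize", "lottery", "refund", "bank"}
PERSONAL_INFO_WORDS = {"password", "ssn", "pin", "login"}


def content_features(text):
    """Count keyword hits and measure how much of the text is upper case."""
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    letters = [c for c in text if c.isalpha()]
    upper_ratio = (
        sum(c.isupper() for c in letters) / len(letters) if letters else 0.0
    )
    return {
        "urgency_hits": sum(w in URGENCY_WORDS for w in words),
        "money_hits": sum(w in MONEY_WORDS for w in words),
        "personal_info_hits": sum(w in PERSONAL_INFO_WORDS for w in words),
        "upper_ratio": upper_ratio,
    }
```

Counts rather than booleans are returned so that a later scoring step can weigh repeated indicators more heavily if it chooses to.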
68
 
69
+ ### Risk Assessment
70
+ - HIGH RISK: Strong phishing indicators (>60% confidence)
71
+ - MEDIUM RISK: Suspicious patterns (30-60% confidence)
72
+ - LOW RISK: Appears legitimate (<30% confidence)
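
The three bands above can be expressed as a small mapping function. This is a sketch under one assumption: the README does not say which band the exact boundary values belong to, so here both 30% and 60% are treated as MEDIUM RISK. The function name `risk_level` is hypothetical.

```python
def risk_level(phishing_probability):
    """Map a phishing probability in [0, 1] to the README's risk bands."""
    if not 0.0 <= phishing_probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if phishing_probability > 0.60:
        return "HIGH RISK"
    # Assumption: the boundary values 0.30 and 0.60 fall into the medium band.
    if phishing_probability >= 0.30:
        return "MEDIUM RISK"
    return "LOW RISK"
```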
73
 
74
+ ## System Requirements
75
 
76
+ - Python 3.8+
77
+ - 4GB+ RAM
78
+ - Internet connection (for initial model download)
 
79
 
80
+ ## Technical Details
 
 
 
 
81
 
82
+ The system uses:
83
+ - PyTorch for deep learning models
84
+ - Transformers for NLP processing
85
+ - Gradio for web interface
86
+ - Custom ensemble voting mechanism
87
+ - Feature-based risk adjustment
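
The README does not describe how the ensemble vote or the feature-based adjustment is computed. One plausible reading is soft voting (a weighted average of each model's phishing probability) followed by a capped bonus per feature hit. The sketch below illustrates that reading only; the function name, weights, and bonus values are all assumptions.

```python
def ensemble_score(model_probs, weights=None, feature_hits=0,
                   bonus_per_hit=0.05, max_bonus=0.20):
    """Combine per-model phishing probabilities into one score in [0, 1].

    model_probs maps a model role (e.g. "primary") to its phishing probability.
    weights optionally maps the same roles to relative weights (default: equal).
    """
    if not model_probs:
        raise ValueError("at least one model probability is required")
    if weights is None:
        weights = {name: 1.0 for name in model_probs}

    # Soft voting: weighted average of the model probabilities.
    total_weight = sum(weights[name] for name in model_probs)
    base = sum(p * weights[name] for name, p in model_probs.items()) / total_weight

    # Feature-based adjustment: each rule hit raises the score, up to a cap.
    bonus = min(feature_hits * bonus_per_hit, max_bonus)
    return min(1.0, base + bonus)
```

For example, `ensemble_score({"primary": 0.8, "url_specialist": 0.6}, feature_hits=2)` averages the two models to about 0.7 and adds 0.1 for the two rule hits.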
88
 
89
+ ## Example Inputs
 
 
 
 
90
 
91
+ **Phishing Example:**
92
+ ```
93
+ URGENT: Your PayPal account has been limited! Verify immediately at http://paypal-security-check.suspicious.com/verify
94
+ ```
 
 
 
 
 
 
 
95
 
96
+ **Legitimate Example:**
97
+ ```
98
+ Hi Sarah, Thanks for the quarterly report. Let's discuss in tomorrow's meeting. Best, Mike
99
+ ```
 
100
 
101
+ ## Configuration
102
 
103
+ Model configuration in `app.py`:
104
+ ```python
105
+ MODELS = {
106
+ "primary": "cybersectony/phishing-email-detection-distilbert_v2.4.1",
107
+ "url_specialist": "cybersectony/phishing-email-detection-distilbert_v2.4.1"
108
+ }
109
+ ```
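
Because both roles in `MODELS` point at the same checkpoint, a loader can avoid downloading and instantiating it twice. The following is a sketch of that idea, assuming the Transformers `pipeline` helper is used for loading; `build_classifiers` and its `loader` parameter are hypothetical names, not part of `app.py`.

```python
MODELS = {
    "primary": "cybersectony/phishing-email-detection-distilbert_v2.4.1",
    "url_specialist": "cybersectony/phishing-email-detection-distilbert_v2.4.1",
}


def build_classifiers(models, loader=None):
    """Return one classifier per role, loading each distinct checkpoint once."""
    if loader is None:
        # Imported lazily so the module can be inspected without transformers.
        from transformers import pipeline

        def loader(checkpoint):
            return pipeline("text-classification", model=checkpoint)

    cache = {}        # checkpoint name -> loaded classifier
    classifiers = {}  # role -> classifier (possibly shared between roles)
    for role, checkpoint in models.items():
        if checkpoint not in cache:
            cache[checkpoint] = loader(checkpoint)
        classifiers[role] = cache[checkpoint]
    return classifiers
```

The optional `loader` argument exists so the caching logic can be exercised with a stub, without downloading the model.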
110
 
111
+ ## Limitations
112
 
113
+ - This is an educational/research tool
114
+ - Always verify suspicious content through official channels
115
+ - May produce false positives/negatives
116
+ - Requires manual verification for critical decisions
117
 
118
+ ## License
119
 
120
+ Apache 2.0 License
121
 
122
+ ## Contributing
123
 
124
+ 1. Fork the repository
125
+ 2. Create a feature branch
126
+ 3. Make your changes
127
+ 4. Submit a pull request
128
 
129
+ ## Support
130
 
131
+ For issues and questions, please use the GitHub issue tracker.