---

title: Phishing Detector
emoji: 🔍
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 4.39.0
app_file: app.py
pinned: false
---


# Phishing Detector

A multi-model phishing detection system that combines three classifiers with URL and HTML feature analysis.

## 🤖 Models
- **DeBERTa + LSTM**: Advanced transformer with attention mechanism (`khoa-done/phishing-detector`)
- **BERT**: Fine-tuned BERT model (`th1enq/bert_checkpoint`)  
- **XGBoost**: Traditional ML with feature engineering (`th1enq/xgboost_checkpoint`)

## ✨ Features
- **URL Structure Analysis**: Extract 30+ features from URL patterns
- **HTML Content Analysis**: Extract 43+ features from webpage content
- **Combined Predictions**: Weighted ensemble of all models
- **Visual Attention Weights**: See which tokens influence decisions
- **Real-time Web Scraping**: Fetch and analyze live websites
- **Multi-tab Interface**: Compare results across different models
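
As a rough illustration of the URL structure analysis, the sketch below derives a handful of structural features with the standard library. The function name `extract_url_features` and the specific features are assumptions chosen for illustration; the feature modules shipped with this Space extract a larger set (30+) and may define them differently.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few illustrative structural features for a URL (hypothetical subset)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "uses_https": parsed.scheme == "https",
        # A raw IPv4 address in place of a domain name is a common phishing signal.
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,
        # Number of non-empty path segments, e.g. /user/repo -> 2.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


features = extract_url_features("http://suspicious-phishing-site.example")
```

A feature dictionary like this can be turned into a fixed-order numeric vector before it is passed to a tree-based model such as XGBoost.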

## 🚀 Usage
1. **Enter a URL**: The system fetches the webpage and analyzes both the URL structure and the page content
2. **Enter text**: Paste suspicious text (for example, an email body) for direct analysis
3. **Compare Models**: Use different tabs to see how each model performs
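
Because the same input box accepts either a URL or free text, the app has to decide which analysis path to take. The sketch below shows one way that decision might be made; `looks_like_url` and `route_input` are hypothetical names, and the actual logic in `app.py` may differ.

```python
from urllib.parse import urlparse


def looks_like_url(user_input: str) -> bool:
    """Heuristic: a single whitespace-free token with an http(s) scheme and a host."""
    text = user_input.strip()
    if not text or any(ch.isspace() for ch in text):
        return False
    try:
        parsed = urlparse(text)
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def route_input(user_input: str) -> str:
    """Return 'url' to fetch and analyze a page, or 'text' to analyze the text directly."""
    return "url" if looks_like_url(user_input) else "text"
```

Under this heuristic, an email body that merely contains a link is treated as text, while a bare link is treated as a URL to fetch.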

## 📊 Model Performance
- **DeBERTa + LSTM**: Best for context understanding with attention visualization
- **BERT**: Reliable baseline with robust predictions
- **XGBoost**: Fast traditional ML approach with feature interpretability
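
The weighted ensemble listed under Features could combine the three models' phishing probabilities along the following lines. This is a minimal sketch: the model keys and the weights are placeholder assumptions, not the values used by this Space.

```python
from typing import Dict, Optional

# Hypothetical weights, chosen only for illustration.
WEIGHTS: Dict[str, float] = {"deberta_lstm": 0.5, "bert": 0.3, "xgboost": 0.2}


def weighted_ensemble(
    probs: Dict[str, Optional[float]],
    weights: Dict[str, float] = WEIGHTS,
) -> float:
    """Weighted average of phishing probabilities over the models that produced one.

    A value of None marks a model that was unavailable; its weight is dropped
    and the remaining weights are renormalized.
    """
    available = {name: p for name, p in probs.items() if p is not None}
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        raise ValueError("no model produced a prediction")
    return sum(weights[name] * p for name, p in available.items()) / total_weight


score = weighted_ensemble({"deberta_lstm": 0.9, "bert": 0.8, "xgboost": 0.4})
```

Renormalizing over the available models keeps the combined score in the 0 to 1 range even when one model is missing.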

## 🔧 Technical Details
- All models loaded from Hugging Face Hub for easy deployment
- Feature extraction modules included for XGBoost functionality
- Dark theme optimized interface with visual analytics
- Graceful fallbacks if models fail to load
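
One way to get graceful fallbacks is to load each model independently and keep whichever ones succeed. The sketch below uses stand-in loader functions so that it runs without network access; in the real app each loader would presumably download a checkpoint from the Hugging Face Hub, and `load_models` is a hypothetical helper name.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("phishing_detector")


def load_models(loaders: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Call each loader; skip (and log) any model that fails to load."""
    models: Dict[str, Any] = {}
    for name, loader in loaders.items():
        try:
            models[name] = loader()
        except Exception as exc:  # broad on purpose: any failure means "unavailable"
            logger.warning("Model %s unavailable, continuing without it: %s", name, exc)
    return models


def _failing_loader() -> Any:
    # Stand-in for a download that fails (e.g. network error or missing file).
    raise RuntimeError("checkpoint could not be downloaded")


models = load_models({
    "bert": lambda: "bert-model-placeholder",  # stand-in for a loaded model object
    "xgboost": _failing_loader,
})
```

The interface can then show only the tabs whose models appear in the returned dictionary.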

## πŸ“ Examples
Try these URLs to see the system in action:
- `https://github.com/user/repo` (should be benign)
- `http://suspicious-phishing-site.example` (simulated phishing)
- Or paste any suspicious email content for analysis