Duplicated from khoa-done/Phishing-Detector


metadata

title: Phishing Detector
emoji: 🔍
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 4.39.0
app_file: app.py
pinned: false

Phishing Detector

A multi-model phishing detection system that analyzes URLs, webpage content, and free text with three models.

🤖 Models

DeBERTa + LSTM: Advanced transformer with attention mechanism (khoa-done/phishing-detector)
BERT: Fine-tuned BERT model (th1enq/bert_checkpoint)
XGBoost: Traditional ML with feature engineering (th1enq/xgboost_checkpoint)

✨ Features

URL Structure Analysis: Extract 30+ features from URL patterns
HTML Content Analysis: Extract 43+ features from webpage content
Combined Predictions: Weighted ensemble of all models
Visual Attention Weights: See which tokens influence decisions
Real-time Web Scraping: Fetch and analyze live websites
Multi-tab Interface: Compare results across different models
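To make the URL and HTML feature extraction above more concrete, here is a minimal sketch using only the Python standard library. The function names (`extract_url_features`, `extract_html_features`) and the specific features are illustrative assumptions, not the Space's actual extraction modules, which compute 30+ and 43+ features respectively.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a handful of structural URL features (hypothetical subset)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # A bare IPv4 address as the host is a common phishing signal.
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


class _TagCounter(HTMLParser):
    """Count a few tags that often matter for phishing pages."""

    def __init__(self) -> None:
        super().__init__()
        self.counts = {"forms": 0, "password_inputs": 0, "iframes": 0, "links": 0}

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form":
            self.counts["forms"] += 1
        elif tag == "input" and (attributes.get("type") or "").lower() == "password":
            self.counts["password_inputs"] += 1
        elif tag == "iframe":
            self.counts["iframes"] += 1
        elif tag == "a" and attributes.get("href"):
            self.counts["links"] += 1


def extract_html_features(html: str) -> dict:
    """Return simple tag counts from raw HTML (hypothetical subset)."""
    counter = _TagCounter()
    counter.feed(html)
    return dict(counter.counts)


url_features = extract_url_features("https://github.com/user/repo")
html_features = extract_html_features('<form><input type="password"></form>')
```

Feature dictionaries like these can be flattened into a fixed-order numeric vector before being passed to a tree-based model such as XGBoost.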

🚀 Usage

Enter a URL: The system fetches the webpage and analyzes both the URL structure and the page content
Enter text: Direct analysis of suspicious text content
Compare Models: Use different tabs to see how each model performs
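Since the interface accepts either a URL or free text, the app needs some way to tell the two apart. The sketch below shows one plausible check; `looks_like_url` and `route_input` are hypothetical helpers, and the actual routing in `app.py` may differ (for example, separate input boxes per tab).

```python
from urllib.parse import urlparse


def looks_like_url(user_input: str) -> bool:
    """Return True for a single http(s) URL with no embedded whitespace."""
    candidate = user_input.strip()
    if not candidate or any(ch.isspace() for ch in candidate):
        return False
    parsed = urlparse(candidate)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def route_input(user_input: str) -> str:
    """Decide which analysis path to use: 'url' or 'text'."""
    return "url" if looks_like_url(user_input) else "text"
```

Under this assumption, a bare URL triggers the fetch-and-analyze path, while an email body that merely contains a link is treated as text.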

📊 Model Performance

DeBERTa + LSTM: Best for context understanding with attention visualization
BERT: Reliable baseline with robust predictions
XGBoost: Fast traditional ML approach with feature interpretability
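The weighted ensemble listed under Features could combine the three models' phishing probabilities along these lines. This is a sketch only: the function name, the model keys, and the example weights are assumptions, and the README does not state the weights the Space actually uses.

```python
from typing import Dict, Optional


def weighted_ensemble(
    probabilities: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of phishing probabilities.

    Models whose probability is None (e.g. failed to load) are skipped,
    and the remaining weights are renormalized.
    """
    available = {name: p for name, p in probabilities.items() if p is not None}
    if not available:
        raise ValueError("no model produced a prediction")
    total_weight = sum(weights[name] for name in available)
    if total_weight <= 0:
        raise ValueError("weights of available models must sum to a positive value")
    return sum(weights[name] * p for name, p in available.items()) / total_weight


# Hypothetical weights and scores, for illustration only.
example_weights = {"deberta_lstm": 0.5, "bert": 0.3, "xgboost": 0.2}
example_scores = {"deberta_lstm": 0.9, "bert": 0.8, "xgboost": 0.4}
combined = weighted_ensemble(example_scores, example_weights)
```

Renormalizing over the available models keeps the combined score in the range 0 to 1 even when one model is missing.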

🔧 Technical Details

All models are loaded from the Hugging Face Hub for easy deployment
Feature extraction modules are included to support the XGBoost model
The interface is optimized for a dark theme and includes visual analytics
The app falls back gracefully if a model fails to load
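One way to implement the graceful fallback described above is to wrap each model loader in a try/except and record a failed model as `None`, so the rest of the app keeps working. The `load_models` helper and the loader callables below are hypothetical; in the real Space each loader would download a checkpoint from the Hub.

```python
import logging
from typing import Callable, Dict, Optional

logger = logging.getLogger("phishing_detector")


def load_models(
    loaders: Dict[str, Callable[[], object]],
) -> Dict[str, Optional[object]]:
    """Call each loader; store None for any model that fails to load."""
    models: Dict[str, Optional[object]] = {}
    for name, loader in loaders.items():
        try:
            models[name] = loader()
        except Exception as exc:  # broad on purpose: any failure means "skip this model"
            logger.warning("Could not load %s: %s", name, exc)
            models[name] = None
    return models


def available_models(models: Dict[str, Optional[object]]) -> list:
    """Names of the models that loaded successfully."""
    return [name for name, model in models.items() if model is not None]
```

The interface can then disable or annotate the tab for any model missing from `available_models(...)` instead of crashing at startup.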

📝 Examples

Try these URLs to see the system in action:

https://github.com/user/repo (should be benign)
http://suspicious-phishing-site.example (simulated phishing)
Or paste any suspicious email content for analysis