Last commit: 6eb199d by th1enq, "add bert & xgboost from global" (7 days ago)

	---
	title: Phishing Detector
	emoji: 🔍
	colorFrom: red
	colorTo: blue
	sdk: gradio
	sdk_version: 4.39.0
	app_file: app.py
	pinned: false
	---

	# Phishing Detector

	A multi-model phishing detection system that analyzes URLs, webpage content, and free text with three complementary classifiers.

	## 🤖 Models
	- DeBERTa + LSTM: Advanced transformer with attention mechanism (`khoa-done/phishing-detector`)
	- BERT: Fine-tuned BERT model (`th1enq/bert_checkpoint`)
	- XGBoost: Traditional ML with feature engineering (`th1enq/xgboost_checkpoint`)

	## ✨ Features
	- URL Structure Analysis: Extract 30+ features from URL patterns
	- HTML Content Analysis: Extract 43+ features from webpage content
	- Combined Predictions: Weighted ensemble of all models
	- Visual Attention Weights: See which tokens influence decisions
	- Real-time Web Scraping: Fetch and analyze live websites
	- Multi-tab Interface: Compare results across different models
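
The URL and HTML analysis features listed above can be illustrated with a minimal sketch. The function names and the specific features below are assumptions chosen for illustration; they are not the project's actual feature set, which covers 30+ URL features and 43+ HTML features.

```python
import re
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


def extract_url_features(url: str) -> dict:
    """Compute a few illustrative structural features of a URL (hypothetical subset)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        # Raw IPv4 hosts are a common phishing signal.
        "host_is_ip": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
        "num_query_params": len(parse_qs(parsed.query)),
    }


class _TagCounter(HTMLParser):
    """Counts a handful of tags that are often relevant to phishing pages."""

    def __init__(self):
        super().__init__()
        self.counts = {"forms": 0, "password_inputs": 0, "links": 0, "iframes": 0}

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "form":
            self.counts["forms"] += 1
        elif tag == "a":
            self.counts["links"] += 1
        elif tag == "iframe":
            self.counts["iframes"] += 1
        elif tag == "input" and (attr_map.get("type") or "").lower() == "password":
            self.counts["password_inputs"] += 1


def extract_html_features(html: str) -> dict:
    """Count illustrative HTML elements in a page (hypothetical subset)."""
    counter = _TagCounter()
    counter.feed(html)
    counter.close()
    return dict(counter.counts)
```

Feature dictionaries like these would typically be flattened into a fixed-order vector before being passed to a tree-based model such as XGBoost.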

	## 🚀 Usage
	1. Enter a URL: The system fetches the webpage and analyzes both the URL structure and the page content
	2. Enter text: The system analyzes the suspicious text directly
	3. Compare models: Use the tabs to see how each model scores the same input
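
The two input modes above imply some routing step that decides whether the input is a URL to fetch or text to analyze directly. A minimal sketch of one way to do this follows; `detect_input_kind` is a hypothetical helper, and the app's actual logic may differ.

```python
from urllib.parse import urlparse


def detect_input_kind(user_input: str) -> str:
    """Return "url" for a single http(s) URL, otherwise "text" (hypothetical helper)."""
    text = user_input.strip()
    if not text or any(ch.isspace() for ch in text):
        # Empty input or anything containing whitespace is treated as free text.
        return "text"
    try:
        parsed = urlparse(text)
    except ValueError:
        # urlparse can reject malformed hosts; fall back to text analysis.
        return "text"
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "text"
```

With this rule, `https://github.com/user/repo` is routed to the URL pipeline, while an email body that merely contains a link is analyzed as text.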

	## 📊 Model Performance
	- DeBERTa + LSTM: Best for context understanding with attention visualization
	- BERT: Reliable baseline with robust predictions
	- XGBoost: Fast traditional ML approach with feature interpretability
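
Because the three models have different strengths, their outputs are combined in a weighted ensemble. The sketch below shows one common way to do this: a weighted average of per-model phishing probabilities that renormalizes when a model is unavailable. The weights and the model keys are assumptions for illustration, not the values used by the app.

```python
from typing import Dict, Optional

# Hypothetical weights; the real ensemble weights are not documented here.
EXAMPLE_WEIGHTS = {"deberta_lstm": 0.5, "bert": 0.3, "xgboost": 0.2}


def weighted_ensemble(
    probabilities: Dict[str, Optional[float]],
    weights: Dict[str, float],
) -> float:
    """Weighted average of phishing probabilities, skipping models that returned None."""
    available = {name: p for name, p in probabilities.items() if p is not None}
    total_weight = sum(weights[name] for name in available)
    if total_weight <= 0:
        raise ValueError("no model predictions available to combine")
    return sum(weights[name] * p for name, p in available.items()) / total_weight
```

For example, probabilities of 0.9, 0.8, and 0.6 under these example weights combine to 0.81; if the XGBoost prediction is missing, the remaining two are renormalized over a total weight of 0.8.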

	## 🔧 Technical Details
	- All models loaded from Hugging Face Hub for easy deployment
	- Feature extraction modules included for XGBoost functionality
	- Dark theme optimized interface with visual analytics
	- Graceful fallbacks if models fail to load
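
The graceful fallback behavior can be sketched as a loader loop that logs failures and continues with whichever models loaded successfully. `load_available_models` and the loader callables are hypothetical; in the real app each loader would presumably wrap a download from the Hugging Face Hub.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("phishing_detector")


def load_available_models(loaders: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Call each loader and keep only the models that load without raising."""
    loaded: Dict[str, Any] = {}
    for name, loader in loaders.items():
        try:
            loaded[name] = loader()
        except Exception as exc:  # broad on purpose: any load failure should be non-fatal
            logger.warning("Could not load model %s: %s", name, exc)
    return loaded
```

The interface can then enable only the tabs whose model name appears in the returned dictionary.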

	## 📝 Examples
	Try these inputs to see the system in action:
	- `https://github.com/user/repo` (should be benign)
	- `http://suspicious-phishing-site.example` (simulated phishing)
	- Any suspicious email content, pasted as text