title: Phishing Detector
emoji: 🔍
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 4.39.0
app_file: app.py
pinned: false

Phishing Detector

A multi-model phishing detection system that combines two transformer-based classifiers with a gradient-boosted tree model.

🤖 Models

  • DeBERTa + LSTM: Advanced transformer with attention mechanism (khoa-done/phishing-detector)
  • BERT: Fine-tuned BERT model (th1enq/bert_checkpoint)
  • XGBoost: Traditional ML with feature engineering (th1enq/xgboost_checkpoint)

✨ Features

  • URL Structure Analysis: Extract 30+ features from URL patterns
  • HTML Content Analysis: Extract 43+ features from webpage content
  • Combined Predictions: Weighted ensemble of all models
  • Visual Attention Weights: See which tokens influence decisions
  • Real-time Web Scraping: Fetch and analyze live websites
  • Multi-tab Interface: Compare results across different models
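
The URL structure analysis above can be pictured with a small sketch. The feature names below are illustrative assumptions, not the project's actual 30+ feature set, and `extract_url_features` is a hypothetical helper:

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Compute a handful of structural URL features (illustrative subset only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots": host.count("."),          # many subdomains can be a warning sign
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,          # "@" can hide the real host
        "uses_https": parsed.scheme == "https",
        "host_is_ip": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


if __name__ == "__main__":
    for name, value in extract_url_features("https://github.com/user/repo").items():
        print(f"{name}: {value}")
```

A dictionary like this could be turned into a feature vector for a tree model such as XGBoost; how the project actually encodes its features is not documented here.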

🚀 Usage

  1. Enter a URL: The system fetches the webpage and analyzes both the URL structure and the page content
  2. Enter text: The system analyzes the suspicious text directly
  3. Compare Models: Use different tabs to see how each model performs
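
Since the same input box accepts either a URL or free text, the app has to decide which path to take. One possible way to make that decision is sketched below; `classify_input` is a hypothetical helper and the rule it applies is an assumption, not necessarily what `app.py` does:

```python
from urllib.parse import urlparse


def classify_input(user_input: str) -> str:
    """Return "url" if the input looks like a single http(s) URL, else "text"."""
    candidate = user_input.strip()
    # A single URL contains no whitespace; pasted email content almost always does.
    if not candidate or any(ch.isspace() for ch in candidate):
        return "text"
    parsed = urlparse(candidate)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    return "text"


if __name__ == "__main__":
    print(classify_input("https://github.com/user/repo"))
    print(classify_input("Your account is locked, verify it now"))
```

Inputs classified as "url" would go through page fetching and URL/HTML feature extraction, while "text" inputs would be passed straight to the text models.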

📊 Model Performance

  • DeBERTa + LSTM: Best for context understanding with attention visualization
  • BERT: Reliable baseline with robust predictions
  • XGBoost: Fast traditional ML approach with feature interpretability

🔧 Technical Details

  • All models loaded from Hugging Face Hub for easy deployment
  • Feature extraction modules included for XGBoost functionality
  • Dark theme optimized interface with visual analytics
  • Graceful fallbacks if models fail to load
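
To illustrate how a weighted ensemble can degrade gracefully when a model fails to load, here is a minimal sketch. The model keys and weights are made-up assumptions, and `ensemble_score` is a hypothetical helper rather than the project's actual combination logic:

```python
from typing import Dict, Optional

# Hypothetical weights; the real ensemble weights are not documented here.
WEIGHTS = {"deberta_lstm": 0.5, "bert": 0.3, "xgboost": 0.2}


def ensemble_score(
    probs: Dict[str, Optional[float]],
    weights: Dict[str, float] = WEIGHTS,
) -> Optional[float]:
    """Weighted average of phishing probabilities, skipping unavailable models.

    A value of None means the model failed to load or to predict. The remaining
    weights are renormalized so the result stays in the range [0, 1].
    """
    available = {
        name: p
        for name, p in probs.items()
        if p is not None and weights.get(name, 0.0) > 0.0
    }
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0.0:
        return None  # no model produced a prediction
    return sum(weights[name] * p for name, p in available.items()) / total_weight


if __name__ == "__main__":
    # BERT is unavailable in this example, so only two models contribute.
    print(ensemble_score({"deberta_lstm": 0.8, "bert": None, "xgboost": 0.4}))
```

Renormalizing over the models that did respond keeps the combined score comparable whether one, two, or all three models are available.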

📝 Examples

Try these URLs to see the system in action:

  • https://github.com/user/repo (should be benign)
  • http://suspicious-phishing-site.example (simulated phishing)
  • Or paste any suspicious email content for analysis