🧠 BERT-Spam-Job-Posting-Detection-Model

A BERT-based binary classifier fine-tuned to detect whether a job posting is fake or real. Ideal for job portals, recruitment platforms, and fraud detection in job advertisements.

✨ Model Highlights

📌 Based on bert-base-uncased
🔍 Fine-tuned on a custom dataset of job postings labeled as fake or real
⚡ Binary classification: Fake Job Posting vs Real Job Posting
💾 Lightweight and optimized for CPU and GPU inference

🧠 Intended Uses

Automated detection of fraudulent job postings
Job board moderation and quality control
Enhancing recruitment platform security
Improving user trust in job marketplaces
Regulatory compliance monitoring for job ads

🚫 Limitations

Trained primarily on English-language job postings
May underperform on postings from less-represented industries or regions
Inputs are truncated to 128 tokens, so content beyond that point in longer job descriptions is ignored
Not suitable for multilingual or multimedia job posting content
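One common workaround for the 128-token limit is to split a long posting into overlapping windows and classify each window separately. The sketch below shows only the windowing step on plain lists of token ids. The helper name `chunk_token_ids`, the window of 126 (leaving room for the `[CLS]` and `[SEP]` tokens) and the stride of 64 are assumptions for illustration, not part of this model's pipeline.

```python
def chunk_token_ids(token_ids, window=126, stride=64):
    """Split a list of token ids into overlapping windows.

    window=126 is an assumption: it leaves two positions for the [CLS]
    and [SEP] tokens inside the 128-token limit. stride=64 is also an
    assumption and controls how much consecutive windows overlap.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    chunks = []
    start = 0
    while True:
        chunks.append(token_ids[start:start + window])
        # Stop once the current window reaches the end of the input
        if start + window >= len(token_ids):
            break
        start += stride
    return chunks


# Stand-in ids for a 300-token posting (real ids would come from the tokenizer)
ids = list(range(300))
chunks = chunk_token_ids(ids)
print(len(chunks), [len(c) for c in chunks])
```

How the per-window predictions are combined is a separate design choice. Flagging the posting if any window is predicted fake is one conservative option, though the model was not evaluated that way.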

🏋️‍♂️ Training Details

| Field | Value |
|---|---|
| Base Model | `bert-base-uncased` |
| Dataset | Custom labeled job postings |
| Framework | PyTorch with Transformers |
| Epochs | 3 |
| Batch Size | 16 |
| Max Length | 128 tokens |
| Optimizer | AdamW |
| Loss | CrossEntropyLoss |
| Device | CUDA-enabled GPU |
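To make the hyperparameters above concrete, the sketch below collects them in a small config object and works out how many optimizer steps a run would take. The `TrainingConfig` class and `total_optimizer_steps` helper are hypothetical names, and the dataset size of 10,000 is a made-up figure, since the card does not state the real one. The learning rate is also not given, so it is left out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values taken from the training details table above
    base_model: str = "bert-base-uncased"
    epochs: int = 3
    batch_size: int = 16
    max_length: int = 128
    optimizer: str = "AdamW"
    loss: str = "CrossEntropyLoss"


def total_optimizer_steps(num_examples, config):
    """Optimizer steps for a full run.

    Assumes one step per batch (no gradient accumulation) and that a
    partial final batch is kept rather than dropped.
    """
    steps_per_epoch = math.ceil(num_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = TrainingConfig()
# 10_000 is a hypothetical dataset size, used only for the arithmetic
print(total_optimizer_steps(10_000, config))  # 625 steps per epoch * 3 epochs = 1875
```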

📊 Evaluation Metrics

| Metric | Score |
|---|---|
| Accuracy | 0.97 |
| Precision | 0.81 |
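Accuracy and precision answer different questions, which is why the two scores can sit far apart when fake postings are rare. The sketch below computes both from confusion-matrix counts. The counts are hypothetical, chosen only so the arithmetic lands on the same two figures. They are not this model's actual confusion matrix, which the card does not report.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all postings that were labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of postings flagged as fake that really were fake."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


# Hypothetical counts for 1,000 postings, "fake" being the positive class
tp, fp, fn, tn = 81, 19, 11, 889

print(round(accuracy(tp, fp, fn, tn), 2))  # 970 correct out of 1000
print(round(precision(tp, fp), 2))         # 81 true fakes out of 100 flagged
```

With counts like these, most of the accuracy comes from the large number of real postings classified correctly, while roughly one in five flagged postings would be a false alarm.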

🚀 Usage

from transformers import BertTokenizerFast, BertForSequenceClassification
import torch

model_name = "AventIQ-AI/BERT-Spam-Job-Posting-Detection-Model"
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

def predict_with_bert(text):
    """Classify a single job posting as "Fake Job" or "Real Job"."""
    # Tokenize and truncate to the 128-token length used during training
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    # Move the input tensors to the same device as the model
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits

    # Take the argmax over the class dimension; label 1 is treated as fake
    predicted_class_id = logits.argmax(dim=-1).item()
    return "Fake Job" if predicted_class_id == 1 else "Real Job"

# Example
print(predict_with_bert("Hiring remote data entry clerk for a large online project. Apply now."))
print(predict_with_bert("Looking for a Software Engineer with 5+ years of experience in Python."))
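The function above returns only a label. For moderation workflows it can help to see a probability and apply a stricter threshold before flagging a posting. The sketch below shows the arithmetic on a plain list of two logits. The helper names, the example logits and the 0.9 threshold are all made up for illustration, and index 1 is taken to mean "fake" to match `predict_with_bert`. With the loaded model, `torch.softmax(logits, dim=-1)` would give the same probabilities.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fake_probability(logits):
    """Probability of the fake class, assuming index 1 means fake."""
    return softmax(logits)[1]


def decide(logits, threshold=0.5):
    """Flag as fake only when the fake probability exceeds the threshold."""
    return "Fake Job" if fake_probability(logits) > threshold else "Real Job"


# Made-up logits in [real, fake] order, not actual model output
example_logits = [-2.0, 2.0]
print(round(fake_probability(example_logits), 3))
print(decide(example_logits, threshold=0.9))
```

With the default threshold of 0.5 this matches the argmax decision. Raising the threshold trades missed fake postings for fewer false alarms.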

🗂 Repository Structure

.
├── model/               # Quantized model files
├── tokenizer_config/    # Tokenizer and vocab files
├── model.safetensors/   # Fine-tuned model in safetensors format
└── README.md            # Model card

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to open a pull request or raise an issue.