🧠 BERT-Spam-Job-Posting-Detection-Model

A BERT-based binary classifier fine-tuned to detect whether a job posting is fake or real. It is intended for job portals, recruitment platforms, and fraud detection in job advertisements.


✨ Model Highlights

  • 📌 Based on bert-base-uncased
  • 🔍 Fine-tuned on a custom dataset of job postings labeled as fake or real
  • ⚡ Binary classification: Fake Job Posting vs Real Job Posting
  • 💾 Lightweight and optimized for CPU and GPU inference

🧠 Intended Uses

  • Automated detection of fraudulent job postings
  • Job board moderation and quality control
  • Enhancing recruitment platform security
  • Improving user trust in job marketplaces
  • Regulatory compliance monitoring for job ads

🚫 Limitations

  • Trained primarily on English-language job postings
  • May underperform on postings from less-represented industries or regions
  • Inputs are truncated to 128 tokens, so content beyond that point in longer job descriptions is ignored
  • Not suitable for multilingual or multimedia job posting content
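
One possible workaround for the 128-token limit, which this model card does not itself describe, is to split a long posting into overlapping windows, classify each window, and flag the posting if any window is flagged. The sketch below is only an illustration under stated assumptions: `classify` is a hypothetical callable that returns 1 for fake and 0 for real, the default window of 126 tokens leaves room for BERT's `[CLS]` and `[SEP]` tokens, and the step of 96 is an arbitrary choice.

```python
from typing import Callable, List, Sequence


def sliding_windows(tokens: Sequence[str], size: int = 126, stride: int = 96) -> List[List[str]]:
    """Split tokens into overlapping windows of at most `size` tokens.

    `stride` is the step between window starts; it must not exceed `size`,
    otherwise some tokens would fall between windows.
    """
    if size <= 0 or stride <= 0 or stride > size:
        raise ValueError("require 0 < stride <= size")
    if len(tokens) <= size:
        return [list(tokens)]
    windows = []
    start = 0
    while True:
        windows.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window reached the end of the text
        start += stride
    return windows


def classify_long_posting(
    tokens: Sequence[str],
    classify: Callable[[List[str]], int],  # hypothetical: 1 = fake, 0 = real
    size: int = 126,
    stride: int = 96,
) -> int:
    """Return 1 (fake) if any window is classified as fake, otherwise 0."""
    return int(any(classify(w) == 1 for w in sliding_windows(tokens, size, stride)))


# Illustration with a stand-in classifier (NOT the fine-tuned model).
tokens = [str(i) for i in range(300)]
print(len(sliding_windows(tokens)))
print(classify_long_posting(tokens, lambda window: 1 if "250" in window else 0))
```

In practice the windows would be built from tokenizer output rather than from plain strings. Fast tokenizers in Transformers can produce overlapping windows through `return_overflowing_tokens=True` together with `stride`, where `stride` is the overlap between windows rather than the step used here.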

🏋️‍♂️ Training Details

| Field | Value |
|---|---|
| Base Model | bert-base-uncased |
| Dataset | Custom labeled job postings |
| Framework | PyTorch with Transformers |
| Epochs | 3 |
| Batch Size | 16 |
| Max Length | 128 tokens |
| Optimizer | AdamW |
| Loss | CrossEntropyLoss |
| Device | CUDA-enabled GPU |
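
As a rough illustration of what these hyperparameters imply, the sketch below collects them in a small config object and computes how many optimizer steps a run would take. The class name `TrainingConfig` and the dataset size of 10,000 are hypothetical (the card does not state the dataset size or learning rate), and the arithmetic assumes no gradient accumulation and that the final partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters from the training details (hypothetical container)."""
    base_model: str = "bert-base-uncased"
    epochs: int = 3
    batch_size: int = 16
    max_length: int = 128
    optimizer: str = "AdamW"
    loss: str = "CrossEntropyLoss"

    def steps_per_epoch(self, num_examples: int) -> int:
        # One optimizer step per batch; the last batch may be smaller.
        return math.ceil(num_examples / self.batch_size)

    def total_steps(self, num_examples: int) -> int:
        return self.epochs * self.steps_per_epoch(num_examples)


config = TrainingConfig()
# 10_000 is an assumed dataset size, used only to show the arithmetic.
print(config.steps_per_epoch(10_000), config.total_steps(10_000))
```

The total step count is what a learning-rate scheduler would typically need if one were added to such a setup.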

📊 Evaluation Metrics

| Metric | Score |
|---|---|
| Accuracy | 0.97 |
| Precision | 0.81 |
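
Accuracy and precision can differ noticeably when one class is much rarer than the other, which may be the case here, although the card does not state the class balance. The sketch below shows how both metrics (and recall, which the card does not report) follow from confusion-matrix counts. The labels in the example are made up for illustration and are not this model's evaluation data.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count outcomes for binary labels where 1 = fake and 0 = real."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            counts["tp"] += 1  # fake posting correctly flagged
        elif pred == 1 and truth == 0:
            counts["fp"] += 1  # real posting wrongly flagged
        elif pred == 0 and truth == 0:
            counts["tn"] += 1  # real posting correctly passed
        else:
            counts["fn"] += 1  # fake posting missed
    return counts


def accuracy(c: Dict[str, int]) -> float:
    total = c["tp"] + c["fp"] + c["tn"] + c["fn"]
    return (c["tp"] + c["tn"]) / total if total else 0.0


def precision(c: Dict[str, int]) -> float:
    flagged = c["tp"] + c["fp"]
    return c["tp"] / flagged if flagged else 0.0


def recall(c: Dict[str, int]) -> float:
    actual_fake = c["tp"] + c["fn"]
    return c["tp"] / actual_fake if actual_fake else 0.0


# Illustrative labels only: 10 fake and 90 real postings.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88
c = confusion_counts(y_true, y_pred)
print(c, accuracy(c), precision(c), recall(c))
```

With these made-up labels, 96 of 100 predictions are correct while only 8 of the 10 flagged postings are actually fake, so accuracy is higher than precision.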

🚀 Usage

from transformers import BertTokenizerFast, BertForSequenceClassification
import torch

model_name = "AventIQ-AI/BERT-Spam-Job-Posting-Detection-Model"
tokenizer = BertTokenizerFast.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

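def predict_with_bert(text):
    # Tokenize a single posting; anything past 128 tokens is truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)
    device = next(model.parameters()).device  # Get model device (cpu or cuda)
    # Move the input tensors to the same device as the model.
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        logits = model(**inputs).logits

    # Take the argmax over the class dimension; class id 1 is treated as fake.
    predicted_class_id = logits.argmax(dim=-1).item()
    return "Fake Job" if predicted_class_id == 1 else "Real Job"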

# Example
print(predict_with_bert("Hiring remote data entry clerk for a large online project. Apply now."))
print(predict_with_bert("Looking for a Software Engineer with 5+ years of experience in Python."))
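
`predict_with_bert` returns only a hard label. For moderation workflows it may be useful to also see the probability of the fake class and to apply a tunable threshold. The following is a minimal sketch in plain Python, assuming the two logits are passed in as a list (for example from `logits[0].tolist()`) and that index 1 is the fake class, as in the function above. The helper names and the threshold values are illustrative, not part of the model's API.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_threshold(logits: Sequence[float], fake_threshold: float = 0.5) -> Tuple[str, float]:
    """Return the label and the fake-class probability.

    Index 1 is assumed to be the fake class. With the default threshold of 0.5
    the label agrees with taking the argmax of the two logits.
    """
    p_fake = softmax(logits)[1]
    label = "Fake Job" if p_fake > fake_threshold else "Real Job"
    return label, p_fake


# Made-up logits, used only to show the effect of the threshold.
print(label_with_threshold([-1.0, 2.0]))
print(label_with_threshold([-1.0, 2.0], fake_threshold=0.99))
```

Raising the threshold flags fewer postings and trades recall for precision. A suitable value would need to be chosen on held-out data.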

🗂 Repository Structure

.
├── model/               # Quantized model files
├── tokenizer_config/    # Tokenizer and vocab files
├── model.safetensors    # Fine-tuned model in safetensors format
└── README.md            # Model card

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to open a pull request or raise an issue.
