# Custom BERT NER Model
This repository contains a BERT-based Named Entity Recognition (NER) model fine-tuned on the CoNLL-2003 dataset. The model identifies the four CoNLL-2003 entity types: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).
## Model Details
- Model architecture: BERT (bert-base-cased)
- Task: Token classification / Named Entity Recognition (NER)
- Training data: CoNLL-2003 dataset (~14,000 training samples)
- Number of epochs: 5
- Framework: Hugging Face Transformers + Datasets
- Device: CUDA-enabled GPU for training and inference
- WandB: Disabled during training
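The label set follows the standard CoNLL-2003 BIO scheme. As a sketch of what to expect (the authoritative mapping lives in `model.config.id2label` and may differ):

```python
# Typical CoNLL-2003 BIO label mapping (assumption; verify against model.config.id2label).
id2label = {
    0: "O",
    1: "B-PER", 2: "I-PER",
    3: "B-ORG", 4: "I-ORG",
    5: "B-LOC", 6: "I-LOC",
    7: "B-MISC", 8: "I-MISC",
}
```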
## Usage
You can use this model for token classification to identify named entities in your text.
### Installation
```bash
pip install transformers datasets torch
```
### Load the model and tokenizer
```python
from transformers import BertTokenizerFast, BertForTokenClassification
import torch

model_name_or_path = "AventIQ-AI/Custom-BERT-NER-Model"
tokenizer = BertTokenizerFast.from_pretrained(model_name_or_path)
model = BertForTokenClassification.from_pretrained(model_name_or_path)

# Use a GPU if one is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()
```
### Example inference
```python
text = "Hi, I am Deepak and I am living in Delhi."
tokens = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model(**tokens)
predictions = torch.argmax(outputs.logits, dim=2)
labels = [model.config.id2label[p.item()] for p in predictions[0]]
# Convert the input ids back to tokens so they line up with the predictions
# (tokenizer.tokenize() omits the [CLS]/[SEP] tokens and would misalign the labels).
word_tokens = tokenizer.convert_ids_to_tokens(tokens["input_ids"][0])
for token, label in zip(word_tokens, labels):
    print(f"{token}: {label}")
```
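If you want word-level entity spans rather than per-sub-word labels, the Transformers `pipeline` API can aggregate them. A minimal sketch (the `aggregation_strategy` value is a common choice, not something specified by this model card):

```python
from transformers import pipeline

# Token-classification pipeline that merges B-/I- sub-word pieces into entity spans.
ner = pipeline(
    "token-classification",
    model=model_name_or_path,
    aggregation_strategy="simple",
)

print(ner("Hi, I am Deepak and I am living in Delhi."))
# Expected shape of the output (values are illustrative):
# [{'entity_group': 'PER', 'word': 'Deepak', ...}, {'entity_group': 'LOC', 'word': 'Delhi', ...}]
```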
## Training Details
- Dataset: CoNLL-2003, loaded via the Hugging Face `datasets` library
- Optimizer: AdamW
- Learning rate: 5e-5
- Batch size: 16
- Max sequence length: 128
- Epochs: 5
- Evaluation: performed on the CoNLL-2003 validation split
- Quantization: optional post-training quantization for model size reduction
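A minimal fine-tuning sketch consistent with the settings above is shown below. The output directory, function names, and label-alignment details are illustrative assumptions, not the original training code:

```python
# Illustrative fine-tuning sketch; the actual training script may differ.
from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForTokenClassification,
    DataCollatorForTokenClassification,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("conll2003")
label_list = dataset["train"].features["ner_tags"].feature.names

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(label_list)
)

def tokenize_and_align_labels(examples):
    # Tokenize pre-split words and align word-level tags to sub-word tokens;
    # special tokens get the label -100 so they are ignored by the loss.
    tokenized = tokenizer(
        examples["tokens"],
        truncation=True,
        max_length=128,
        is_split_into_words=True,
    )
    tokenized["labels"] = [
        [-100 if w is None else tags[w] for w in tokenized.word_ids(batch_index=i)]
        for i, tags in enumerate(examples["ner_tags"])
    ]
    return tokenized

tokenized_dataset = dataset.map(tokenize_and_align_labels, batched=True)

training_args = TrainingArguments(
    output_dir="bert-ner-conll2003",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=5,
    report_to="none",  # WandB disabled
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
    tokenizer=tokenizer,
)
trainer.train()
```

For the optional post-training quantization step, one possibility is PyTorch dynamic quantization of the linear layers; the card does not specify which method was actually used:

```python
import torch

# Dynamic quantization shrinks the linear layers to int8 for CPU inference.
quantized_model = torch.quantization.quantize_dynamic(
    model.to("cpu"), {torch.nn.Linear}, dtype=torch.qint8
)
```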
## Limitations
- The model may not generalize well to entity types or domains outside CoNLL-2003.
- It can occasionally mislabel entities, especially rare or previously unseen names.
- A CUDA-enabled GPU is recommended for efficient training and inference.