Custom BERT NER Model

This repository contains a BERT-based Named Entity Recognition (NER) model fine-tuned on the CoNLL-2003 dataset. The model identifies the dataset's four entity types: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).


Model Details

  • Model architecture: BERT (bert-base-cased)
  • Task: Token classification / Named Entity Recognition (NER)
  • Training data: CoNLL-2003 dataset (~14,000 training samples)
  • Number of epochs: 5
  • Framework: Hugging Face Transformers + Datasets
  • Device: CUDA-enabled GPU for training and inference
  • WandB: Disabled during training
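
The model predicts the CoNLL-2003 tag set, i.e. IOB-style labels for the four entity types: O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC. As a quick sanity check, the label mapping can be printed from the hosted config (this sketch assumes the fine-tuned id2label mapping was saved with the checkpoint, which the card does not state):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("AventIQ-AI/Custom-BERT-NER-Model")
print(config.id2label)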

Usage

You can use this model for token classification to identify named entities in your text.

Installation

pip install transformers datasets torch

Load the model and tokenizer


from transformers import BertTokenizerFast, BertForTokenClassification
import torch

model_name_or_path = "AventIQ-AI/Custom-BERT-NER-Model"

tokenizer = BertTokenizerFast.from_pretrained(model_name_or_path)
model = BertForTokenClassification.from_pretrained(model_name_or_path)

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

Example inference


text = "Hi, I am Deepak and I am living in Delhi."

tokens = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model(**tokens)
predictions = torch.argmax(outputs.logits, dim=2)

labels = [model.config.id2label[p.item()] for p in predictions[0]]
for token, label in zip(tokenizer.tokenize(text), labels):
print(f"{token}: {label}")

Training Details

  • Dataset: CoNLL-2003, loaded via the Hugging Face datasets library

  • Optimizer: AdamW

  • Learning Rate: 5e-5

  • Batch Size: 16

  • Max Sequence Length: 128

  • Epochs: 5

  • Evaluation: Performed on validation split (if applicable)

  • Quantization: Applied post-training for model size reduction (optional)
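
These hyperparameters map onto a standard Hugging Face Trainer run. The script below is a hedged reconstruction of such a setup, not the exact training code used for this checkpoint; in particular the label-alignment details are an assumption:

from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForTokenClassification,
    DataCollatorForTokenClassification,
    TrainingArguments,
    Trainer,
)

dataset = load_dataset("conll2003")
label_list = dataset["train"].features["ner_tags"].feature.names

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(label_list),
    id2label=dict(enumerate(label_list)),
    label2id={label: i for i, label in enumerate(label_list)},
)

def tokenize_and_align(batch):
    # Tokenize pre-split words and label only the first sub-token of each word.
    encoded = tokenizer(
        batch["tokens"], is_split_into_words=True, truncation=True, max_length=128
    )
    all_labels = []
    for i, tags in enumerate(batch["ner_tags"]):
        word_ids = encoded.word_ids(batch_index=i)
        previous, labels = None, []
        for word_id in word_ids:
            if word_id is None or word_id == previous:
                labels.append(-100)  # ignored by the loss (special/extra sub-tokens)
            else:
                labels.append(tags[word_id])
            previous = word_id
        all_labels.append(labels)
    encoded["labels"] = all_labels
    return encoded

tokenized = dataset.map(tokenize_and_align, batched=True)

args = TrainingArguments(
    output_dir="custom-bert-ner",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    report_to="none",  # WandB disabled
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()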

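The quantization step is not described further in the card. One common post-training option is PyTorch dynamic quantization of the linear layers, sketched below as an assumption rather than the method actually used:

import torch

# Dynamic quantization stores Linear weights as int8, shrinking the model for CPU inference.
quantized_model = torch.quantization.quantize_dynamic(
    model.to("cpu"), {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized_model.state_dict(), "bert_ner_quantized.pt")
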
Limitations

  • The model may not generalize well to unseen entity types or domains outside CoNLL-2003.

  • It can occasionally mislabel entities, especially for rare or new names.

  • A CUDA-enabled GPU is recommended; training and inference also run on CPU but are noticeably slower.