Custom BERT NER Model

This repository contains a BERT-based Named Entity Recognition (NER) model fine-tuned on the CoNLL-2003 dataset. The model identifies the dataset's four entity types: persons (PER), organizations (ORG), locations (LOC), and miscellaneous entities (MISC).


Model Details

  • Model architecture: BERT (bert-base-cased)
  • Task: Token classification / Named Entity Recognition (NER)
  • Training data: CoNLL-2003 dataset (~14,000 training samples)
  • Number of epochs: 5
  • Framework: Hugging Face Transformers + Datasets
  • Device: CUDA-enabled GPU for training and inference
  • WandB: Disabled during training
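
The model predicts the CoNLL-2003 tag set, i.e. IOB-style labels for the four entity types: O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC. As a quick sanity check, the label mapping can be printed from the hosted config (this sketch assumes the fine-tuned id2label mapping was saved with the checkpoint, which the card does not state):

from transformers import AutoConfig

config = AutoConfig.from_pretrained("AventIQ-AI/Custom-BERT-NER-Model")
print(config.id2label)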

Usage

You can use this model for token classification to identify named entities in your text.

Installation

pip install transformers datasets torch

Load the model and tokenizer


from transformers import BertTokenizerFast, BertForTokenClassification
import torch

model_name_or_path = "AventIQ-AI/Custom-BERT-NER-Model"

tokenizer = BertTokenizerFast.from_pretrained(model_name_or_path)
model = BertForTokenClassification.from_pretrained(model_name_or_path)

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
model.eval()

Example inference


text = "Hi, I am Deepak and I am living in Delhi."

tokens = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model(**tokens)
predictions = torch.argmax(outputs.logits, dim=2)

labels = [model.config.id2label[p.item()] for p in predictions[0]]
for token, label in zip(tokenizer.tokenize(text), labels):
print(f"{token}: {label}")

Training Details

  • Dataset: CoNLL-2003, loaded via the Hugging Face datasets library

  • Optimizer: AdamW

  • Learning Rate: 5e-5

  • Batch Size: 16

  • Max Sequence Length: 128

  • Epochs: 5

  • Evaluation: Performed on validation split (if applicable)

  • Quantization: Applied post-training for model size reduction (optional)
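
These hyperparameters map onto a standard Hugging Face Trainer run. The script below is a hedged reconstruction of such a setup, not the exact training code used for this checkpoint; in particular the label-alignment details are an assumption:

from datasets import load_dataset
from transformers import (
    BertTokenizerFast,
    BertForTokenClassification,
    DataCollatorForTokenClassification,
    TrainingArguments,
    Trainer,
)

dataset = load_dataset("conll2003")
label_list = dataset["train"].features["ner_tags"].feature.names

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(label_list),
    id2label=dict(enumerate(label_list)),
    label2id={label: i for i, label in enumerate(label_list)},
)

def tokenize_and_align(batch):
    # Tokenize pre-split words and label only the first sub-token of each word.
    encoded = tokenizer(
        batch["tokens"], is_split_into_words=True, truncation=True, max_length=128
    )
    all_labels = []
    for i, tags in enumerate(batch["ner_tags"]):
        word_ids = encoded.word_ids(batch_index=i)
        previous, labels = None, []
        for word_id in word_ids:
            if word_id is None or word_id == previous:
                labels.append(-100)  # ignored by the loss (special/extra sub-tokens)
            else:
                labels.append(tags[word_id])
            previous = word_id
        all_labels.append(labels)
    encoded["labels"] = all_labels
    return encoded

tokenized = dataset.map(tokenize_and_align, batched=True)

args = TrainingArguments(
    output_dir="custom-bert-ner",
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    num_train_epochs=5,
    report_to="none",  # WandB disabled
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
)
trainer.train()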

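The quantization step is not described further in the card. One common post-training option is PyTorch dynamic quantization of the linear layers, sketched below as an assumption rather than the method actually used:

import torch

# Dynamic quantization stores Linear weights as int8, shrinking the model for CPU inference.
quantized_model = torch.quantization.quantize_dynamic(
    model.to("cpu"), {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized_model.state_dict(), "bert_ner_quantized.pt")
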
Limitations

  • The model may not generalize well to unseen entity types or domains outside CoNLL-2003.

  • It can occasionally mislabel entities, especially for rare or new names.

  • A CUDA-enabled GPU is recommended; training and inference also run on CPU but are noticeably slower.