boltuix/EntityBERT

boltuix committed 17 days ago

Commit ceaa928 (verified) · 1 parent: a1b3fd2

Update README.md

Files changed (1)

README.md +13 -13

@@ -44,12 +44,12 @@ base_model:
 ![Banner](https://via.placeholder.com/1200x400.png?text=EntityBERT+NER+Model)
-# 🌟 EntityBERT-NER Model 🌟
 ## 🚀 Model Details
 ### 🌈 Description
-The `boltuix/EntityBERT-NER` model is a fine-tuned transformer for **Named Entity Recognition (NER)**, built on the lightweight `boltuix/bert-mini` base model. It excels at identifying 36 entity types, including people, locations, organizations, dates, times, phone numbers, emails, URLs, and more, in English text. Designed for efficiency and high accuracy, it’s perfect for real-time applications like information extraction, chatbots, and knowledge graph construction across domains such as travel, medical, logistics, and education.
 - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner) (~143,709 entries, 6.38 MB)
 - **Entity Types**: 36 NER tags (18 entity categories with B-/I- tags + O)
@@ -68,7 +68,7 @@ The `boltuix/EntityBERT-NER` model is a fine-tuned transformer for **Named Entit
 - **Parameters**: ~11M
 ### 🔗 Links
-- **Model Repository**: [boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)
 - **Dataset**: [boltuix/conll2025-ner](#download-instructions)
 - **Hugging Face Docs**: [Transformers](https://huggingface.co/docs/transformers)
 - **Demo**: Available at [boltuix.github.io/demo](https://boltuix.github.io/demo) (coming soon)
@@ -102,8 +102,8 @@ Use the model for NER with the following Python code:
 from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
 # Load model and tokenizer
-tokenizer = AutoTokenizer.from_pretrained("boltuix/EntityBERT-NER")
-model = AutoModelForTokenClassification.from_pretrained("boltuix/EntityBERT-NER")
 # Create NER pipeline
 nlp = pipeline("token-classification", model=model, tokenizer=tokenizer)
@@ -231,7 +231,7 @@ These high scores demonstrate the model’s ability to accurately identify entit
 ## 🧠 Training the Model
-Fine-tune the `boltuix/bert-mini` model on the `boltuix/conll2025-ner` dataset to replicate or extend `EntityBERT-NER`. Below is a training script:
 ```python
 # Install dependencies
@@ -296,7 +296,7 @@ model = AutoModelForTokenClassification.from_pretrained("boltuix/bert-mini", num
 # Training arguments
 args = TrainingArguments(
-    output_dir="boltuix/entitybert-ner",
     eval_strategy="epoch",
     learning_rate=2e-5,
     per_device_train_batch_size=16,
@@ -347,8 +347,8 @@ trainer = Trainer(
 trainer.train()
 # Save model
-trainer.save_model("boltuix/entitybert-ner")
-tokenizer.save_pretrained("boltuix/entitybert-ner")
 ```
 ### 🛠️ Tips
@@ -381,7 +381,7 @@ pip install transformers torch pandas pyarrow seqeval
 - **Optional**: NVIDIA CUDA for GPU acceleration
 ### Download Instructions 📥
-- **Model**: [boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)
 - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner)
 - Load with Hugging Face `datasets` or pandas.
@@ -394,7 +394,7 @@ Evaluate the model on custom data:
 from transformers import pipeline
 # Load NER pipeline
-nlp = pipeline("token-classification", model="boltuix/EntityBERT-NER")
 # Test data
 text = "Book a Lyft from Metropolis on December 1, 2025, contact [email protected]."
@@ -468,7 +468,7 @@ plt.show()
 ## ⚖️ Comparison to Other Models
 | Model                | Dataset            | Parameters | F1 Score | Size   |
 |----------------------|--------------------|------------|----------|--------|
-| **EntityBERT-NER**   | conll2025-ner      | ~11M       | 0.89     | ~50 MB |
 | BERT-base-NER        | CoNLL-2003         | ~110M      | ~0.89    | ~400 MB|
 | DistilBERT-NER       | CoNLL-2003         | ~66M       | ~0.85    | ~200 MB|
@@ -481,7 +481,7 @@ plt.show()
 ## 🌐 Community and Support
 - 📍 Explore: [Hugging Face Community](https://huggingface.co/community)
-- 🛠️ Contribute: [boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)
 - 💬 Discuss: [Hugging Face Forums](https://huggingface.co/discussions)
 - 📚 Learn: [Transformers Docs](https://huggingface.co/docs/transformers)
 - 📧 Contact: Boltuix at [[email protected]](mailto:[email protected])

 ![Banner](https://via.placeholder.com/1200x400.png?text=EntityBERT+NER+Model)
+# 🌟 EntityBERT Model 🌟
 ## 🚀 Model Details
 ### 🌈 Description
+The `boltuix/EntityBERT` model is a fine-tuned transformer for **Named Entity Recognition (NER)**, built on the lightweight `boltuix/bert-mini` base model. It excels at identifying 36 entity types, including people, locations, organizations, dates, times, phone numbers, emails, URLs, and more, in English text. Designed for efficiency and high accuracy, it’s perfect for real-time applications like information extraction, chatbots, and knowledge graph construction across domains such as travel, medical, logistics, and education.
 - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner) (~143,709 entries, 6.38 MB)
 - **Entity Types**: 36 NER tags (18 entity categories with B-/I- tags + O)
 - **Parameters**: ~11M
 ### 🔗 Links
+- **Model Repository**: [boltuix/EntityBERT](https://huggingface.co/boltuix/EntityBERT)
 - **Dataset**: [boltuix/conll2025-ner](#download-instructions)
 - **Hugging Face Docs**: [Transformers](https://huggingface.co/docs/transformers)
 - **Demo**: Available at [boltuix.github.io/demo](https://boltuix.github.io/demo) (coming soon)
 from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
 # Load model and tokenizer
+tokenizer = AutoTokenizer.from_pretrained("boltuix/EntityBERT")
+model = AutoModelForTokenClassification.from_pretrained("boltuix/EntityBERT")
 # Create NER pipeline
 nlp = pipeline("token-classification", model=model, tokenizer=tokenizer)
 ## 🧠 Training the Model
+Fine-tune the `boltuix/bert-mini` model on the `boltuix/conll2025-ner` dataset to replicate or extend `EntityBERT`. Below is a training script:
 ```python
 # Install dependencies
 # Training arguments
 args = TrainingArguments(
+    output_dir="boltuix/EntityBERT",
     eval_strategy="epoch",
     learning_rate=2e-5,
     per_device_train_batch_size=16,
 trainer.train()
 # Save model
+trainer.save_model("boltuix/EntityBERT")
+tokenizer.save_pretrained("boltuix/EntityBERT")
 ```
 ### 🛠️ Tips
 - **Optional**: NVIDIA CUDA for GPU acceleration
 ### Download Instructions 📥
+- **Model**: [boltuix/EntityBERT](https://huggingface.co/boltuix/EntityBERT)
 - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner)
 - Load with Hugging Face `datasets` or pandas.
 from transformers import pipeline
 # Load NER pipeline
+nlp = pipeline("token-classification", model="boltuix/EntityBERT")
 # Test data
 text = "Book a Lyft from Metropolis on December 1, 2025, contact [email protected]."
 ## ⚖️ Comparison to Other Models
 | Model                | Dataset            | Parameters | F1 Score | Size   |
 |----------------------|--------------------|------------|----------|--------|
+| **EntityBERT**   | conll2025-ner      | ~11M       | 0.89     | ~50 MB |
 | BERT-base-NER        | CoNLL-2003         | ~110M      | ~0.89    | ~400 MB|
 | DistilBERT-NER       | CoNLL-2003         | ~66M       | ~0.85    | ~200 MB|
 ## 🌐 Community and Support
 - 📍 Explore: [Hugging Face Community](https://huggingface.co/community)
+- 🛠️ Contribute: [boltuix/EntityBERT](https://huggingface.co/boltuix/EntityBERT)
 - 💬 Discuss: [Hugging Face Forums](https://huggingface.co/discussions)
 - 📚 Learn: [Transformers Docs](https://huggingface.co/docs/transformers)
 - 📧 Contact: Boltuix at [[email protected]](mailto:[email protected])
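
All 13 removed and 13 added lines swap the old repository name (`boltuix/EntityBERT-NER`, or its lowercase form `boltuix/entitybert-ner`) for `boltuix/EntityBERT`. The following is a minimal sketch of how the same substitution could be scripted and checked. It is not part of the commit: the helper name `rename_model_refs` and the sample text are hypothetical, and the case-insensitive pattern is an assumption based on the lowercase paths this diff also replaces.

```python
import re
from typing import Tuple

# Assumption: every old reference contains "EntityBERT-NER" in some casing.
# The banner URL uses "EntityBERT+NER" and the dataset is "conll2025-ner",
# so neither matches this pattern and both are left alone.
OLD_NAME = re.compile(r"EntityBERT-NER", flags=re.IGNORECASE)
NEW_NAME = "EntityBERT"


def rename_model_refs(text: str) -> Tuple[str, int]:
    """Return the rewritten text and the number of lines that changed."""
    changed = 0
    out_lines = []
    for line in text.splitlines(keepends=True):
        new_line = OLD_NAME.sub(NEW_NAME, line)
        if new_line != line:
            changed += 1
        out_lines.append(new_line)
    return "".join(out_lines), changed


# Hypothetical sample built from lines that appear in this diff.
sample = (
    'tokenizer = AutoTokenizer.from_pretrained("boltuix/EntityBERT-NER")\n'
    'trainer.save_model("boltuix/entitybert-ner")\n'
    "- **Dataset**: boltuix/conll2025-ner\n"
)
new_text, changed = rename_model_refs(sample)
print(changed)
```

Run against the full README.md, the changed-line count could be compared with the `+13 -13` reported for this commit as a quick sanity check.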