boltuix committed on
Commit ceaa928 · verified · 1 Parent(s): a1b3fd2

Update README.md

Files changed (1)
  1. README.md +13 -13
README.md CHANGED
@@ -44,12 +44,12 @@ base_model:
44
 
45
  ![Banner](https://via.placeholder.com/1200x400.png?text=EntityBERT+NER+Model)
46
 
47
- # 🌟 EntityBERT-NER Model 🌟
48
 
49
  ## 🚀 Model Details
50
 
51
  ### 🌈 Description
52
- The `boltuix/EntityBERT-NER` model is a fine-tuned transformer for **Named Entity Recognition (NER)**, built on the lightweight `boltuix/bert-mini` base model. It excels at identifying 36 entity types, including people, locations, organizations, dates, times, phone numbers, emails, URLs, and more, in English text. Designed for efficiency and high accuracy, it’s perfect for real-time applications like information extraction, chatbots, and knowledge graph construction across domains such as travel, medical, logistics, and education.
53
 
54
  - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner) (~143,709 entries, 6.38 MB)
55
  - **Entity Types**: 36 NER tags (18 entity categories with B-/I- tags + O)
@@ -68,7 +68,7 @@ The `boltuix/EntityBERT-NER` model is a fine-tuned transformer for **Named Entit
68
  - **Parameters**: ~11M
69
 
70
  ### 🔗 Links
71
- - **Model Repository**: [boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)
72
  - **Dataset**: [boltuix/conll2025-ner](#download-instructions)
73
  - **Hugging Face Docs**: [Transformers](https://huggingface.co/docs/transformers)
74
  - **Demo**: Available at [boltuix.github.io/demo](https://boltuix.github.io/demo) (coming soon)
@@ -102,8 +102,8 @@ Use the model for NER with the following Python code:
102
  from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
103
 
104
  # Load model and tokenizer
105
- tokenizer = AutoTokenizer.from_pretrained("boltuix/EntityBERT-NER")
106
- model = AutoModelForTokenClassification.from_pretrained("boltuix/EntityBERT-NER")
107
 
108
  # Create NER pipeline
109
  nlp = pipeline("token-classification", model=model, tokenizer=tokenizer)
@@ -231,7 +231,7 @@ These high scores demonstrate the model’s ability to accurately identify entit
231
 
232
  ## 🧠 Training the Model
233
 
234
- Fine-tune the `boltuix/bert-mini` model on the `boltuix/conll2025-ner` dataset to replicate or extend `EntityBERT-NER`. Below is a training script:
235
 
236
  ```python
237
  # Install dependencies
@@ -296,7 +296,7 @@ model = AutoModelForTokenClassification.from_pretrained("boltuix/bert-mini", num
296
 
297
  # Training arguments
298
  args = TrainingArguments(
299
- output_dir="boltuix/entitybert-ner",
300
  eval_strategy="epoch",
301
  learning_rate=2e-5,
302
  per_device_train_batch_size=16,
@@ -347,8 +347,8 @@ trainer = Trainer(
347
  trainer.train()
348
 
349
  # Save model
350
- trainer.save_model("boltuix/entitybert-ner")
351
- tokenizer.save_pretrained("boltuix/entitybert-ner")
352
  ```
353
 
354
  ### πŸ› οΈ Tips
@@ -381,7 +381,7 @@ pip install transformers torch pandas pyarrow seqeval
381
  - **Optional**: NVIDIA CUDA for GPU acceleration
382
 
383
  ### Download Instructions 📥
384
- - **Model**: [boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)
385
  - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner)
386
  - Load with Hugging Face `datasets` or pandas.
387
 
@@ -394,7 +394,7 @@ Evaluate the model on custom data:
394
  from transformers import pipeline
395
 
396
  # Load NER pipeline
397
- nlp = pipeline("token-classification", model="boltuix/EntityBERT-NER")
398
 
399
  # Test data
400
  text = "Book a Lyft from Metropolis on December 1, 2025, contact [email protected]."
@@ -468,7 +468,7 @@ plt.show()
468
  ## βš–οΈ Comparison to Other Models
469
  | Model | Dataset | Parameters | F1 Score | Size |
470
  |----------------------|--------------------|------------|----------|--------|
471
- | **EntityBERT-NER** | conll2025-ner | ~11M | 0.89 | ~50 MB |
472
  | BERT-base-NER | CoNLL-2003 | ~110M | ~0.89 | ~400 MB|
473
  | DistilBERT-NER | CoNLL-2003 | ~66M | ~0.85 | ~200 MB|
474
 
@@ -481,7 +481,7 @@ plt.show()
481
 
482
  ## 🌐 Community and Support
483
  - πŸ“ Explore: [Hugging Face Community](https://huggingface.co/community)
484
- - πŸ› οΈ Contribute: [boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)
485
  - πŸ’¬ Discuss: [Hugging Face Forums](https://huggingface.co/discussions)
486
  - πŸ“š Learn: [Transformers Docs](https://huggingface.co/docs/transformers)
487
  - πŸ“§ Contact: Boltuix at [[email protected]](mailto:[email protected])
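
Every removed line above carries the old name in some casing (`EntityBERT-NER` or `entitybert-ner`), while the banner URL text `EntityBERT+NER+Model` is left alone. A rename of this shape could be applied and checked mechanically. The following is a minimal sketch under that assumption, not the author's actual workflow; `rename_model` and its parameters are hypothetical names introduced for illustration.

```python
import re
from typing import Tuple

# Matches the old model name in any casing, e.g. "EntityBERT-NER" or "entitybert-ner".
# The banner text "EntityBERT+NER+Model" uses "+" and is therefore not matched.
OLD_NAME = re.compile(r"entitybert-ner", re.IGNORECASE)


def rename_model(text: str, new_name: str = "EntityBERT") -> Tuple[str, int]:
    """Replace the old model name line by line.

    Returns the rewritten text and the number of lines that changed,
    which can be compared against the diff summary for the file.
    """
    changed_lines = 0
    out = []
    for line in text.splitlines(keepends=True):
        new_line = OLD_NAME.sub(new_name, line)
        if new_line != line:
            changed_lines += 1
        out.append(new_line)
    return "".join(out), changed_lines


if __name__ == "__main__":
    sample = (
        'output_dir="boltuix/entitybert-ner",\n'
        "[boltuix/EntityBERT-NER](https://huggingface.co/boltuix/EntityBERT-NER)\n"
        "![Banner](https://via.placeholder.com/1200x400.png?text=EntityBERT+NER+Model)\n"
    )
    new_text, n = rename_model(sample)
    print(n)
    print(new_text)
```

Counting changed lines rather than individual matches keeps the number comparable to a per-line diff summary, since a single line can contain the name more than once.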
 
44
 
45
  ![Banner](https://via.placeholder.com/1200x400.png?text=EntityBERT+NER+Model)
46
 
47
+ # 🌟 EntityBERT Model 🌟
48
 
49
  ## 🚀 Model Details
50
 
51
  ### 🌈 Description
52
+ The `boltuix/EntityBERT` model is a fine-tuned transformer for **Named Entity Recognition (NER)**, built on the lightweight `boltuix/bert-mini` base model. It excels at identifying 36 entity types, including people, locations, organizations, dates, times, phone numbers, emails, URLs, and more, in English text. Designed for efficiency and high accuracy, it’s perfect for real-time applications like information extraction, chatbots, and knowledge graph construction across domains such as travel, medical, logistics, and education.
53
 
54
  - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner) (~143,709 entries, 6.38 MB)
55
  - **Entity Types**: 36 NER tags (18 entity categories with B-/I- tags + O)
 
68
  - **Parameters**: ~11M
69
 
70
  ### 🔗 Links
71
+ - **Model Repository**: [boltuix/EntityBERT](https://huggingface.co/boltuix/EntityBERT)
72
  - **Dataset**: [boltuix/conll2025-ner](#download-instructions)
73
  - **Hugging Face Docs**: [Transformers](https://huggingface.co/docs/transformers)
74
  - **Demo**: Available at [boltuix.github.io/demo](https://boltuix.github.io/demo) (coming soon)
 
102
  from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline
103
 
104
  # Load model and tokenizer
105
+ tokenizer = AutoTokenizer.from_pretrained("boltuix/EntityBERT")
106
+ model = AutoModelForTokenClassification.from_pretrained("boltuix/EntityBERT")
107
 
108
  # Create NER pipeline
109
  nlp = pipeline("token-classification", model=model, tokenizer=tokenizer)
 
231
 
232
  ## 🧠 Training the Model
233
 
234
+ Fine-tune the `boltuix/bert-mini` model on the `boltuix/conll2025-ner` dataset to replicate or extend `EntityBERT`. Below is a training script:
235
 
236
  ```python
237
  # Install dependencies
 
296
 
297
  # Training arguments
298
  args = TrainingArguments(
299
+ output_dir="boltuix/EntityBERT",
300
  eval_strategy="epoch",
301
  learning_rate=2e-5,
302
  per_device_train_batch_size=16,
 
347
  trainer.train()
348
 
349
  # Save model
350
+ trainer.save_model("boltuix/EntityBERT")
351
+ tokenizer.save_pretrained("boltuix/EntityBERT")
352
  ```
353
 
354
  ### πŸ› οΈ Tips
 
381
  - **Optional**: NVIDIA CUDA for GPU acceleration
382
 
383
  ### Download Instructions 📥
384
+ - **Model**: [boltuix/EntityBERT](https://huggingface.co/boltuix/EntityBERT)
385
  - **Dataset**: [boltuix/conll2025-ner](https://huggingface.co/datasets/boltuix/conll2025-ner)
386
  - Load with Hugging Face `datasets` or pandas.
387
 
 
394
  from transformers import pipeline
395
 
396
  # Load NER pipeline
397
+ nlp = pipeline("token-classification", model="boltuix/EntityBERT")
398
 
399
  # Test data
400
  text = "Book a Lyft from Metropolis on December 1, 2025, contact [email protected]."
 
468
  ## βš–οΈ Comparison to Other Models
469
  | Model | Dataset | Parameters | F1 Score | Size |
470
  |----------------------|--------------------|------------|----------|--------|
471
+ | **EntityBERT** | conll2025-ner | ~11M | 0.89 | ~50 MB |
472
  | BERT-base-NER | CoNLL-2003 | ~110M | ~0.89 | ~400 MB|
473
  | DistilBERT-NER | CoNLL-2003 | ~66M | ~0.85 | ~200 MB|
474
 
 
481
 
482
  ## 🌐 Community and Support
483
  - πŸ“ Explore: [Hugging Face Community](https://huggingface.co/community)
484
+ - πŸ› οΈ Contribute: [boltuix/EntityBERT](https://huggingface.co/boltuix/EntityBERT)
485
  - πŸ’¬ Discuss: [Hugging Face Forums](https://huggingface.co/discussions)
486
  - πŸ“š Learn: [Transformers Docs](https://huggingface.co/docs/transformers)
487
  - πŸ“§ Contact: Boltuix at [[email protected]](mailto:[email protected])
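
The README describes the label set as B-/I- tags per entity category plus `O`. As an illustration of how such token-level tags can be merged into entity spans after inference, here is a minimal pure-Python sketch. It does not call the model; `bio_to_spans` is a hypothetical helper, and the tag names in the usage example (`ORG`, `GPE`, `DATE`) are assumptions for illustration rather than the confirmed label set.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge B-/I- tagged tokens into (entity_type, text) spans.

    A stray I- tag (one that does not continue the current entity type)
    is treated as the start of a new span.
    """
    spans: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span collected so far, if any.
        if current_tokens and current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            flush()
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            current_tokens.append(token)
        else:  # "O" or any unrecognised tag closes the current span
            flush()
            current_type, current_tokens = None, []
    flush()
    return spans


if __name__ == "__main__":
    tokens = ["Book", "a", "Lyft", "from", "Metropolis", "on", "December", "1"]
    tags = ["O", "O", "B-ORG", "O", "B-GPE", "O", "B-DATE", "I-DATE"]
    print(bio_to_spans(tokens, tags))
```

In practice the `transformers` pipeline can perform a similar grouping itself through its `aggregation_strategy` argument; the sketch only makes the B-/I-/O convention concrete.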