---
license: apache-2.0
language: en
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- distilbert
- log-analysis
- openstack
- fine-tuned
widget:
- text: "Instance 1234 has failed to connect to the network"
---

# INFRNCE BERT Log Classification Model

This is a fine-tuned DistilBERT model that classifies OpenStack Nova log entries into six operational categories.

## Model Details

- **Base Model**: distilbert-base-uncased
- **Task**: Multi-class text classification
- **Number of Labels**: 6
- **Domain**: OpenStack log analysis

## Labels

The model classifies logs into the following categories:

- Error_Handling
- Instance_Management
- Network_Operations
- Resource_Management
- Scheduler_Operations
- System_Operations

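This card does not state which integer index corresponds to which label. The sketch below assumes, purely for illustration, that indices follow the alphabetical order of the categories listed above. With the real checkpoint, `model.config.id2label` is the authoritative mapping; `check_mapping` is a hypothetical helper for comparing the two.

```python
# Hypothetical label mapping: alphabetical index order is an assumption,
# not something this card confirms. Prefer model.config.id2label in practice.
LABELS = [
    "Error_Handling",
    "Instance_Management",
    "Network_Operations",
    "Resource_Management",
    "Scheduler_Operations",
    "System_Operations",
]

id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}


def check_mapping(config_id2label):
    """Return True if a loaded config's mapping matches the assumed one.

    Keys are cast to int because a config.json file stores them as strings.
    """
    return {int(k): v for k, v in config_id2label.items()} == id2label
```

If `check_mapping(model.config.id2label)` returns `False`, the assumed order is wrong and the mapping from the checkpoint should be used as-is.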
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("your-username/infrnce-bert-log-classifier")
model = AutoModelForSequenceClassification.from_pretrained("your-username/infrnce-bert-log-classifier")
model.eval()  # inference mode (disables dropout)

# Example usage
log_text = "Your OpenStack log entry here"
inputs = tokenizer(log_text, return_tensors="pt", truncation=True, padding=True, max_length=512)

# Run the forward pass without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to probabilities and pick the most likely class
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
predicted_class_id = predictions.argmax(dim=-1).item()

print(f"Predicted class: {model.config.id2label[predicted_class_id]}")
```

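To make the post-processing step in the snippet above concrete, the following sketch reproduces softmax and argmax in plain Python, without torch. The logits are made-up values rather than real model output, and `top_prediction` is a hypothetical helper name introduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits):
    """Return (class_id, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Made-up logits for a single log line with six classes (not real model output).
example_logits = [0.1, 0.3, 2.5, -0.4, 0.0, 0.2]
class_id, confidence = top_prediction(example_logits)
print(f"class_id={class_id}, confidence={confidence:.2f}")
```

Returning the probability alongside the class index makes it possible to flag low-confidence predictions for manual review. The index still has to be translated to a label name through `model.config.id2label`.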
## Training Data

The model was trained on a curated dataset of OpenStack Nova logs, annotated using both regex-based classification and semantic clustering.

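This card does not publish the regex rules used to annotate the training data. As a rough illustration of what regex-based classification of log lines can look like, the sketch below applies a few hypothetical patterns and returns the first matching category. `RULES` and `regex_label` are invented for this example and are not the author's rules.

```python
import re

# Hypothetical rules for illustration only; the actual patterns used to
# annotate the training data are not described in this card.
RULES = [
    ("Network_Operations", re.compile(r"\b(network|port|vif)\b", re.IGNORECASE)),
    ("Scheduler_Operations", re.compile(r"\b(scheduler|filter)\b", re.IGNORECASE)),
    ("Error_Handling", re.compile(r"\b(error|exception|traceback)\b", re.IGNORECASE)),
]


def regex_label(line):
    """Return the first category whose pattern matches, or None if none match."""
    for label, pattern in RULES:
        if pattern.search(line):
            return label
    return None


print(regex_label("Instance 1234 has failed to connect to the network"))
```

Lines that match no rule return `None`. Such lines could be handed to a clustering step, though how the two annotation sources were actually combined is not specified here.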
## Performance

The model was trained with controlled accuracy to achieve optimal performance on log classification tasks.