---
language:
- fa
metrics:
- f1
- accuracy
- precision
- recall
base_model:
- sbunlp/fabert
pipeline_tag: text-classification
tags:
- code
---
# **Fine-Tuned FaBERT Model for Formality Classification**

This repository contains a fine-tuned version of **FaBERT**, a pre-trained transformer language model, adapted here for **formality classification**. The fine-tuned model classifies text as **formal** or **informal**, making it useful for applications such as content moderation, social media monitoring, and customer support automation.

## **Model Overview**
- **Architecture:** Built on **FaBERT**, a BERT-based transformer language model.
- **Task:** **Formality Classification** – distinguishing between formal and informal language in text.
- **Fine-Tuning:** The model has been fine-tuned on a custom dataset containing a variety of formal and informal text.

## **Key Features**
- **Multilingual Support:** This model is capable of classifying text in multiple languages, ensuring robustness in diverse linguistic contexts.
- **High Performance:** Fine-tuned to provide accurate predictions for formal vs. informal text classification.
- **Efficient for Deployment:** Optimized for real-time use in environments like social media platforms, content moderation tools, and communication systems.

## **How to Use the Model**

You can use this model in your Python code with the Hugging Face `transformers` library and PyTorch. The following code snippet demonstrates how to tokenize text, make predictions, and classify whether the text is formal or informal.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the pre-trained tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("faimlab/fabert_formality_classifier")
model = AutoModelForSequenceClassification.from_pretrained("faimlab/fabert_formality_classifier")

# Run the model on GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()  # put the model in inference mode (disables dropout)

# Example input text
input_text = "Please find attached the report for your review."

# Tokenize the input
inputs = tokenizer(input_text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Move the tokenized inputs to the same device as the model
inputs = {key: value.to(device) for key, value in inputs.items()}

# Make predictions
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Get the predicted class index (the formal/informal mapping comes from the model's id2label config)
predicted_label = logits.argmax(dim=1).item()
print(f"Predicted Label: {predicted_label}")