---
model-index:
  - name: TSLAM-Mini-2B
    results:
      - task:
          type: domain-modeling
        dataset:
          type: telecom-eval-suite
          name: Telecom Internal Benchmark
        metrics:
          - type: accuracy
            value: 88.2
      - task:
          type: open-domain-qa
        dataset:
          type: bigbench-hard
          name: BIG-Bench Hard
        metrics:
          - type: accuracy
            value: 70.4
      - task:
          type: question-answering
        dataset:
          type: mmlu
          name: MMLU
        metrics:
          - type: accuracy
            value: 67.3
      - task:
          type: commonsense-reasoning
        dataset:
          type: arc
          name: ARC Challenge
        metrics:
          - type: accuracy
            value: 83.7
      - task:
          type: boolean-classification
        dataset:
          type: boolq
          name: BoolQ
        metrics:
          - type: accuracy
            value: 81.2
      - task:
          type: question-answering
        dataset:
          type: gpqa
          name: GPQA
        metrics:
          - type: accuracy
            value: 25.2
      - task:
          type: commonsense-reasoning
        dataset:
          type: hellaswag
          name: HellaSwag
        metrics:
          - type: accuracy
            value: 69.1
      - task:
          type: open-book-qa
        dataset:
          type: openbookqa
          name: OpenBookQA
        metrics:
          - type: accuracy
            value: 79.2
      - task:
          type: physical-reasoning
        dataset:
          type: piqa
          name: PIQA
        metrics:
          - type: accuracy
            value: 77.6
      - task:
          type: social-intelligence
        dataset:
          type: socialiqa
          name: Social IQa
        metrics:
          - type: accuracy
            value: 72.5
      - task:
          type: truthfulness
        dataset:
          type: truthfulqa
          name: TruthfulQA
        metrics:
          - type: accuracy
            value: 66.4
      - task:
          type: winograd-schema
        dataset:
          type: winogrande
          name: WinoGrande
        metrics:
          - type: accuracy
            value: 67.0
      - task:
          type: question-answering
        dataset:
          type: mmlu
          name: MMLU (Multilingual)
        metrics:
          - type: accuracy
            value: 49.3
      - task:
          type: mathematical-reasoning
        dataset:
          type: mgsm
          name: MGSM
        metrics:
          - type: accuracy
            value: 63.9
      - task:
          type: mathematical-reasoning
        dataset:
          type: gsm8k
          name: GSM8K
        metrics:
          - type: accuracy
            value: 88.6
      - task:
          type: mathematical-reasoning
        dataset:
          type: math
          name: MATH
        metrics:
          - type: accuracy
            value: 64.0
extra_gated_prompt: "Please answer the questions below to gain access to the model"
extra_gated_fields:
  Company: text
  Full Name: text
  Email: text
  I want to use this model for:
    type: select
    options:
      - Research
      - Education
      - Commercial
      - label: Other
        value: other
---

# TSLAM-Mini-2B

**Base Model**: [`microsoft/Phi-4-mini-instruct`](https://huggingface.co/microsoft/Phi-4-mini-instruct)

**License**: MIT

## Overview

**TSLAM-Mini-2B** is a domain-adapted language model fine-tuned on 100,000 telecom-specific examples, designed to emulate the intelligence and conversational expertise of a Telecom Subject Matter Expert (SME). Built on top of the Phi-4-mini-instruct foundation, TSLAM-Mini-2B is optimized for real-time, industry-grade interactions across key telecom scenarios, including:

- SME-style responses in customer support and internal queries
- Network configuration, diagnostics, and troubleshooting workflows
- Device provisioning and service activation dialogues
- Operational support for field and NOC teams
- Intelligent retrieval and summarization of telecom-specific documentation

This fine-tuning strategy enables TSLAM-Mini-2B to reason like an SME, offering accurate, context-aware responses that align with real-world telecom operations.

While this model performs strongly on telecom-specific use cases, enterprises that require specialized, production-grade capabilities can contact us at support@netoai.ai to learn about our commercial enterprise models.

## Key Features

- **Telecom-Tuned**: Fine-tuned on domain-specific conversations, logs, and structured dialogues.
- **Instruction-Following**: Retains Phi-4’s compact instruction-tuned behavior while adapting to industry-specific patterns.
- **Real-Time Scenarios**: Performs well in use cases that require contextual understanding of real-world telecom operations.

## Intended Use

TSLAM-Mini-2B excels in the following areas:

- **Customer Support Agents** (AI copilots or chatbots)
- **Network Operations Tools** that process commands or log queries
- **Internal Assistants** for engineers and field technicians
- **Telecom Knowledge Graphs & RAG Pipelines**

## Model Details

| Property         | Value                            |
|------------------|----------------------------------|
| Base Model       | `microsoft/Phi-4-mini-instruct`  |
| Fine-tuning Data | 100k telecom domain examples     |
| Training Method  | Supervised fine-tuning (SFT)     |
| License          | MIT                              |

## Benchmark Results

| **Benchmark** | **TSLAM-Mini-2B** | Phi-3.5-mini-Ins | Llama-3.2-3B-Ins | Mistral-3B | Qwen2.5-3B-Ins | Qwen2.5-7B-Ins | Mistral-8B-2410 | Llama-3.1-8B-Ins | Llama-3.1-Tulu-3-8B | Gemma2-9B-Ins | GPT-4o-mini-2024-07-18 |
|-------------------------------|-----------------|------------------|------------------|------------|----------------|----------------|------------------|-------------------|----------------------|---------------|-------------------------|
| **Popular aggregated benchmark** | | | | | | | | | | | |
| Arena Hard | 32.8 | 34.4 | 17.0 | 26.9 | 32.0 | 55.5 | 37.3 | 25.7 | 42.7 | 43.7 | 53.7 |
| BigBench Hard (0-shot, CoT) | 70.4 | 63.1 | 55.4 | 51.2 | 56.2 | 72.4 | 53.3 | 63.4 | 55.5 | 65.7 | 80.4 |
| MMLU (5-shot) | 67.3 | 65.5 | 61.8 | 60.8 | 65.0 | 72.6 | 63.0 | 68.1 | 65.0 | 71.3 | 77.2 |
| MMLU-Pro (0-shot, CoT) | 52.8 | 47.4 | 39.2 | 35.3 | 44.7 | 56.2 | 36.6 | 44.0 | 40.9 | 50.1 | 62.8 |
| **Reasoning** | | | | | | | | | | | |
| ARC Challenge (10-shot) | 83.7 | 84.6 | 76.1 | 80.3 | 82.6 | 90.1 | 82.7 | 83.1 | 79.4 | 89.8 | 93.5 |
| BoolQ (2-shot) | 81.2 | 77.7 | 71.4 | 79.4 | 65.4 | 80.0 | 80.5 | 82.8 | 79.3 | 85.7 | 88.7 |
| GPQA (0-shot, CoT) | 25.2 | 26.6 | 24.3 | 24.4 | 23.4 | 30.6 | 26.3 | 26.3 | 29.9 | 39.1 | 41.1 |
| HellaSwag (5-shot) | 69.1 | 72.2 | 77.2 | 74.6 | 74.6 | 80.0 | 73.5 | 72.8 | 80.9 | 87.1 | 88.7 |
| OpenBookQA (10-shot) | 79.2 | 81.2 | 72.6 | 79.8 | 79.3 | 82.6 | 80.2 | 84.8 | 79.8 | 90.0 | 90.0 |
| PIQA (5-shot) | 77.6 | 78.2 | 68.2 | 73.2 | 72.6 | 76.2 | 81.2 | 83.2 | 78.3 | 83.7 | 88.7 |
| Social IQA (5-shot) | 72.5 | 75.1 | 68.3 | 73.9 | 75.3 | 75.3 | 77.6 | 71.8 | 73.4 | 74.7 | 82.9 |
| TruthfulQA (MC2) (10-shot) | 66.4 | 65.2 | 59.2 | 62.9 | 64.3 | 69.4 | 63.0 | 69.2 | 64.1 | 76.6 | 78.2 |
| Winogrande (5-shot) | 67.0 | 72.2 | 53.2 | 59.8 | 63.3 | 71.1 | 63.1 | 64.7 | 65.4 | 74.0 | 76.9 |
| **Multilingual** | | | | | | | | | | | |
| Multilingual MMLU (5-shot) | 49.3 | 51.8 | 48.1 | 46.4 | 55.9 | 64.4 | 53.7 | 56.2 | 54.5 | 63.8 | 72.9 |
| MGSM (0-shot, CoT) | 63.9 | 49.6 | 44.6 | 44.6 | 53.5 | 64.5 | 56.7 | 56.7 | 58.6 | 75.1 | 81.7 |
| **Math** | | | | | | | | | | | |
| GSM8K (8-shot, CoT) | 88.6 | 76.9 | 75.6 | 80.1 | 80.6 | 88.7 | 81.9 | 82.4 | 84.3 | 84.9 | 91.3 |
| MATH (0-shot, CoT) | 64.0 | 49.8 | 46.7 | 41.8 | 61.7 | 60.4 | 41.6 | 47.6 | 46.1 | 51.3 | 70.2 |
| **Telecom (domain-specific)** | **88.2** | 52.1 | 47.6 | 49.3 | 58.0 | 61.5 | 54.9 | 57.3 | 59.0 | 64.1 | 70.3 |
| **Overall** | **63.5** | 60.5 | 56.2 | 56.9 | 60.1 | 67.9 | 60.2 | 62.3 | 60.9 | 65.0 | **75.5** |

## Example

```text
**User**: How do I reconfigure a 5G core node remotely?
**Model**: To reconfigure a 5G core node remotely, ensure you have SSH access enabled and the necessary configuration scripts preloaded. From your NOC terminal, run the secure update command with the node's IP and authentication key...
```

## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load tokenizer and model
model_name = "NetoAISolutions/TSLAM-Mini-2B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Define the input using the Phi-style chat template
def build_prompt(user_input):
    system = "<|system|>\nYou are a helpful assistant.<|end|>\n"
    user = f"<|user|>\n{user_input}<|end|>\n"
    assistant = "<|assistant|>\n"  # Start assistant response
    return system + user + assistant

# Example input
user_query = "How do I activate VoLTE on a user's device?"
prompt = build_prompt(user_query)

# Tokenize and generate
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|end|>")
)

# Decode output
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Acknowledgements

- Built on top of Microsoft’s [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct)
- Data curation and tuning by [NetoAISolutions](https://netoai.ai/)

---
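
## Alternative: Using the Chat Template

If the tokenizer published with this checkpoint inherits the chat template from `microsoft/Phi-4-mini-instruct` (an assumption, not something this card confirms), the manual `build_prompt` helper in **How to Use** can be replaced with `tokenizer.apply_chat_template`, which keeps the prompt format in sync with the tokenizer configuration. A minimal sketch under that assumption:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "NetoAISolutions/TSLAM-Mini-2B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Chat-style messages; adjust the system prompt to the telecom persona you need.
messages = [
    {"role": "system", "content": "You are a helpful telecom subject matter expert."},
    {"role": "user", "content": "How do I activate VoLTE on a user's device?"},
]

# ASSUMPTION: the tokenizer ships a Phi-4-style chat template.
# apply_chat_template formats the messages and appends the assistant turn marker.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)

outputs = model.generate(
    input_ids,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)

# Decode only the newly generated tokens, skipping the echoed prompt.
response = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```

If the chat template is absent or differs, fall back to the explicit `<|system|>` / `<|user|>` / `<|assistant|>` / `<|end|>` construction shown in **How to Use**.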