
Segue-Qwen3_DeepScaleR-Preview

Segue-Qwen3_DeepScaleR-Preview is an experimental fine-tuned variant of the Qwen3-4B model. It is trained on the DeepScaleR-Preview dataset, a collection of high-quality mathematical reasoning problems, to deliver strong performance on symbolic, mathematical, and logical tasks while keeping computational requirements light.

Key Features

  1. Precision Reasoning with the DeepScaleR-Preview Dataset
     Fine-tuned on approximately 40,000 curated math problem-answer pairs sourced from:

    • AIME (1984–2023)
    • AMC (pre-2023)
    • Omni-MATH

     This enables superior symbolic manipulation and step-by-step logical deduction.
  2. Lightweight Code Understanding
     Capable of interpreting and generating correct code in Python, C++, and other logic-intensive languages, with an emphasis on problem-solving and structured thought.

  3. Structured Output Formatting
     Outputs are designed to be well formatted in Markdown, JSON, LaTeX, or tables, making them ideal for technical documentation, math notebooks, and data workflows.

  4. Instruction-Following Accuracy
     Strong multi-step instruction adherence, particularly in STEM domains, with continuity, factual correctness, and process transparency in reasoning chains.

  5. Multilingual Capabilities
     Supports over 20 languages for mathematical and logical reasoning, technical instruction translation, and cross-lingual academic support.

  6. Efficient 4B Architecture
     Built on the Qwen3-4B base model to balance performance and scalability; runs efficiently on mid-range GPUs while delivering high-accuracy inference.
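To illustrate the structured-output feature above, the sketch below builds a system prompt that asks for strict JSON and validates a reply. The prompt wording, the expected JSON schema, and the sample response are all hypothetical choices for this example, not formats the model is documented to emit.

```python
import json

# Hypothetical system prompt asking the model to answer in strict JSON.
SYSTEM_PROMPT = (
    "You are a precise mathematical assistant. "
    "Reply ONLY with a JSON object of the form "
    '{"steps": [...], "answer": "..."}.'
)

def build_messages(problem: str) -> list[dict]:
    """Assemble the chat messages passed to tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": problem},
    ]

def parse_response(raw: str) -> dict:
    """Check that the model's reply is well-formed JSON with the expected keys."""
    data = json.loads(raw)
    if not {"steps", "answer"} <= data.keys():
        raise ValueError("missing required keys")
    return data

# Simulated model output, for demonstration only.
sample = '{"steps": ["5x - 10 = 3x + 4", "2x = 14"], "answer": "x = 7"}'
print(parse_response(sample)["answer"])
```

Validating replies this way lets downstream tooling fail fast on malformed output instead of silently passing it along.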

Quickstart with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Segue-Qwen3_DeepScaleR-Preview"

# Load the model with automatic dtype selection and device placement.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Solve for x: 5(x - 2) = 3x + 4, showing all steps clearly."

messages = [
    {"role": "system", "content": "You are a precise mathematical assistant trained on the DeepScaleR-Preview dataset."},
    {"role": "user", "content": prompt}
]

# Render the chat messages into the model's prompt format.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated answer remains.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
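For the sample prompt above, the model's step-by-step answer can be checked programmatically. The helpers below are an illustrative sketch, not part of the quickstart: they pull the final "x = ..." value out of a free-form response (the sample response string is invented) and substitute it back into 5(x - 2) = 3x + 4.

```python
import re

def extract_x(response: str):
    """Pull the last 'x = <number>' from a step-by-step solution, if present."""
    matches = re.findall(r"x\s*=\s*(-?\d+(?:\.\d+)?)", response)
    return float(matches[-1]) if matches else None

def check_solution(x: float) -> bool:
    """Verify x against the quickstart equation 5(x - 2) = 3x + 4."""
    return 5 * (x - 2) == 3 * x + 4

# Hypothetical model response, for demonstration only.
sample_response = "5x - 10 = 3x + 4, so 2x = 14, giving x = 7"
x = extract_x(sample_response)
print(x, check_solution(x))
```

A substitution check like this is a cheap guard against arithmetic slips in generated reasoning chains.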

Intended Use

  • Step-by-step mathematical problem solving
  • Symbolic computation and logic derivation
  • Code generation and correction in technical environments
  • Automated LaTeX/Markdown/JSON generation for education and documentation
  • Academic tutoring and educational assistants
  • Multilingual reasoning and translation of structured content
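As a sketch of the automated Markdown generation use case, the hypothetical helper below renders a solved problem into a Markdown snippet with inline LaTeX. The input data structure is an assumption made for this example, not a format the model itself produces.

```python
def solution_to_markdown(problem: str, steps: list[str], answer: str) -> str:
    """Render a worked solution as a Markdown section with a numbered step list."""
    lines = [f"### {problem}", ""]
    # Each step is wrapped in $...$ so it renders as inline LaTeX math.
    lines += [f"{i}. ${step}$" for i, step in enumerate(steps, start=1)]
    lines += ["", f"**Answer:** ${answer}$"]
    return "\n".join(lines)

md = solution_to_markdown(
    "Solve for x: 5(x - 2) = 3x + 4",
    ["5x - 10 = 3x + 4", "2x = 14"],
    "x = 7",
)
print(md)
```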

Limitations

  • Less suitable for open-domain conversation or creative writing
  • Smaller context window compared to large-scale LLMs
  • May be sensitive to token formatting in edge-case symbolic prompts
  • Could underperform on intentionally adversarial logic inputs
