
Model Card for eagle0504/fine-tuned-DeepSeek-R1-Distill-Qwen-1.5B-openai-gsm8k-enhanced-v1

This model is a fine-tuned version of deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, adapted with LoRA (Low-Rank Adaptation) on an enhanced version of the GSM8K dataset. It is intended for English causal language modeling tasks, with a focus on solving math word problems.


🧠 What’s Inside


🛠️ Fine-Tuning Details

This model was fine-tuned with the transformers and peft libraries, using LoRA as a Parameter-Efficient Fine-Tuning (PEFT) technique for memory efficiency and faster convergence.

LoRA Configuration

from transformers import AutoModelForCausalLM
from peft import get_peft_model, LoraConfig, TaskType

# Load the base model before attaching LoRA adapters
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
)

# Wrap the base model so only the low-rank adapter weights are trained
model = get_peft_model(model, lora_config)

  • r=8: Low-rank dimensionality
  • lora_alpha=16: Scaling factor
  • lora_dropout=0.1: Regularization
  • target_modules=["q_proj", "v_proj"]: Applies LoRA to attention layers
  • bias="none": Bias parameters are kept frozen (not trained)
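
As a quick sanity check (a minimal sketch, assuming the PEFT-wrapped model from the snippet above), peft can report how small the trainable parameter count becomes:

# Print the number of trainable (LoRA) parameters versus total parameters
model.print_trainable_parameters()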

TrainingArguments Configuration

from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    warmup_steps=100,
    num_train_epochs=3,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
    output_dir="outputs",
    report_to="none",
    remove_unused_columns=False,
)

  • Effective batch size: 1 × 8 = 8 (via gradient accumulation)
  • Epochs: 3
  • Learning rate: 2e-4
  • Mixed precision: Enabled (fp16=True) for speed and efficiency
  • Logging: Every 10 steps
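
The sketch below shows one way these pieces could be wired together with the Trainer API. It is illustrative only: the actual training used an enhanced GSM8K variant, so the dataset name (openai/gsm8k), field names, and tokenization shown here are assumptions rather than the exact training script.

from datasets import load_dataset
from transformers import AutoTokenizer, Trainer, DataCollatorForLanguageModeling

# Stand-in dataset: the original GSM8K; the real run used an enhanced variant
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
dataset = load_dataset("openai/gsm8k", "main", split="train")

def tokenize(example):
    # Join question and answer into one causal-LM training string
    text = example["question"] + "\n" + example["answer"]
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,            # PEFT-wrapped model from the LoRA section
    args=training_args,     # TrainingArguments defined above
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()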

💡 Intended Use

This model is intended for educational and research purposes in math word problem solving, reasoning tasks, and language modeling. It can be used as-is or further fine-tuned on domain-specific datasets.
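
A minimal usage sketch (assuming the LoRA adapter is loaded from this repository on top of the base model; the prompt is just an illustrative GSM8K-style question):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "eagle0504/fine-tuned-DeepSeek-R1-Distill-Qwen-1.5B-openai-gsm8k-enhanced-v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id)
# Attach the fine-tuned LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "A baker sells 12 cupcakes per tray and bakes 7 trays. How many cupcakes does the baker sell in total?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))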


📜 License

This model is released under the MIT License.


🚧 Limitations and Biases

Although the fine-tuning procedure is straightforward, this model inherits limitations from both the GSM8K dataset and the DeepSeek-R1-Distill-Qwen-1.5B base model, including potential biases in language and reasoning patterns. Caution is advised when using the model in high-stakes or real-world applications.
