# Model Card for eagle0504/fine-tuned-DeepSeek-R1-Distill-Qwen-1.5B-openai-gsm8k-enhanced-v1
This model is a fine-tuned version of `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`, adapted with LoRA (Low-Rank Adaptation) on an enhanced version of the GSM8K dataset. It is designed for English causal language modeling tasks, with a focus on solving math word problems.
## 🧠 What’s Inside
- Base Model: `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
- LoRA Adaptation: Applied to the `q_proj` and `v_proj` layers for efficient fine-tuning.
- Target Task: Causal language modeling (`TaskType.CAUSAL_LM`) on math reasoning problems.
- Training Dataset: `eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1`
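The training dataset can be inspected directly from the Hub before building any pipeline. Below is a minimal sketch using the `datasets` library; the split and column layout is printed rather than assumed, since the card does not document the exact schema.

```python
from datasets import load_dataset

# Pull the enhanced GSM8K dataset used for fine-tuning from the Hugging Face Hub.
dataset = load_dataset(
    "eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1"
)

# Report the available splits, columns, and sizes instead of assuming a schema.
print(dataset)
for split_name, split in dataset.items():
    print(split_name, split.column_names, len(split))
```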
## 🛠️ Fine-Tuning Details

This model was fine-tuned using the `transformers` and `peft` libraries. The training leveraged the Parameter-Efficient Fine-Tuning (PEFT) technique with LoRA for memory efficiency and faster convergence.
### LoRA Configuration

```python
from peft import get_peft_model, LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
)

model = get_peft_model(model, lora_config)
```
- `r=8`: Low-rank dimensionality
- `lora_alpha=16`: Scaling factor
- `lora_dropout=0.1`: Regularization
- `target_modules=["q_proj", "v_proj"]`: Applies LoRA to the attention projection layers
- `bias="none"`: No bias added to the adapted layers
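For context, here is a minimal sketch of how the base model might be loaded and wrapped with this configuration. The exact loading options used during training are not documented in this card, so the `float16` dtype below is an illustrative assumption.

```python
import torch
from peft import get_peft_model, LoraConfig, TaskType
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# Load the base model and tokenizer; float16 is an assumption, not a documented choice.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

# Same LoRA configuration as shown above.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
)

# Wrap the base model so that only the low-rank adapter weights are trainable.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

With `r=8` applied only to `q_proj` and `v_proj`, the trainable adapter weights are a small fraction of the 1.5B base parameters.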
### TrainingArguments Configuration

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    warmup_steps=100,
    num_train_epochs=3,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
    output_dir="outputs",
    report_to="none",
    remove_unused_columns=False,
)
```
- Effective batch size: 1 × 8 = 8 (via gradient accumulation)
- Epochs: 3
- Learning rate: 2e-4
- Mixed precision: Enabled (`fp16=True`) for speed and efficiency
- Logging: Every 10 steps
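Below is a hedged sketch of how these arguments might be wired into a `Trainer` for causal language modeling. The actual training script is not part of this card, so the tokenization scheme and the `question`/`answer` column names are assumptions borrowed from the original GSM8K layout; adjust them to the enhanced dataset's real schema.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# In the actual run this would be the PEFT-wrapped model from the LoRA section above;
# a plain base model is loaded here only to keep the sketch self-contained.
model = AutoModelForCausalLM.from_pretrained(base_id)

dataset = load_dataset(
    "eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1"
)

def tokenize(batch):
    # Assumption: "question"/"answer" columns as in the original GSM8K; check the real schema.
    texts = [q + "\n" + a for q, a in zip(batch["question"], batch["answer"])]
    return tokenizer(texts, truncation=True, max_length=512)

tokenized = dataset["train"].map(
    tokenize, batched=True, remove_columns=dataset["train"].column_names
)

training_args = TrainingArguments(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    warmup_steps=100,
    num_train_epochs=3,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
    output_dir="outputs",
    report_to="none",
    remove_unused_columns=False,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    # mlm=False selects causal (next-token) language modeling labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```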
## 💡 Intended Use
This model is intended for educational and research purposes in math word problem solving, reasoning tasks, and language modeling. It can be used as-is or further fine-tuned on domain-specific datasets.
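For example, assuming this repository hosts the LoRA adapter weights (loaded on top of the base model with `peft`), inference could look like the following sketch; the prompt is an illustrative math word problem, not the exact format used during training.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
adapter_id = "eagle0504/fine-tuned-DeepSeek-R1-Distill-Qwen-1.5B-openai-gsm8k-enhanced-v1"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the fine-tuned LoRA adapters to the base model.
model = PeftModel.from_pretrained(model, adapter_id)

# Example math word problem; prompt formatting here is an assumption.
prompt = (
    "A bakery sells 24 muffins in the morning and twice as many in the afternoon. "
    "How many muffins does it sell in total?"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```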
## 📜 License
This model is released under the MIT License.
## 🚧 Limitations and Biases

This model inherits limitations from both the underlying dataset (GSM8K and its enhanced derivative) and the DeepSeek base model, including potential biases in language and reasoning patterns. It may produce incorrect reasoning or answers, and caution is advised when using it in high-stakes or real-world applications.