OpenManus-RL Model

This model was trained with GRPO (Group Relative Policy Optimization) on Qwen/Qwen2.5-3B.

Model Details

  • Training type: GRPO
  • Base model: Qwen/Qwen2.5-3B
  • Number of epochs: 3
  • Batch size: 8
  • Learning rate: 5e-05
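
For context on the training type: GRPO scores each prompt by sampling a group of completions and normalizing their rewards within that group, so no separate value model is needed. A minimal sketch of that group-relative advantage computation (a hypothetical helper for illustration, not code from this repository):

```python
# Hypothetical illustration of GRPO's group-relative advantage:
# rewards for G completions of the same prompt are normalized
# against the group's own mean and standard deviation.
def group_relative_advantages(rewards, eps=1e-8):
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    # eps guards against division by zero when all rewards are equal
    return [(r - mean) / (std + eps) for r in rewards]

# Example: two good and two bad completions in a group of four
advs = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print([round(a, 2) for a in advs])  # → [1.0, -1.0, 1.0, -1.0]
```

Because the normalization is per group, the advantages always sum to (approximately) zero within each group of sampled completions.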

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "openmanus-grpo-qwen"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate text (max_new_tokens bounds the generated continuation,
# unlike the deprecated max_length, which also counts the prompt)
inputs = tokenizer("Hello, I am a language model", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))