# OpenManus-RL Model

This model was trained with GRPO (Group Relative Policy Optimization) on top of Qwen/Qwen2.5-3B.
## Model Details

- Training type: GRPO (Group Relative Policy Optimization)
- Base model: Qwen/Qwen2.5-3B
- Number of epochs: 3
- Batch size: 8
- Learning rate: 5e-5
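For intuition, GRPO scores each completion relative to the other completions sampled for the same prompt, rather than using a learned value function. A minimal sketch of that group-relative normalization (illustrative only; this is not the training code used for this model, and the example rewards are made up):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each reward against the
    mean and standard deviation of its own sampled group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    # Small epsilon guards against a zero-variance group
    return [(r - mean) / (std + 1e-8) for r in rewards]

# Four completions sampled for one prompt, scored by a reward model
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Completions scoring above the group mean get positive advantages and are reinforced; those below get negative advantages.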
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "openmanus-grpo-qwen"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Generate text
inputs = tokenizer("Hello, I am a language model", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```