Make model config compatible with Hugging Face MiniMax implementation

#39
by geetu040 - opened

This PR updates the model configuration to be compatible with the upstream Hugging Face transformers implementation of the MiniMax architecture, introduced in transformers#35831. With these changes, the model can now be loaded without relying on trust_remote_code=True.

This enables easier and safer usage via the standard AutoModelForCausalLM interface:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model with the native Transformers implementation (no trust_remote_code needed)
model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-Text-01-hf",
    revision="refs/pr/39",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(
    "MiniMaxAI/MiniMax-Text-01-hf",
    revision="refs/pr/39"
)

prompt = "Hello!"
messages = [
    {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant created by MiniMax based on MiniMax-Text-01 model."}]},
    {"role": "user", "content": [{"type": "text", "text": prompt}]},
]

# Apply the chat template, append the assistant turn marker, and move the input IDs to the model's device
model_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

generated_ids = model.generate(model_inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])
geetu040 changed pull request title from Update for huggingface/transformers compatible to Make model config compatible with Hugging Face MiniMax implementation

Hi @geetu040, thanks a lot for opening the PR to add Transformers support! We’ve hit a couple of config-parameter clashes in the main repo, so we’re spinning up a dedicated HF version for this integration. Could you please retarget your PR to MiniMax-Text-01-hf instead? Appreciate your help!

sriting changed pull request status to merged

FYI, this is a breaking change for existing vLLM users who loaded this model with the old format: https://github.com/vllm-project/vllm/issues/20198

We may want to keep this format around and just point HF users to the new "-hf" version.
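
For illustration, a minimal sketch of the two loading paths this would leave users with, assuming the original MiniMaxAI/MiniMax-Text-01 repo keeps the remote-code format while the "-hf" repo carries the Transformers-native config:

from transformers import AutoModelForCausalLM

# Old format: custom modeling code shipped with the repo, so remote code must be trusted
legacy_model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-Text-01",
    trust_remote_code=True,
    device_map="auto",
)

# New "-hf" format: native Transformers MiniMax implementation, no remote code required
hf_model = AutoModelForCausalLM.from_pretrained(
    "MiniMaxAI/MiniMax-Text-01-hf",
    device_map="auto",
)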
