---
library_name: transformers
tags:
  - custom_generate
---

## Description
Example repository used to document `generate` from the hub. It is a simplified implementation of greedy decoding.

## Base model
`Qwen/Qwen2.5-0.5B-Instruct`

## Model compatibility
Most models. More specifically, any `transformer` LLM/VLM trained for causal language modeling.

## Additional Arguments
`left_padding` (`int`, *optional*): number of padding tokens to add before the provided input

## Output Type changes
(none)

## Example usage

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct", device_map="auto")

inputs = tokenizer(["The quick brown"], return_tensors="pt").to(model.device)
# There is a print message hardcoded in the custom generation method
gen_out = model.generate(**inputs, left_padding=5, custom_generate="transformers-community/custom_generate_example", trust_remote_code=True)
print(tokenizer.batch_decode(gen_out))  # don't skip special tokens
#['<|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|>The quick brown fox jumps over the lazy dog.\n\nThe sentence "The quick']
```
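
## Illustrative sketch

To make the idea of greedy decoding with a `left_padding` argument concrete, here is a minimal, dependency-free sketch. It is not the code shipped in this repository: the function `greedy_decode`, the toy scoring function `toy_logits`, and the parameters `max_new_tokens`, `pad_token_id`, and `eos_token_id` are all hypothetical names chosen for illustration. It assumes a "model" is simply a callable that maps a token sequence to a list of next-token scores.

```py
from typing import Callable, List, Optional, Sequence


def greedy_decode(
    next_token_logits: Callable[[Sequence[int]], List[float]],
    input_ids: Sequence[int],
    max_new_tokens: int = 5,
    pad_token_id: int = 0,
    eos_token_id: Optional[int] = None,
    left_padding: Optional[int] = None,
) -> List[int]:
    """Hypothetical greedy decoding loop with optional left padding."""
    tokens = list(input_ids)

    # Prepend `left_padding` pad tokens before the provided input.
    if left_padding is not None:
        if not isinstance(left_padding, int) or left_padding < 0:
            raise ValueError("left_padding must be a non-negative integer")
        tokens = [pad_token_id] * left_padding + tokens

    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        # Greedy step: pick the highest-scoring token id.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        # Stop early once the end-of-sequence token is produced.
        if eos_token_id is not None and next_id == eos_token_id:
            break

    return tokens


def toy_logits(tokens: Sequence[int]) -> List[float]:
    """Toy stand-in for a model (assumption): favors (last token + 1) mod 6."""
    vocab_size = 6
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


print(greedy_decode(toy_logits, [2, 3], max_new_tokens=3, left_padding=2))
# [0, 0, 2, 3, 4, 5, 0]
```

In this sketch the padding tokens appear at the start of the returned sequence, which mirrors the leading `<|endoftext|>` tokens in the example output above. A real implementation would operate on batched tensors and would also need to extend the attention mask to cover the padded positions.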