---
library_name: transformers
tags:
  - custom_generate
---

## Description
Example repository used to document custom `generate` methods loaded from the Hub. It contains a simplified implementation of greedy decoding.
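
For reference, greedy decoding picks the single highest-probability token at each step and appends it to the sequence. A minimal conceptual sketch of the idea (not this repository's exact implementation):

```py
import torch

def greedy_decode(model, input_ids, max_new_tokens=20):
    # At every step, append the most likely next token.
    for _ in range(max_new_tokens):
        logits = model(input_ids).logits                             # (batch, seq, vocab)
        next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)   # greedy pick
        input_ids = torch.cat([input_ids, next_token], dim=-1)
    return input_ids
```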

## Base model
`Qwen/Qwen2.5-0.5B-Instruct`

## Model compatibility
Most models. More specifically, any `transformers` LLM/VLM trained for causal language modeling.

## Additional Arguments
`left_padding` (`int`, *optional*): number of padding tokens to prepend to the provided input
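
As a rough illustration, this argument presumably prepends `left_padding` copies of the pad token to each sequence before decoding. A hedged sketch of that behavior (the helper name and mask handling here are assumptions, not this repo's exact code):

```py
import torch

def apply_left_padding(input_ids, attention_mask, pad_token_id, left_padding):
    # Hypothetical helper: prepend `left_padding` pad tokens to each row.
    batch_size = input_ids.shape[0]
    pads = torch.full(
        (batch_size, left_padding), pad_token_id,
        dtype=input_ids.dtype, device=input_ids.device,
    )
    input_ids = torch.cat([pads, input_ids], dim=-1)
    # Mask out the padding positions so the model ignores them.
    mask_pad = torch.zeros_like(pads, dtype=attention_mask.dtype)
    attention_mask = torch.cat([mask_pad, attention_mask], dim=-1)
    return input_ids, attention_mask
```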

## Output Type changes
(none)

## Example usage

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct", device_map="auto")

inputs = tokenizer(["The quick brown"], return_tensors="pt").to(model.device)
# There is a print message hardcoded in the custom generation method
gen_out = model.generate(
    **inputs,
    left_padding=5,
    custom_generate="transformers-community/custom_generate_example",
    trust_remote_code=True,
)
print(tokenizer.batch_decode(gen_out))  # don't skip special tokens
#['<|endoftext|><|endoftext|><|endoftext|><|endoftext|><|endoftext|>The quick brown fox jumps over the lazy dog.\n\nThe sentence "The quick']
```
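
The five `<|endoftext|>` tokens at the start of the decoded output are the pads added by `left_padding=5`, which is why special tokens are not skipped when decoding.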