
<h2>Usage</h2>

<p>You can load this model using the Hugging Face Transformers library:</p>
<p style="background-color: gray"> |
|
from transformers import pipeline |
|
|
|
pipe = pipeline("text-generation", model="nroggendorff/mayo") |
|
|
|
question = "What color is the sky?" |
|
conv = [{"role": "user", "content": question}] |
|
|
|
response = pipe(conv, max_new_tokens=32)[0]['generated_text'][-1]['content'] |
|
print(response) |
|
</p> |
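
<p>If you have a GPU, you can optionally pass a compute dtype and device map when building the pipeline. This is a minimal sketch using standard Transformers arguments (<code>torch_dtype</code> and <code>device_map</code>, the latter needs the <code>accelerate</code> package); it is optional and not specific to this model:</p>

<pre style="background-color: gray"><code>
import torch
from transformers import pipeline

# Optional: load weights in bfloat16 and let accelerate place them on the available device(s)
pipe = pipeline(
    "text-generation",
    model="nroggendorff/mayo",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
</code></pre>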

<p>To load the model with 4-bit quantization (this requires the <code>bitsandbytes</code> package):</p>

<pre style="background-color: gray"><code>
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization with double quantization and bfloat16 compute
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

model_id = "nroggendorff/mayo"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto"
)

question = "What color is the sky?"
conv = [{"role": "user", "content": question}]

# Build the prompt with the model's chat template and move the inputs to the model's device
prompt = tokenizer.apply_chat_template(conv, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=32)

generated_text = tokenizer.batch_decode(outputs)[0]
print(generated_text)
</code></pre>
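
<p>Note that <code>generated_text</code> above contains the prompt and any special tokens as well as the reply. If you only want the newly generated text, you can slice off the prompt tokens before decoding (a minimal sketch, reusing the variables from the block above):</p>

<pre style="background-color: gray"><code>
# Keep only the tokens generated after the prompt and drop special tokens
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)
</code></pre>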