---
license: apache-2.0
base_model:
  - ibm-granite/granite-3.3-8b-instruct
---

# Micro-G3.3-8B-Instruct-1B

**Model Summary:** Micro-G3.3-8B-Instruct-1B is a 1-billion-parameter micro language model fine-tuned for reasoning and instruction following. Built on top of Granite-3.3-8B-Instruct, but with only 3 hidden layers, this model is trained to maximize performance and hardware compatibility at minimal compute cost.
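
Because the architecture is a shallow variant of the Granite stack, you can confirm the reduced depth directly from the model configuration without downloading the weights. A minimal sketch, assuming the standard `num_hidden_layers` and `hidden_size` config fields used by transformers causal-LM configs:

```python
from transformers import AutoConfig

# Fetch only the configuration file, not the model weights.
config = AutoConfig.from_pretrained("ibm-ai-platform/micro-g3.3-8b-instruct-1b")

# The layer count should reflect the shallow stack described above.
print("hidden layers:", config.num_hidden_layers)
print("hidden size:", config.hidden_size)
```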

**Generation:** This is a simple example of how to use the Micro-G3.3-8B-Instruct-1B model.

Install the following libraries:

```shell
pip install torch torchvision torchaudio
pip install accelerate
pip install transformers
```
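
Before running the snippet, it can help to verify the environment, since the example below assumes a CUDA device. A quick sanity check using standard torch/transformers attributes:

```python
import torch
import transformers

# Print installed versions and check whether a CUDA device is visible;
# the generation snippet below uses device="cuda".
print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```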

Then, copy the snippet from the section that is relevant for your use case.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed
import torch

model_path = "ibm-ai-platform/micro-g3.3-8b-instruct-1b"
device = "cuda"

# Load the model in bfloat16 and place it on the GPU.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map=device,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# A single-turn conversation in the chat format expected by the template.
conv = [{"role": "user", "content": "What is your favorite color?"}]

# apply_chat_template with return_dict=True returns a dict of input tensors
# (input_ids, attention_mask); thinking=True enables the model's reasoning mode.
inputs = tokenizer.apply_chat_template(
    conv,
    return_tensors="pt",
    thinking=True,
    return_dict=True,
    add_generation_prompt=True,
).to(device)

set_seed(42)
output = model.generate(
    **inputs,
    max_new_tokens=8,
)

# Decode only the newly generated tokens, skipping the prompt.
prediction = tokenizer.decode(output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(prediction)
```
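
For a more compact entry point, the same checkpoint should also work with the transformers text-generation `pipeline`. This is a sketch, not from the original card, and it assumes a recent transformers release that accepts chat-style message lists as pipeline input:

```python
from transformers import pipeline
import torch

# Build a text-generation pipeline with bfloat16 weights on the GPU.
pipe = pipeline(
    "text-generation",
    model="ibm-ai-platform/micro-g3.3-8b-instruct-1b",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

# Chat-style inputs are passed as a list of role/content messages;
# the pipeline applies the chat template internally.
messages = [{"role": "user", "content": "What is your favorite color?"}]
result = pipe(messages, max_new_tokens=64)

# With message-list input, generated_text holds the full conversation;
# the last entry is the assistant's reply.
print(result[0]["generated_text"][-1]["content"])
```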