
Libraries

How to use SakanaAI/Llama-3-Karamaru-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SakanaAI/Llama-3-Karamaru-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SakanaAI/Llama-3-Karamaru-v1")
model = AutoModelForCausalLM.from_pretrained("SakanaAI/Llama-3-Karamaru-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))


vLLM

How to use SakanaAI/Llama-3-Karamaru-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "SakanaAI/Llama-3-Karamaru-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SakanaAI/Llama-3-Karamaru-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
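
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server started by `vllm serve` is listening on `localhost:8000` as in the curl call above, and the helper names `build_chat_request` and `send` are hypothetical, not part of vLLM.

```python
import json
import urllib.request

# Assumed values, taken from the curl example: default port and model name.
BASE_URL = "http://localhost:8000/v1"
MODEL = "SakanaAI/Llama-3-Karamaru-v1"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON body. Requires a running server."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, once the server is up:
# reply = send(build_chat_request("What is the capital of France?"))
```

Separating request construction from sending keeps the payload easy to inspect before any network call is made.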

SGLang

How to use SakanaAI/Llama-3-Karamaru-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "SakanaAI/Llama-3-Karamaru-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SakanaAI/Llama-3-Karamaru-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "SakanaAI/Llama-3-Karamaru-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SakanaAI/Llama-3-Karamaru-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
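
An OpenAI-compatible chat completions endpoint returns a JSON object whose `choices` list holds the assistant message. The sketch below shows one way to pull the reply text out of such a body; the function name `extract_reply` is hypothetical, and the sample body is a hand-written stand-in, not real model output.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    # The first choice carries the assistant message for a single-completion request.
    return choices[0]["message"]["content"]


# Hand-written stand-in for a server response (not real model output).
sample = json.dumps(
    {
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "<model reply>"},
                "finish_reason": "stop",
            }
        ]
    }
)
print(extract_reply(sample))
```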

Docker Model Runner
How to use SakanaAI/Llama-3-Karamaru-v1 with Docker Model Runner:
```
docker model run hf.co/SakanaAI/Llama-3-Karamaru-v1
```

からまる Llama-3-Karamaru-v1

Karamaru is a conversational AI model developed by Sakana AI that responds in the style of Edo-period Japanese. While the base language model was originally trained on modern text, we applied continual pretraining using a custom Edo-period dataset consisting of over 25 million characters. This dataset includes approximately 13 million characters of human-transcribed text and 12 million characters transcribed using AI-based kuzushiji OCR from historical Japanese books.

With Karamaru, users can ask questions in modern Japanese and receive answers written in the classical Japanese style of the Edo period, reflecting the worldview and cultural context of that era. Karamaru offers a unique way to explore and engage with Japan’s historical language and thought.

Karamaru is intended as a tool for research, education, and cultural exploration—bridging time and language to bring the past closer to the present.

For further information, please refer to our blog post.

Developed by: Sakana AI
License: Llama3 Community License
Finetuned from model: Llama-3-ELYZA-JP-8B
Demo: Karamaru v1 Demo

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "SakanaAI/Llama-3-Karamaru-v1"

# device_map="auto" places the weights on the available GPU(s), falling back to the CPU.
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto", torch_dtype=torch.bfloat16)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

# Prompt in modern Japanese: "What is important for AI?"
text = "AIにとって大事なものはなんですか。"
conversation = [{"role": "user", "content": text}]

# return_dict=True yields both input_ids and attention_mask for generate().
inputs = tokenizer.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=500,
        do_sample=True,  # sampling must be enabled for temperature, top_p, and top_k to take effect
        temperature=0.6,
        top_p=0.9,
        top_k=50,
        repetition_penalty=1.1,
    )

# Drop the prompt tokens so that only the newly generated reply is decoded.
output_ids = output_ids[0][inputs["input_ids"].shape[1]:]
output = tokenizer.decode(output_ids, skip_special_tokens=True)
print(output)
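
The `temperature`, `top_p`, and `top_k` arguments passed to `model.generate` above shape the distribution from which each next token is sampled. The following pure-Python sketch illustrates the general idea on a toy four-token vocabulary. It is a simplified illustration, not the Transformers implementation: the function names and toy logits are hypothetical, the filters are assumed to apply in the order temperature, top-k, top-p, and `repetition_penalty` is not modeled.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_distribution(logits, temperature=0.6, top_k=50, top_p=0.9):
    """Return {token_index: probability} after temperature, top-k, and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    # Top-k: keep only the k highest-scoring tokens, best first.
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    probs = softmax([scaled[i] for i in ranked])
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for index, prob in zip(ranked, probs):
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(prob for _, prob in kept)
    return {index: prob / norm for index, prob in kept}


# Toy logits for a four-token vocabulary (hypothetical values).
toy_logits = [2.0, 1.0, 0.0, -1.0]
print(filter_distribution(toy_logits))
```

With the defaults used in the usage example, the low-probability tail of the toy distribution is cut off and sampling is restricted to the most likely tokens.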

Training Data

Karamaru was trained using a custom Edo-period text dataset totaling approximately 25 million characters.

Minna de Honkoku: 12 million characters.
Kuzushiji Dataset: 1 million characters.
Pre-Modern Japanese Text Dataset: 12 million characters, transcribed with the AI kuzushiji OCR model RURI and then refined with Sakana AI's LLM-based classical Japanese OCR Refiner.
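
As a quick consistency check, the per-source figures can be summed and split by transcription method. This sketch assumes that Minna de Honkoku and the Kuzushiji Dataset make up the human-transcribed portion mentioned in the introduction; the source lists only the character counts, so that grouping is an inference.

```python
# Character counts in millions, as listed above.
sources = {
    "Minna de Honkoku": 12,
    "Kuzushiji Dataset": 1,
    "Pre-Modern Japanese Text Dataset": 12,
}

# Assumption: only this source was transcribed by OCR.
ocr_sources = {"Pre-Modern Japanese Text Dataset"}

total = sum(sources.values())
ocr = sum(count for name, count in sources.items() if name in ocr_sources)
human = total - ocr

print(f"total: {total}M, human-transcribed: {human}M, OCR: {ocr}M")
```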

Limitations

Karamaru was trained on historical texts from the Edo period, which may reflect the social norms, values, and biases of that time. As a result, the model may generate responses that are considered inappropriate, outdated, or offensive by modern standards. Users should be mindful when using the model for research, educational or public-facing purposes.

Glossary

Edo period
A historical era in Japan spanning from 1603 to 1868, characterized by the rule of the Tokugawa shogunate, a strict social hierarchy, and flourishing traditional arts and culture.
Kuzushiji (くずし字)
A cursive style of Japanese writing used in historical texts, particularly those written before Meiji 33 (1900). Kuzushiji includes kanji and hentaigana, and it can be difficult to read without specialized training.
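
The glossary equates year 33 of the Meiji era with 1900. A minimal sketch of that conversion follows, assuming the usual convention that Meiji 1 corresponds to 1868 and Meiji 45 to 1912; the function name is hypothetical, and the conversion works at year granularity only, so dates close to a year boundary in the early Meiji years may not line up exactly.

```python
MEIJI_FIRST_GREGORIAN_YEAR = 1868  # Meiji 1
MEIJI_LAST_ERA_YEAR = 45  # Meiji 45 corresponds to 1912


def meiji_to_gregorian(era_year):
    """Convert a Meiji era year (1 to 45) to the corresponding Gregorian year."""
    if not 1 <= era_year <= MEIJI_LAST_ERA_YEAR:
        raise ValueError(f"Meiji era year must be between 1 and {MEIJI_LAST_ERA_YEAR}")
    return MEIJI_FIRST_GREGORIAN_YEAR + era_year - 1


print(meiji_to_gregorian(33))  # 1900
```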

Developers

Tarin Clanuwat (Sakana AI)
Tianyu Zhao (Sakana AI)
Yuki Imajuku (Sakana AI)
Makoto Shing (Sakana AI)
Asanobu Kitamoto (National Institute of Informatics, ROIS-DS Center for Open Data in the Humanities, Sakana AI)

Collaborators

Kazuaki Yamamoto (National Institute of Japanese Literature)
Yuta Hashimoto (National Museum of Japanese History)

Citation

BibTeX:

@misc{karamaruv1,
    url    = {https://huggingface.co/SakanaAI/Llama-3-Karamaru-v1},
    title  = {Llama-3-Karamaru-v1},
    author = {Clanuwat, Tarin and Zhao, Tianyu and Imajuku, Yuki and Shing, Makoto and Kitamoto, Asanobu}
}

Model size: 8B params
Tensor type: BF16 (Safetensors)
Base model: elyza/Llama-3-ELYZA-JP-8B
