LoRA Adapter for SAE Introspection

This is a LoRA (Low-Rank Adaptation) adapter trained for SAE (Sparse Autoencoder) introspection tasks.

Base Model

  • Base Model: google/gemma-2-9b-it
  • Adapter Type: LoRA
  • Task: SAE Feature Introspection

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2-9b-it")
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "thejaminator/gemma-hook-layer-0")

Training Details

This adapter was trained with the lightweight SAE introspection training script. The goal is a model that can understand and explain SAE features through activation steering.
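
As a rough illustration of what activation steering means here, the sketch below adds a scaled, unit-normalised feature direction to a batch of hidden activations. It is a toy NumPy example, not the training script itself: the function name `steer`, the `scale` value, the toy dimensions, and the use of a random vector in place of a real SAE decoder direction are all assumptions made for illustration.

```python
import numpy as np


def steer(hidden, direction, scale=4.0):
    """Add a scaled unit-norm feature direction to every token position.

    hidden:    (tokens, d_model) array of activations (toy values here)
    direction: (d_model,) vector, a stand-in for one SAE feature direction
    scale:     hypothetical steering strength
    """
    unit = direction / np.linalg.norm(direction)
    # Broadcasting adds the same steering vector to each token's activation.
    return hidden + scale * unit


rng = np.random.default_rng(0)
hidden = rng.normal(size=(5, 16))   # toy activations: 5 tokens, d_model = 16
direction = rng.normal(size=16)     # random stand-in for an SAE feature direction

steered = steer(hidden, direction, scale=4.0)

# The projection onto the feature direction grows by `scale` at every position.
unit = direction / np.linalg.norm(direction)
shift = (steered - hidden) @ unit
```

In a real setup the addition would typically happen inside a forward hook on one of the model's layers, with the direction taken from a trained SAE; which layer and which scale this adapter expects is not stated in this card.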
