Jim Lai PRO

grimjim

AI & ML interests

Experimenting primarily with 7B-12B parameter text completion models. Not all models are intended for direct end use; some are aimed at research and/or educational purposes. Recent contributions: stabilized refusal direction ablation via Gram-Schmidt orthonormalization and norm-preserving interventions; confirmed reasoning transfer via model merger.

Recent Activity

published an article 1 day ago

ORBA: Orthogonal Reflection Bounded Ablation — A Geometrically Exact Detour in Directional Activation Editing

updated a model 2 days ago

grimjim/gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v4-12B

published a model 2 days ago

grimjim/gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v4-12B

Posts 28

Post

706

After tinkering with Gemma Scope 2, I now have a mechanistic explanation of why Winsorization was as effective as it was in my ablation experiments on Gemma 3 12B Instruct. In short, the activation for the BOS token overwhelms everything else. Gemma Scope 2 deliberately did not train on the BOS token. Winsorization capped the magnitude of the BOS token's activation, allowing the activations of other tokens to be compared.
google/gemma-scope-2-12b-it
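
To illustrate the capping effect described in the post, here is a minimal dependency-free sketch. It is not the author's actual procedure: the function name `winsorize_token_norms`, the choice to cap per-token L2 norms, the 95th-percentile threshold, and the synthetic activations (with row 0 standing in for an outsized BOS activation) are all assumptions made for illustration.

```python
import math
import random


def l2_norm(vec):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def winsorize_token_norms(acts, upper_pct=95.0):
    """Scale down any token vector whose L2 norm exceeds the upper_pct
    nearest-rank percentile of all token norms. Directions are unchanged.

    Hypothetical helper: capping norms (rather than individual components)
    is an assumption, not something the post specifies.
    """
    norms = [l2_norm(v) for v in acts]
    ranked = sorted(norms)
    # Nearest-rank percentile: smallest norm with at least upper_pct% of values <= it.
    rank = max(1, math.ceil(upper_pct / 100.0 * len(ranked)))
    cap = ranked[rank - 1]
    capped = []
    for vec, norm in zip(acts, norms):
        scale = min(1.0, cap / norm) if norm > 0 else 1.0
        capped.append([x * scale for x in vec])
    return capped, cap


# Synthetic data: 64 token positions, 8 hidden dimensions.
random.seed(0)
acts = [[random.gauss(0.0, 1.0) for _ in range(8)] for _ in range(64)]
acts[0] = [500.0] * 8  # hypothetical stand-in for an outsized BOS activation

capped, cap = winsorize_token_norms(acts)
norms_before = [l2_norm(v) for v in acts]
norms_after = [l2_norm(v) for v in capped]

print(f"cap = {cap:.3f}")
print(f"row 0 norm before = {norms_before[0]:.1f}, after = {norms_after[0]:.3f}")
```

After capping, the outlier row has the same direction but a norm comparable to the other tokens, so statistics computed across positions are no longer dominated by a single row.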

Articles 6

Article

3

ORBA: Orthogonal Reflection Bounded Ablation — A Geometrically Exact Detour in Directional Activation Editing

Collections 5

models 195

grimjim/gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v4-12B

Image-Text-to-Text • 12B • Updated 2 days ago • 15

grimjim/gemma-3-12b-it-orthogonal-reflection-bounded-ablation-v3-12B

Image-Text-to-Text • 12B • Updated 7 days ago • 17 • 1

grimjim/gemma-3-12b-it-orthogonal-rotation-bounded-ablation-v2-12B

Image-Text-to-Text • 12B • Updated 11 days ago • 17

grimjim/gemma-3-12b-it-orthogonal-rotation-bounded-ablation-v1-12B

Image-Text-to-Text • 12B • Updated 12 days ago • 22 • 1

grimjim/Equatorium-v3-12B

Text Generation • 12B • Updated Feb 23 • 30 • 2

grimjim/Equatorium-v2-12B

Text Generation • 12B • Updated Feb 10 • 4 • 1

grimjim/Equatorium-v1-12B

Text Generation • 12B • Updated Feb 5 • 3 • 1

grimjim/gemma-3-12b-it-MPOA-v2-12B

Image-Text-to-Text • 12B • Updated Jan 5 • 4 • 3

grimjim/Gemma-3-12B-Instruct-MPOA-v2-12B

grimjim/gemma-3-12b-it-norm-preserved-biprojected-abliterated

Image-Text-to-Text • 12B • Updated Jan 4 • 182 • 24

datasets 13

grimjim/llm-aes-writing-prompts-deduplicated-0.9-similarity

Viewer • Updated Jan 11 • 81.4k • 7

grimjim/PIAF-v1.2

Viewer • Updated Dec 8, 2025 • 611 • 23

grimjim/AILuminate-v1.0-demo-prompt-set-FR

Viewer • Updated Nov 21, 2025 • 1.2k • 4

grimjim/AILuminate-v1.0-demo-prompt-set-EN

Viewer • Updated Nov 21, 2025 • 1.2k • 6

grimjim/Magpie-Gemma2-Pro-Filtered-Deduped-Instruction

Viewer • Updated Sep 26, 2025 • 129k • 9

grimjim/tatsu-lab-alpaca-deduped-instruction

Viewer • Updated Sep 22, 2025 • 49.6k • 14

grimjim/nbeerbower-Purpura-KTO

Preview • Updated May 14, 2025 • 460 • 4

grimjim/nbeerbower-Arkhaios-KTO

Viewer • Updated May 14, 2025 • 444 • 5

grimjim/role_meta_info_multilingual

Preview • Updated May 7, 2025 • 46

grimjim/empatheticdialogues

Updated May 7, 2025 • 83
