Missing attention_mask on hook

#44
by riedgar-ms - opened

I'm attempting to use Phi-4 with the attention steering approach of PASTA. However, I'm running into trouble because the forward hook registered on the self-attention layer is not being passed an attention_mask: the argument is present when the hooked function is called, but it is set to None. Is this expected? The same hook works fine on Phi-3. A minimal sketch of the kind of hook involved is below.
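For illustration only, a rough sketch of registering such a hook and inspecting the attention_mask (the model id, layer path, and printout are assumptions for this example, not the exact PASTA code):

```python
# Sketch: hook the self-attention module of the first decoder layer and
# print the attention_mask it receives. Paths assume the Phi-3/Phi-4
# decoder layout (model.model.layers[i].self_attn).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"  # assumed model id for this sketch
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def inspect_mask(module, args, kwargs):
    # With Phi-3 this reportedly prints a tensor; with Phi-4 it prints None.
    print("attention_mask:", kwargs.get("attention_mask"))

handle = model.model.layers[0].self_attn.register_forward_pre_hook(
    inspect_mask, with_kwargs=True
)

inputs = tok("Hello, world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

handle.remove()
```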

Microsoft org

Hello @riedgar-ms !

Please reach out to the transformers team on GitHub. This repository only hosts the model weights and configuration; the modeling source code comes directly from transformers.

gugarosa changed discussion status to closed
