Missing attention_mask on hook

#44
by riedgar-ms - opened

I'm attempting to use Phi-4 with the attention steering approach of PASTA. However, I'm running into trouble because the forward hook registered on the self-attention layer is not being passed an attention_mask: the argument is present when the hooked function is called, but it is set to None. Is this expected? The same hook works fine on Phi-3. A minimal sketch of the kind of hook involved is below.
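For illustration only, a rough sketch of registering such a hook and inspecting the attention_mask (the model id, layer path, and printout are assumptions for this example, not the exact PASTA code):

```python
# Sketch: hook the self-attention module of the first decoder layer and
# print the attention_mask it receives. Paths assume the Phi-3/Phi-4
# decoder layout (model.model.layers[i].self_attn).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-4"  # assumed model id for this sketch
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

def inspect_mask(module, args, kwargs):
    # With Phi-3 this reportedly prints a tensor; with Phi-4 it prints None.
    print("attention_mask:", kwargs.get("attention_mask"))

handle = model.model.layers[0].self_attn.register_forward_pre_hook(
    inspect_mask, with_kwargs=True
)

inputs = tok("Hello, world", return_tensors="pt")
with torch.no_grad():
    model(**inputs)

handle.remove()
```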

Microsoft org

Hello @riedgar-ms !

Please reach out to the transformers team on GitHub. This repository only hosts the model weights and configuration; the modeling source code comes directly from transformers.

gugarosa changed discussion status to closed
