Update autotune configuration to avoid crash on AMD devices

#2
by ror HF Staff - opened

On AMD devices, autotuning with a 32-warp configuration crashes with RuntimeError: Triton Error [HIP]: Code: 1, Messsage: invalid argument. This PR removes that configuration from the autotune list when the device name contains "AMD", which is the case for MI250, MI300 and MI355. Tested on MI300.
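For reviewers, here is a minimal sketch of the kind of filtering described above. It is not the code in this PR. `Config` is a hypothetical stand-in for `triton.Config` so the snippet runs without a GPU, `build_autotune_configs` is an illustrative name, and the default list of warp counts is an assumption. In the real module the device name would presumably come from something like `torch.cuda.get_device_name()`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical stand-in for triton.Config; only num_warps is modeled."""
    num_warps: int


def build_autotune_configs(device_name, warp_counts=(1, 2, 4, 8, 16, 32)):
    """Return autotune configs, dropping the 32-warp entry on AMD devices.

    The warp_counts default is an assumption made for illustration; the
    actual list lives in torch-ext/triton_layer_norm/layer_norm.py.
    """
    configs = [Config(num_warps=w) for w in warp_counts]
    if "AMD" in device_name:
        # 32 warps triggers the HIP "invalid argument" error described above.
        configs = [c for c in configs if c.num_warps != 32]
    return configs


if __name__ == "__main__":
    # The device-name strings below are examples, not captured output.
    print([c.num_warps for c in build_autotune_configs("AMD Instinct MI300X")])
    print([c.num_warps for c in build_autotune_configs("NVIDIA H100")])
```

Filtering on the device-name substring keeps the configuration list unchanged on non-AMD devices, so autotuning there still considers 32 warps.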

Cannot merge
This branch has merge conflicts in the following files:
  • torch-ext/triton_layer_norm/layer_norm.py
