---
license: bsd-3-clause
tags:
  - kernel
---

# Flash Attention 3

Flash Attention is a fast and memory-efficient implementation of the
attention mechanism, designed to work with large models and long sequences.
This repository is a Hugging Face-compliant kernel build of Flash Attention 3.

The original code is available at [https://github.com/Dao-AILab/flash-attention](https://github.com/Dao-AILab/flash-attention).
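
## Usage

A minimal usage sketch, assuming this kernel is published on the Hub under a repo id such as `kernels-community/flash-attn3` and exposes a `flash_attn_func` entry point mirroring the upstream flash-attention API; both names are assumptions and should be adjusted to match the actual repository.

```python
# Sketch only: repo id and function name are assumptions, adjust to the real ones.
import torch
from kernels import get_kernel  # pip install kernels

# Download and load the compiled kernel from the Hugging Face Hub.
flash_attn3 = get_kernel("kernels-community/flash-attn3")

# Flash Attention expects (batch, seqlen, num_heads, head_dim) fp16/bf16 tensors on GPU.
q = torch.randn(2, 1024, 16, 128, dtype=torch.bfloat16, device="cuda")
k = torch.randn(2, 1024, 16, 128, dtype=torch.bfloat16, device="cuda")
v = torch.randn(2, 1024, 16, 128, dtype=torch.bfloat16, device="cuda")

# Causal (decoder-style) attention; the exact return value (tensor or tuple)
# depends on the kernel's interface.
out = flash_attn3.flash_attn_func(q, k, v, causal=True)
```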