CUDA: Faster FlashAttention kernel #6374

Merged: ggerganov merged 10 commits into ggml-org:gg/flash-attn from JohannesGaessler:jg/flash-attn-12 on Apr 2, 2024.
Commits

10 commits, made between Mar 29 and Apr 2, 2024.