Merged
Changes from 21 commits
Commits
68 commits
7be0b44
Compress kvcache work
micmelesse Jun 11, 2024
356d243
fix causal. use cache_seqlens
micmelesse Jun 26, 2024
3e3dfc1
clean and test what works
micmelesse Jun 26, 2024
5a3cb0d
some configs work on new_kv but fails on 1,8
micmelesse Jun 27, 2024
e611433
cache overwrite correct
micmelesse Jun 28, 2024
737b701
new_kv works more or less
micmelesse Jun 28, 2024
52eb402
test local
micmelesse Jun 28, 2024
619c9ad
work on paged kv attention
micmelesse Jul 1, 2024
2d49406
prefill paged attention
micmelesse Jul 1, 2024
e5d13ef
fix has_batch_idx and skip local and rotatary emb
micmelesse Jul 2, 2024
6aa7caf
save
micmelesse Jul 2, 2024
d46e730
save
micmelesse Jul 2, 2024
63eb390
save
micmelesse Jul 3, 2024
f0193a7
save
micmelesse Jul 3, 2024
0e5223c
handle new_kv when paged kv cache
micmelesse Jul 8, 2024
962dc8a
all except has_batch_idx works
micmelesse Jul 9, 2024
b334464
major options are green
micmelesse Jul 10, 2024
0f3091c
test all
micmelesse Jul 10, 2024
c5be670
add tests
micmelesse Jul 10, 2024
10fd70b
save
micmelesse Jul 10, 2024
4c10a6b
clean up
micmelesse Jul 10, 2024
753093d
minor clean up
micmelesse Jul 10, 2024
431fd7a
simplest config
micmelesse Jul 10, 2024
70fce1e
save debug true
micmelesse Jul 10, 2024
3d73e88
save
micmelesse Jul 10, 2024
8a393f6
refactor slightly
micmelesse Jul 11, 2024
f681687
save work
micmelesse Jul 12, 2024
f6a546f
need key masking
micmelesse Jul 12, 2024
4f44741
force hip
micmelesse Jul 16, 2024
ed1cbcc
use is_hip
micmelesse Jul 17, 2024
1d46be3
save
micmelesse Jul 17, 2024
4802a37
fix cache_seq_len issue
micmelesse Jul 17, 2024
cd4617d
work on new_kv
micmelesse Jul 17, 2024
3f2b171
pass new_kv data
micmelesse Jul 18, 2024
475153f
save
micmelesse Jul 18, 2024
9864bc2
benchmark fwd only
micmelesse Jul 19, 2024
4d5faad
disable debug
micmelesse Jul 19, 2024
e76c5fb
pandas pdf
micmelesse Jul 19, 2024
9e7ce16
save
micmelesse Jul 19, 2024
4d1eeeb
set methods
micmelesse Jul 19, 2024
3a0cf22
record number of heads
micmelesse Jul 23, 2024
cd6bb74
use configs
micmelesse Jul 23, 2024
088fbc7
flexiable dim, n-heads, headofdim
micmelesse Jul 23, 2024
5ea6525
better benchmarking
micmelesse Jul 24, 2024
af3f4ee
basic inplace update working
micmelesse Jul 25, 2024
9d52279
works upto 64
micmelesse Jul 25, 2024
e0595e9
new_kv supported!
micmelesse Jul 25, 2024
8072212
test case for has_batch_idx
micmelesse Jul 25, 2024
62fdb92
has_batch_idx works!
micmelesse Jul 25, 2024
5f64128
save
micmelesse Jul 25, 2024
0bd947f
save
micmelesse Jul 25, 2024
75b5076
save
micmelesse Jul 26, 2024
23d08f1
save ref
micmelesse Jul 29, 2024
efefa81
fix mqa and gqa by duplicating
micmelesse Jul 30, 2024
77fc391
GQA and MQA working by kernel modifications
micmelesse Jul 30, 2024
8ea183b
fix new_kv with gqa
micmelesse Jul 30, 2024
5edf575
cache index
micmelesse Jul 30, 2024
f4f476d
deal with nans on fwd_splitk
micmelesse Jul 31, 2024
0e58a7c
save
micmelesse Jul 31, 2024
076f5fe
causal working on basic case
micmelesse Aug 1, 2024
1defaaf
causal works!
micmelesse Aug 1, 2024
9004132
alibi works!
micmelesse Aug 1, 2024
4b795dd
clean up
micmelesse Aug 1, 2024
ad6413c
clean prefill changes
micmelesse Aug 1, 2024
6415d9a
remove bwd stuff
micmelesse Aug 2, 2024
485ba55
limit decode test to test_op_fwd
micmelesse Aug 2, 2024
6b6e533
add ref
micmelesse Aug 5, 2024
e081f43
use bfloat
micmelesse Aug 5, 2024
12 changes: 9 additions & 3 deletions .github/workflows/amd_tests.yml
@@ -57,7 +57,13 @@ jobs:
     - name: Build
       run: |
         python setup.py install
-    - name: Test
+    - name: AMD Kernel Tests
       run: |
-        pytest tests/test_flash_attn.py::test_flash_attn_output
-        pytest tests/test_flash_attn.py::test_flash_attn_varlen_output
+        # pytest flash_attn/flash_attn_triton_kernel_amd.py
+        # pytest flash_attn/flash_attn_triton_decode_amd.py
+        echo "skipped for now"
+    - name: Flash Attention Tests
+      run: |
+        # pytest tests/test_flash_attn.py::test_flash_attn_output
+        # pytest tests/test_flash_attn.py::test_flash_attn_varlen_output
+        pytest tests/test_flash_attn.py::test_flash_attn_kvcache