Commit ec2ab42

Awni Hannun authored
Lower sorted QMM gather threshold (#2609)
1 parent 787c0d9 commit ec2ab42

1 file changed

Lines changed: 2 additions & 2 deletions

mlx/backend/metal/quantized.cpp

@@ -948,8 +948,8 @@ void GatherQMM::eval_gpu(const std::vector<array>& inputs, array& out) {
   // We are walking x in order and w is also in order so we can batch up the
   // matmuls and reuse reading x and w.
   //
-  // TODO: Tune 16 and 8 here a bit better.
-  if (M == 1 && B >= 16 && right_sorted_ == true && B / E >= 8) {
+  // TODO: Tune 16 and 4 here a bit better.
+  if (M == 1 && B >= 16 && right_sorted_ == true && B / E >= 4) {
     gather_qmm_rhs(
         x,
         w,
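
To illustrate what lowering the ratio from 8 to 4 changes, the condition can be restated as a standalone predicate. This is only a sketch, not code from MLX: the function name `use_sorted_rhs_path` and the `min_ratio` parameter are hypothetical, and it assumes `B` is the number of gathered batch elements and `E` is the number of weight matrices, with `E > 0`. Note that `B / E` is integer division, as in the original condition.

```cpp
// Hypothetical restatement of the condition touched by this commit.
// min_ratio is 8 before the change and 4 after it. Assumes E > 0.
constexpr bool use_sorted_rhs_path(
    int M, int B, int E, bool right_sorted, int min_ratio) {
  return M == 1 && B >= 16 && right_sorted && B / E >= min_ratio;
}

// B = 32, E = 8 gives B / E = 4: rejected at the old threshold,
// accepted at the new one.
static_assert(!use_sorted_rhs_path(1, 32, 8, true, 8), "old threshold");
static_assert(use_sorted_rhs_path(1, 32, 8, true, 4), "new threshold");

// B = 31, E = 8 gives B / E = 3 (integer division), so it is still rejected.
static_assert(!use_sorted_rhs_path(1, 31, 8, true, 4), "below new threshold");

// The B >= 16 and M == 1 requirements are unchanged by the commit.
static_assert(!use_sorted_rhs_path(1, 12, 2, true, 4), "B too small");
static_assert(!use_sorted_rhs_path(2, 64, 8, true, 4), "M != 1");
```

Under these assumptions, the only inputs whose path changes are those with `M == 1`, `B >= 16`, sorted right-hand indices, and `4 <= B / E < 8`.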
