Commit 8d69cf1

Awni Hannun authored and faisalmemon committed
Lower sorted QMM gather threshold (ml-explore#2609)
1 parent 19f497b commit 8d69cf1

1 file changed, +2 -2 lines changed

mlx/backend/metal/quantized.cpp

Lines changed: 2 additions & 2 deletions
@@ -948,8 +948,8 @@ void GatherQMM::eval_gpu(const std::vector<array>& inputs, array& out) {
   // We are walking x in order and w is also in order so we can batch up the
   // matmuls and reuse reading x and w.
   //
-  // TODO: Tune 16 and 8 here a bit better.
-  if (M == 1 && B >= 16 && right_sorted_ == true && B / E >= 8) {
+  // TODO: Tune 16 and 4 here a bit better.
+  if (M == 1 && B >= 16 && right_sorted_ == true && B / E >= 4) {
     gather_qmm_rhs(
         x,
         w,
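
To make the effect of the lowered threshold concrete, here is a minimal standalone sketch of the dispatch condition from the hunk above. The helper `use_sorted_rhs_path` and its `min_ratio` parameter are hypothetical and exist only for illustration; they are not part of MLX. The sketch assumes `M`, `B`, and `E` are positive integers, so `B / E` is integer division as in the original condition.

```cpp
// Hypothetical helper (not an MLX function) mirroring the condition in the
// diff. min_ratio is the value being tuned: 8 before this commit, 4 after.
// Assumes E > 0; B / E is integer division, as in the original code.
constexpr bool use_sorted_rhs_path(
    int M, int B, int E, bool right_sorted, int min_ratio) {
  return M == 1 && B >= 16 && right_sorted && B / E >= min_ratio;
}

// B = 64, E = 16 gives B / E = 4: rejected by the old threshold,
// accepted by the new one.
static_assert(!use_sorted_rhs_path(1, 64, 16, true, 8), "old threshold");
static_assert(use_sorted_rhs_path(1, 64, 16, true, 4), "new threshold");

// B = 63, E = 16 gives B / E = 3 after truncation: rejected by both.
static_assert(!use_sorted_rhs_path(1, 63, 16, true, 4), "truncated ratio");

// The other guards are unchanged: unsorted input or B < 16 still falls
// through to the other code path.
static_assert(!use_sorted_rhs_path(1, 64, 16, false, 4), "needs sorted rhs");
static_assert(!use_sorted_rhs_path(1, 12, 2, true, 4), "needs B >= 16");
```

Under these assumptions, the change only widens the set of inputs that take the `gather_qmm_rhs` branch: cases with a `B / E` ratio of 4 through 7 now qualify, while everything the old condition accepted is still accepted.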
