Conversation
Srihari-mcw
left a comment
There was a problem hiding this comment.
Please address review comments. Thanks
|
|
||
| switch (src0->type) { | ||
| case GGML_TYPE_Q6_K: | ||
| ggml_compute_forward_get_rows_q6_Kx8(params, dst); |
There was a problem hiding this comment.
Isnt this ideal to be within AVX2 condition similar to Q4_0 ? - https://github.com/ggml-org/whisper.cpp/pull/3223/files
There was a problem hiding this comment.
And any reason if (src0->ne[1] % 8 == 0) is skipped? Although add it if only necessary
There was a problem hiding this comment.
I think that the block_q4_0x8 is one among the three variants and thats why the condition was necessary but if its better to add here also, maybe try adding it
| #undef MMID_MATRIX_ROW | ||
| } | ||
|
|
||
| void forward_get_rows(const ggml_compute_params * params, |
There was a problem hiding this comment.
Keep all params in same lines, follow same formatting as other functions in file
| const ggml_tensor * src0 = dst->src[0]; | ||
| const ggml_tensor * src1 = dst->src[1]; | ||
|
|
||
| GGML_TENSOR_BINARY_OP_LOCALS |
There was a problem hiding this comment.
Add a empty line like https://github.com/ggml-org/whisper.cpp/pull/3223/files
| } | ||
| } | ||
|
|
||
| static void ggml_compute_forward_get_rows_q6_Kx8( |
There was a problem hiding this comment.
Same comment here, pls follow same formatting as the other functions
| /** | ||
| * Dequantizes a single logical row from the repacked q6_Kx8 data format. | ||
| * | ||
| * @param p_repacked_blocks Pointer to the start of the 'block_q6_Kx8' structures for the entire row. |
There was a problem hiding this comment.
Please make sure that the variable names are uniform across all PRs
| float * GGML_RESTRICT y, | ||
| int64_t k, | ||
| int row_idx_in_group) { | ||
| assert(k % QK_K == 0); |
There was a problem hiding this comment.
Maybe a empty line after the function params would be nice, pls update formatting wherever necessary
| } | ||
| } | ||
|
|
||
| static inline int8_t read_scale_from_repacked(const uint8_t* ptr_repacked_scales, int row_idx_in_group, int scale_idx) { |
There was a problem hiding this comment.
Just add a line of comment explaining the helper overall? And can we keep these helpers overall at the start of the file only because some other helper AFAIK are also grouped together at the start
| y[l + 96] = d_super_block * sc3 * q4; | ||
| } | ||
| y += 128; | ||
| ptr_repacked_scales = (uint8_t *)current_block->scales + 64; |
|
|
||
| const uint8_t * ptr_ql_base = current_block->ql; | ||
| const uint8_t * ptr_qh_base = current_block->qh; | ||
| uint8_t * ptr_repacked_scales = (uint8_t *)current_block->scales; // 16*8 scales repacked - 2bytes of each super block stored together |
| const int8_t q3 = (int8_t)((ql_l0 >> 4) | (((qh_l >> 4) & 3) << 4)) - 32; | ||
| const int8_t q4 = (int8_t)((ql_l32 >> 4) | (((qh_l >> 6) & 3) << 4)) - 32; | ||
|
|
||
| y[l + 0] = d_super_block * sc0 * q1; |
|
|
||
| const int8_t q1 = (int8_t)((ql_l0 & 0xF) | (((qh_l >> 0) & 3) << 4)) - 32; | ||
| const int8_t q2 = (int8_t)((ql_l32 & 0xF) | (((qh_l >> 2) & 3) << 4)) - 32; | ||
| const int8_t q3 = (int8_t)((ql_l0 >> 4) | (((qh_l >> 4) & 3) << 4)) - 32; |
There was a problem hiding this comment.
Check whitespaces between expressions
No description provided.