Review: "Implement DRY penalty" by p-e-w · Pull Request #645 · EricLBuehler/mistral.rs

p-e-w · 2024-07-29T04:56:51Z

Review of #637. Intended to be read commit-by-commit.

Clippy's suggestion cannot be implemented because of borrowing issues

Interesting that Clippy doesn't catch this

It's nicer when the length is not hardcoded

No need to leak this as it's not used elsewhere

Avoids quadratic runtime and potential DoS with adversarial inputs Ref oobabooga/text-generation-webui#6047

Most tokenizers encode punctuation tokens differently depending on where they occur in the input, and which tokens surround them. With the default sequence breakers, the appropriate encoding usually corresponds to the encoding produced when the token occurs after a word, rather than by itself. To emulate this, prefix the token with "a" before encoding, and extract the final token of the result. See LostRuins/koboldcpp#982 for a correct solution to this problem.

github-actions · 2024-07-29T04:57:57Z

Code Metrics Report

  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                   11          102          101            0            1
 Python                 41         1586         1368           46          172
 TOML                   19          563          497           11           55
-------------------------------------------------------------------------------
 Jupyter Notebooks       2            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          196          169            1           26
 (Total)                            273          201           32           40
-------------------------------------------------------------------------------
 Markdown               24         1822            0         1372          450
 |- BASH                 5          101           98            0            3
 |- JSON                 1           12           12            0            0
 |- Python               5           92           82            0           10
 |- Rust                 6          402          359           19           24
 |- TOML                 2           75           63            0           12
 (Total)                           2504          614         1391          499
-------------------------------------------------------------------------------
 Rust                  168        55066        50099          950         4017
 |- Markdown            89          837           13          775           49
 (Total)                          55903        50112         1725         4066
===============================================================================
 Total                 270        59650        52487         2379         4784
===============================================================================

EricLBuehler · 2024-07-29T09:42:12Z

Thanks for the review!

@p-e-w

* Implement dry penalty * Add dry sampling params to requests * Handle it * Clippy * Review: "Implement DRY penalty" (#645) * Silence bogus Clippy warning Clippy's suggestion cannot be implemented because of borrowing issues * Get rid of unnecessary type annotations Interesting that Clippy doesn't catch this * Store default sequence breakers in a slice It's nicer when the length is not hardcoded * Make default sequence breakers private No need to leak this as it's not used elsewhere * Limit match length Avoids quadratic runtime and potential DoS with adversarial inputs Ref oobabooga/text-generation-webui#6047 * "Fix" sequence breaker tokenization Most tokenizers encode punctuation tokens differently depending on where they occur in the input, and which tokens surround them. With the default sequence breakers, the appropriate encoding usually corresponds to the encoding produced when the token occurs after a word, rather than by itself. To emulate this, prefix the token with "a" before encoding, and extract the final token of the result. See LostRuins/koboldcpp#982 for a correct solution to this problem. * Nicer * Even better * Complete merge * Fix saturating sub * Handle when no context * Make context the entire sequence and refactor * Remove slicing for all * Fix the bug with penalty Credit to @p-e-w for finding this! Co-authored-by: Philipp Emanuel Weidmann <pew@worldwidemann.com> * Add custom logits processor API (#702) * Add custom logits processor api * Typos * Nicer interface and update example * Fix doctest * Update docs * Update exports * Add Gemma 2 PagedAttention support (#704) * Add gemma2 paged attn support * Non cuda support? * Remove error * It works * Faster RmsNorm in gemma/gemma2 (#703) * Fix bug in metal isq (#706) * Support GGUF BF16 tensors (#691) * Support GGUF bf16 tensors * Fix loading of bf16 ggml tensor * Fix dequant of bf16 * Use merged rev * Softcapping, real batching + sliding window support for Flash Attention (#707) * Flash attention varlen kind of works * Seems to work * Now it's nice * Sliding window support and clippy * Remove warning * Support smollm * Update rev to match merged * Remove some usages of 'pub' in models (#708) * Support the Phi 3.5 V model (#710) * Update image_seq_len * Update the examples * Format * Implement the Phi 3.5 MoE model (#709) * Copy the model * Add most of it * Add the blocksparse moe parts * Clippy * Fix mscales * A batch of fixes * Correctly cast it * Handle isq on gate * Even more progress * Runs now * Clippy * Fix to use layernorm * Remove unused * Add docs * Add more docs * Apply review comments * Update readme --------- Co-authored-by: Philipp Emanuel Weidmann <pew@worldwidemann.com>

p-e-w added 6 commits July 29, 2024 09:02

Silence bogus Clippy warning

fc0b39c

Clippy's suggestion cannot be implemented because of borrowing issues

Get rid of unnecessary type annotations

b5d0d04

Interesting that Clippy doesn't catch this

Store default sequence breakers in a slice

33b4163

It's nicer when the length is not hardcoded

Make default sequence breakers private

c68b3c8

No need to leak this as it's not used elsewhere

Limit match length

8177528

Avoids quadratic runtime and potential DoS with adversarial inputs Ref oobabooga/text-generation-webui#6047

p-e-w mentioned this pull request Jul 29, 2024

Implement DRY penalty #637

Merged

EricLBuehler merged commit f7d92d9 into EricLBuehler:dry_penalty Jul 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Review: "Implement DRY penalty"#645

Review: "Implement DRY penalty"#645
EricLBuehler merged 6 commits intoEricLBuehler:dry_penaltyfrom
p-e-w:dry-penalty-review

p-e-w commented Jul 29, 2024

Uh oh!

github-actions bot commented Jul 29, 2024

Uh oh!

EricLBuehler commented Jul 29, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

p-e-w commented Jul 29, 2024

Uh oh!

github-actions bot commented Jul 29, 2024

Uh oh!

EricLBuehler commented Jul 29, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants