docs(local model routing): add docs on how to use Gemma for local model routing #21365
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces extensive documentation for an experimental feature in the Gemini CLI: local model routing using Gemma. The changes enable users to leverage locally-running Gemma models for routing decisions, potentially reducing costs and offering comparable latency to hosted models. The new documentation covers setup, configuration, and validation steps, integrating this new capability into the existing model routing and core documentation.
Code Review
This pull request adds documentation for the new experimental local model routing feature using Gemma. The changes are primarily in markdown files, introducing a new guide and updating existing ones. My review found a few critical and high-severity issues in the documentation that need to be addressed for accuracy and to prevent user confusion. Specifically, the model selection precedence has been updated incorrectly, a URL in the setup guide is broken, and the list of supported Gemma models is inconsistent with the implementation, which could lead to runtime errors for users following the guide.
Note: Security Review has been skipped due to the limited scope of the PR.
58eb42c to bde3299
allenhutchison left a comment
A few nitpicks after running through this on a personal macOS machine (Tahoe 26.4). Otherwise, everything worked great.
[lit-macos-arm64](https://github.com/google-ai-edge/LiteRT-LM/releases/download/v0.9.0-alpha03/lit.macos_arm64).
2. Ensure the binary is executable: `chmod a+x lit.macos_arm64`
3. (Optional) Test starting the runtime: `./lit.macos_arm64 serve --verbose`
I ran this on a fresh macOS device today and got tripped up by macOS security settings. By default, macOS only allows binaries from "App Store and Known Developers", so when I tried to run the server, it failed with a message that offered to move the binary to the trash. I had to go to Settings -> Privacy & Security and click "Allow Anyway" under the notice that "lit.macos_arm64" was blocked to protect your Mac.
Added language to this effect. PTAL.
Also looks like you need to rerun the lint to pass the CI.
mattKorwel left a comment
Tested across Mac, Windows, and Linux. 🎉 LGTM
Updated to address comments and reran the format.
5abc170
…el routing (#21365) Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com> Co-authored-by: Allen Hutchison <adh@google.com> Co-authored-by: matt korwel <matt.korwel@gmail.com>
…el routing (google-gemini#21365) Co-authored-by: Douglas Reid <21148125+douglas-reid@users.noreply.github.com> Co-authored-by: Allen Hutchison <adh@google.com> Co-authored-by: matt korwel <matt.korwel@gmail.com>
Summary
Adds docs for the experimental feature of Gemma Model Routing.