Conversation
Would listing all of the new parameters be an improvement? There's a lot to keep track of, and your only other option is to dump the help output and sift through the regular mainline params too.
Hey @Ph0rk0z, the changes I’m proposing here are in the README file that explains how to build and run the project in Docker/Podman. I’ll gather the most frequently used parameters and draft a PR that adds them to the main README. Could you review the draft and share your thoughts, especially on how to use splitting across GPUs?
Sure, if you have one like that, I can look through it. I don't know if we need a separate parameters.md or something, since those will change again, I'm sure.
I've just created parameters.md covering the most commonly used parameters. It’s currently in a raw format; adding a table with a "Tips" column would probably make it more organized and actionable. Feel free to suggest any tweaks, additions, or structure you think would help!
Build arguments could be expanded (or maybe get their own document). For example, the build arguments I often use are
Build arguments can be set once and for all with ccmake. You fill them out graphically, save and generate, and then never have to paste long strings to build again.
I don't think we need to reinvent the wheel and document the normal mainline parameters. Just the NEW ones that ik has added, which people might not know about. Everyone should be able to understand stuff like threads or model. Someone coming over from mainline might not know about -gr or the whole -cuda deal, and they'd have to run through a bunch of PR comments to find out.
I don't agree at all. The whole point of documentation is to support any new user. Not all new users will have used mainline (I have personally helped guide people who were using a local LLM for the first time with ik_llama.cpp). It would also be much harder for a document to cover the differences and similarities with mainline, since you'd have to track changes both here and there, and things change in mainline a lot. Even if you went through the effort, it would still be more confusing than just documenting what exists here. Also @mcm007, a lot of the other sections (e.g. server, sweep-bench, imatrix etc.) should probably be removed and just reference the existing documentation, as that is more thorough. I know a lot of the existing documentation is not updated with the things you reference (if you want to touch those up, that would also be nice). The quantization section is a weird one; I'm not sure where exactly it should point (maybe a discussion, maybe just bullet points like your docker documentation), but as it stands now it would probably just confuse a new user. Overall, thank you for putting in the effort. I have definitely been slacking on my previous goal of keeping the documentation here high quality and up to date, so it's really nice to have someone help.
I understand that most users already have their own workflows and cheat sheets for building and running on their systems. My intent was to create an easy entry point for new users (simple copy-paste commands to build and run) while still providing a place for experienced users to find what can be customized for their needs. I'm thinking of something like this:
Unfortunately, I cannot express the notes the way I really want to, due to my limited understanding.
You can check the current status here.
There are existing docs here, right? They cover things like what threads does. Nothing besides outputting the help message covers what -ger does. If it's just a rehash of the help message, what's the point? The extra mile would be listing each flag along with the PR it came from, so people can read that quickly too. They could quickly get up to speed on why they should do --cuda fusion 1 or enable P2P etc. Dump 2 pages of parameters on someone and they're not going to read them.
Good idea, WIP.
@saood06, if you have any improvements to suggest, I'm glad to adapt.