https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/ Someone is already trying it, with promising RAM savings: https://www.reddit.com/r/LocalLLaMA/comments/1s36vnk/looking_for_feedback_porting_googles_turboquant/