Replies: 1 comment
Perfect summary
LLM summary for my questions regarding LEANN:
Based on the LEANN paper, here is the detailed breakdown of how they achieve massive storage reduction without sacrificing accuracy, and how they address the specific constraints of semantic searching.
To understand this, we have to look at the "Storage vs. Compute" trade-off and their specific Two-Level Search algorithm.
1. How they achieve "No Accuracy Loss" with Low Storage
The "magic" of LEANN is that it changes where the embedding comes from, not what the embedding is.
The Traditional Approach (e.g., HNSW, Chroma, Pinecone): compute every chunk's embedding once at indexing time and store all of the full-precision vectors on disk alongside the index, so the vectors themselves dominate the storage cost.
The LEANN Approach: store only the graph structure (plus the raw text) and recompute the exact embeddings on the fly for the nodes visited during a search.
Conclusion: They don't sacrifice accuracy because they are mathematically generating the exact same high-dimensional vectors during the search that a traditional database would have stored on the hard drive. They pay for this with compute time (latency), not accuracy.
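The recompute-at-query-time idea can be sketched in a few lines. This is a toy illustration, not the paper's implementation: `embed`, the chunks, and the graph are all made up, and real systems add beam limits and early stopping.

```python
import heapq
import math

# Hypothetical stand-in for a real embedding model (e.g. Contriever).
# In LEANN the vector is recomputed HERE, at query time, instead of
# being read from disk -- that trade is the entire storage win.
def embed(text):
    # toy 2-d "embedding": (length, vowel count)
    return (float(len(text)), float(sum(c in "aeiou" for c in text)))

def dist(a, b):
    return math.dist(a, b)

# The index stores ONLY raw chunks + graph edges -- no vectors at all.
chunks = {0: "apple pie", 1: "apple tart", 2: "car engine", 3: "jet engine"}
edges  = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}

def search(query, entry=0, k=2):
    q = embed(query)
    visited = {entry}
    heap = [(dist(q, embed(chunks[entry])), entry)]
    scored = []
    while heap:
        d, node = heapq.heappop(heap)
        scored.append((d, node))
        for nb in edges[node]:
            if nb not in visited:
                visited.add(nb)
                # embedding recomputed on the fly, never stored
                heapq.heappush(heap, (dist(q, embed(chunks[nb])), nb))
    return [n for _, n in sorted(scored)[:k]]
```

The distances produced this way are bit-identical to what a vector database that stored the embeddings would compute, which is why recall is unchanged; only latency grows.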
2. Addressing Your Concern: Semantic Search & "Accuracy"
You asked: "When they use semantic searching, then shouldn't it naturally hit accuracy, because similarity is not equal to accurate chunk... unless they don't use graph rag?"
This is a crucial distinction. In this paper, "Accuracy" refers to Retrieval Recall (Did we find the specific vectors the model thinks are best?), not necessarily "Truth" (Did the model understand the universe?).
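Concretely, Retrieval Recall just measures overlap between what the index returned and the exact top-k under the same embedding model. A minimal sketch (the function name and example ids are illustrative):

```python
def recall_at_k(retrieved, ground_truth):
    """Fraction of the exact top-k neighbours that the index returned.
    This is the 'accuracy' the paper reports (e.g. 90%+ recall), not a
    judgement about whether the retrieved chunks are factually 'true'."""
    gt = set(ground_truth)
    return len(set(retrieved) & gt) / len(gt)

# exact top-3 chunk ids vs. what an approximate index returned
print(recall_at_k([7, 2, 9], [7, 2, 5]))  # 2 of 3 found -> 0.666...
```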
However, LEANN actually mitigates the "fuzziness" of semantic search better than standard compression methods (like PQ) through a Two-Level Search strategy:
The Problem with Compression (Standard PQ)
To save space, many DBs compress the stored vectors (quantization, e.g. PQ). Compression is lossy, so the vectors become "fuzzy": distinct chunks can collapse onto the same compressed code, and distance rankings get distorted.
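To make the fuzziness concrete, here is a deliberately crude scalar quantizer (simpler than real PQ, which quantizes subvectors against learned codebooks; the vectors are made up). Two distinct candidates snap to the same code, so the compressed index can no longer tell them apart:

```python
def quantize(v, levels=4, lo=-1.0, hi=1.0):
    # crude scalar quantization: snap each coordinate to one of `levels` values
    step = (hi - lo) / (levels - 1)
    return [lo + round((x - lo) / step) * step for x in v]

def d2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

q  = [0.10, -0.45, 0.80]   # query
v1 = [0.15, -0.40, 0.70]   # true nearest neighbour
v2 = [0.30, -0.60, 0.95]   # a worse candidate

# Exact distances rank v1 first...
print(d2(q, v1) < d2(q, v2))                          # True
# ...but after quantization v1 and v2 become indistinguishable.
print(quantize(v1) == quantize(v2))                   # True
```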
LEANN's Two-Level Solution
LEANN uses a hybrid approach to ensure the final result is precise: a cheap, approximate representation guides the graph traversal to a shortlist of candidates (level one), and the exact embeddings are then recomputed for that shortlist so the final ranking is exact (level two).
Why this fixes the "Semantic" issue: By re-computing the exact vector at the final step, LEANN filters out the "hallucinations" or errors caused by compression. It ensures that the final chunks returned are mathematically the closest match according to the embedding model.
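A minimal sketch of that two-level pattern (the vectors, the "coarse" rounding stand-in for real compression, and the function names are all illustrative): the approximate level cannot distinguish two close candidates, and the exact recomputation at the end breaks the tie correctly.

```python
import math

# Hypothetical exact embeddings (stand-in for recomputing with the model).
EXACT = {0: (0.10, 0.90), 1: (0.12, 0.88), 2: (0.50, 0.50), 3: (0.90, 0.10)}
# Coarse view used while walking the graph (level 1): cheap but fuzzy.
# Here "compression" is just rounding to one decimal.
COARSE = {i: (round(x, 1), round(y, 1)) for i, (x, y) in EXACT.items()}

def d(a, b):
    return math.dist(a, b)

def two_level_search(q, k=1, shortlist=3):
    # Level 1: rank everything with the fuzzy compressed vectors.
    candidates = sorted(COARSE, key=lambda i: d(q, COARSE[i]))[:shortlist]
    # Level 2: recompute EXACT distances for the shortlist only,
    # filtering out the ranking errors compression introduced.
    return sorted(candidates, key=lambda i: d(q, EXACT[i]))[:k]
```

With query `(0.12, 0.88)`, chunks 0 and 1 have identical coarse codes, so level one alone would return chunk 0 by tie-breaking; the exact rerank correctly returns chunk 1.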
Note on GraphRAG: LEANN is not a Knowledge Graph (GraphRAG). It is a vector index. However, because it achieves the same Recall (90%+) as a full-size HNSW index, it is "accurate" relative to the underlying model (like Contriever). If the embedding model is good, LEANN is good.
3. Storage Reduction Details: How is it 50x smaller?
LEANN attacks the three main sources of bloat in a vector database:
A. Eliminating Vector Storage (The biggest win): the embeddings, normally the dominant cost, are never written to disk at all; they are recomputed at query time.
B. Pruning the Graph Metadata (High-Degree Preserving): the adjacency lists are pruned while preserving the high-degree "hub" nodes that keep the graph navigable, shrinking the per-node edge metadata.
C. Efficient "Soft" Index Building
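Back-of-the-envelope arithmetic shows how dropping the vectors gets you into 50x territory. The corpus size, dimensionality, and graph degrees below are illustrative assumptions, not numbers from the paper:

```python
N, d = 1_000_000, 768      # corpus size, embedding dims (illustrative)
F = 4                      # bytes per float32 value / int32 node id

# Traditional graph index: full-precision vectors + adjacency lists.
vectors     = N * d * F            # A. the dominant cost (~3 GB here)
trad_graph  = N * 32 * F           # assumed avg degree 32
traditional = vectors + trad_graph

# LEANN: no vectors at all (A), pruned graph (B), assumed avg degree 16.
leann = N * 16 * F

print(traditional / leann)         # 50.0 with these illustrative numbers
```

The point of the arithmetic: once the vectors are gone, total size is governed by the graph metadata alone, which is why pruning it (B) is the second lever.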
Summary Table

| | Traditional index (HNSW, etc.) | LEANN |
| --- | --- | --- |
| Embeddings on disk | Yes, full precision | No, recomputed at query time |
| Index size | Baseline | ~50x smaller |
| Query cost | Lower latency | Higher latency (extra compute) |
| Recall | Baseline | Matches the full index (90%+) |
So, is this all correct, or did it miss something? Btw, thank you very much for this great project, devs.