Replies: 1 comment
Hello @MikeyBeez, thanks for starting this discussion! When dealing with AI/LLM integrations, vector DBs, or agent frameworks, quirks like this can usually be traced back to a few specific moving parts.
If you are still blocked, providing a minimal reproducible snippet or logging the raw request/response payload (scrubbed of secrets) usually helps pinpoint the exact failure layer much faster. Hope this points you in the right direction. Let me know if you make any progress!
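To make the "scrubbed of secrets" suggestion concrete, here is a minimal sketch of redacting sensitive keys from a request payload before logging it. The key names and payload shape are assumptions for illustration, not any particular library's API:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)

# Hypothetical list of secret-bearing keys; extend for your provider.
SENSITIVE_KEYS = {"api_key", "authorization", "token"}

def scrub(payload: dict) -> dict:
    """Return a copy of the payload with sensitive values masked."""
    return {
        k: "***REDACTED***" if k.lower() in SENSITIVE_KEYS else v
        for k, v in payload.items()
    }

# Example request payload (field names are made up for the sketch)
request = {"model": "some-model", "api_key": "sk-secret", "prompt": "hello"}
logging.info("raw request: %s", json.dumps(scrub(request)))
```

Logging the scrubbed payload at each layer (client, retriever, LLM call) makes it much easier to see which hop mangles the data.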
-
Sharing a design paper that proposes treating context management as an operating system concern — directly relevant to how LlamaIndex handles retrieval and context assembly.
Core argument: context selection, not context length, is the dominant factor in reasoning quality. The paper proposes a two-agent architecture.
The thread-based retrieval approach is a direct contrast to embedding-based chunk retrieval: instead of finding semantically similar fragments, you retrieve the full reasoning chain within a topic. The paper argues this preserves context that chunk retrieval loses, such as earlier decisions in the same thread that score low on similarity to the current query.
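The contrast above can be sketched in a few lines. This is a toy illustration of the two retrieval styles, not the paper's implementation; the data model, field names, and scores are all assumptions:

```python
from dataclasses import dataclass

@dataclass
class Entry:
    thread_id: str   # topic / reasoning-chain identifier
    text: str
    score: float     # stand-in for embedding similarity to the query

store = [
    Entry("threads/auth", "User asked about token refresh", 0.9),
    Entry("threads/auth", "Decided to rotate refresh tokens", 0.4),
    Entry("threads/ui",   "Button color changed to blue",    0.8),
]

def chunk_retrieval(k: int = 2) -> list[Entry]:
    # Embedding-style: top-k fragments by similarity, regardless of thread.
    return sorted(store, key=lambda e: e.score, reverse=True)[:k]

def thread_retrieval(query_thread: str) -> list[Entry]:
    # Thread-style: return the full reasoning chain for the matched topic.
    return [e for e in store if e.thread_id == query_thread]

print([e.text for e in chunk_retrieval()])
print([e.text for e in thread_retrieval("threads/auth")])
```

Here chunk retrieval mixes fragments from unrelated threads and drops the low-similarity "decided to rotate refresh tokens" entry, while thread retrieval keeps the whole auth reasoning chain intact.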
Paper and PDF: github.com/MikeyBeez/fuzzyOS
DOI: 10.5281/zenodo.18571717
Interested in thoughts from people building retrieval and context assembly systems.