feat(sqlite-vec): enable keyword search for sqlite-vec #1439

Merged
franciscojavierarceo merged 3 commits into ogx-ai:main from varshaprasad96:support/sqlite-fts on May 21, 2025

Conversation

@varshaprasad96
Contributor

What does this PR do?

This PR introduces support for keyword-based FTS5 search with BM25 relevance scoring. It changes the existing EmbeddingIndex base class to accept search_mode and query_str parameters, which keyword-based search implementations can use.
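
A minimal sketch of the kind of keyword search described above, assuming the local SQLite build has the FTS5 extension compiled in. The table name `fts_chunks`, its columns, and the helpers `build_fts_index` and `keyword_search` are illustrative assumptions and are not taken from this PR's code.

```python
import sqlite3


def build_fts_index(conn: sqlite3.Connection, chunks: list[tuple[str, str]]) -> None:
    """Create a hypothetical FTS5 table and load (chunk_id, content) pairs into it."""
    # The id column is stored but not tokenized, so only content is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS fts_chunks USING fts5(id UNINDEXED, content)"
    )
    conn.executemany("INSERT INTO fts_chunks (id, content) VALUES (?, ?)", chunks)
    conn.commit()


def keyword_search(conn: sqlite3.Connection, query_str: str, k: int) -> list[tuple]:
    """Return up to k (id, content, score) rows ranked by BM25."""
    # FTS5's bm25() assigns numerically lower scores to better matches,
    # so an ascending sort puts the most relevant rows first.
    return conn.execute(
        "SELECT id, content, bm25(fts_chunks) AS score "
        "FROM fts_chunks WHERE fts_chunks MATCH ? "
        "ORDER BY score LIMIT ?",
        (query_str, k),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    docs = [
        (f"doc{d}-s{s}", f"Sentence {s} from document {d}")
        for d in range(3)
        for s in range(10)
    ]
    build_fts_index(conn, docs)
    for chunk_id, content, score in keyword_search(conn, "Sentence 5", 3):
        print(chunk_id, content, score)
```

The query string is passed to `MATCH` as-is, so it is interpreted with FTS5 query syntax; input containing special characters may need quoting or escaping before it reaches this function.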

Test Plan

Run:

pytest llama_stack/providers/tests/vector_io/test_sqlite_vec.py -v -s --tb=short --disable-warnings --asyncio-mode=auto

Output:

pytest llama_stack/providers/tests/vector_io/test_sqlite_vec.py -v -s --tb=short --disable-warnings --asyncio-mode=auto
/Users/vnarsing/miniconda3/envs/stack-client/lib/python3.10/site-packages/pytest_asyncio/plugin.py:207: PytestDeprecationWarning: The configuration option "asyncio_default_fixture_loop_scope" is unset.
The event loop scope for asynchronous fixtures will default to the fixture caching scope. Future versions of pytest-asyncio will default the loop scope for asynchronous fixtures to function scope. Set the default fixture loop scope explicitly in order to avoid unexpected behavior in the future. Valid fixture loop scopes are: "function", "class", "module", "package", "session"

  warnings.warn(PytestDeprecationWarning(_DEFAULT_FIXTURE_LOOP_SCOPE_UNSET))
====================================================== test session starts =======================================================
platform darwin -- Python 3.10.16, pytest-8.3.4, pluggy-1.5.0 -- /Users/vnarsing/miniconda3/envs/stack-client/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.10.16', 'Platform': 'macOS-14.7.4-arm64-arm-64bit', 'Packages': {'pytest': '8.3.4', 'pluggy': '1.5.0'}, 'Plugins': {'html': '4.1.1', 'metadata': '3.1.1', 'asyncio': '0.25.3', 'anyio': '4.8.0'}}
rootdir: /Users/vnarsing/go/src/github/meta-llama/llama-stack
configfile: pyproject.toml
plugins: html-4.1.1, metadata-3.1.1, asyncio-0.25.3, anyio-4.8.0
asyncio: mode=auto, asyncio_default_fixture_loop_scope=None
collected 7 items                                                                                                                

llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_add_chunks PASSED
llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_query_chunks_vector PASSED
llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_query_chunks_fts PASSED
llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_chunk_id_conflict PASSED
llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_register_vector_db PASSED
llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_unregister_vector_db PASSED
llama_stack/providers/tests/vector_io/test_sqlite_vec.py::test_generate_chunk_id PASSED

For reference, with this implementation, the FTS table looks like this:

Chunk ID: 9fbc39ce-c729-64a2-260f-c5ec9bb2a33e, Content: Sentence 0 from document 0
Chunk ID: 94062914-3e23-44cf-1e50-9e25821ba882, Content: Sentence 1 from document 0
Chunk ID: e6cfd559-4641-33ba-6ce1-7038226495eb, Content: Sentence 2 from document 0
Chunk ID: 1383af9b-f1f0-f417-4de5-65fe9456cc20, Content: Sentence 3 from document 0
Chunk ID: 2db19b1a-de14-353b-f4e1-085e8463361c, Content: Sentence 4 from document 0
Chunk ID: 9faf986a-f028-7714-068a-1c795e8f2598, Content: Sentence 5 from document 0
Chunk ID: ef593ead-5a4a-392f-7ad8-471a50f033e8, Content: Sentence 6 from document 0
Chunk ID: e161950f-021f-7300-4d05-3166738b94cf, Content: Sentence 7 from document 0
Chunk ID: 90610fc4-67c1-e740-f043-709c5978867a, Content: Sentence 8 from document 0
Chunk ID: 97712879-6fff-98ad-0558-e9f42e6b81d3, Content: Sentence 9 from document 0
Chunk ID: aea70411-51df-61ba-d2f0-cb2b5972c210, Content: Sentence 0 from document 1
Chunk ID: b678a463-7b84-92b8-abb2-27e9a1977e3c, Content: Sentence 1 from document 1
Chunk ID: 27bd63da-909c-1606-a109-75bdb9479882, Content: Sentence 2 from document 1
Chunk ID: a2ad49ad-f9be-5372-e0c7-7b0221d0b53e, Content: Sentence 3 from document 1
Chunk ID: cac53bcd-1965-082a-c0f4-ceee7323fc70, Content: Sentence 4 from document 1

Query results:
Result 1: Sentence 5 from document 0
Result 2: Sentence 5 from document 1
Result 3: Sentence 5 from document 2

@facebook-github-bot

Hi @varshaprasad96!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@varshaprasad96
Contributor Author

cc: @franciscojavierarceo

@varshaprasad96
Contributor Author

I've signed the CLA; it may take some time to be reflected.

Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
@facebook-github-bot added the CLA Signed label Mar 6, 2025
Collaborator

@franciscojavierarceo left a comment

Great work Varsha! Could you also update the docs in this PR as well?

Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread tests/unit/providers/vector_io/test_sqlite_vec.py
Comment thread llama_stack/providers/utils/memory/vector_store.py Outdated

async def query(self, embedding: NDArray, k: int, score_threshold: float) -> QueryChunksResponse:
async def query(
self, embedding: NDArray, k: int, score_threshold: float, query_str: None, search_mode: None
Contributor

can we instead introduce a new type of provider like "keyword_search_io"? Presumably there can be other full text search providers like elasticsearch or postgres ts_query that can be added in addition to sqlite-fts5.

Collaborator

@raghotham maybe we should discuss this in an RFC?

I personally think that vector, keyword, and standard primary key lookup are all retrieval strategies for a database. I worry that making three separate providers may introduce too much abstraction.

That is largely how we've implemented things in @feast-dev (the sqlite example: feast-dev/feast#5082), which I had pointed @varshaprasad96 to prior to doing the implementation. Of course, that may not be how we want to design these things here, so definitely open to your thoughts here. I just wanted to be transparent with how I saw things.

@franciscojavierarceo
Collaborator

@varshaprasad96 we should definitely draft an RFC here. Looking at LangChain, they tend to refer to things slightly differently and I kind of like their semantics.

They refer to text search as lexical search to be more generic (i.e., BM25, TF-IDF) and their Hybrid retrieval is called "Ensemble" retrieval. I think highlighting how other popular frameworks name things is helpful for us choosing something that's explicit, elegant, and intuitive for users.

@varshaprasad96
Contributor Author

@franciscojavierarceo @raghotham Thanks for the inputs!

As Francisco mentioned, the goal of this approach was to provide users with the flexibility to choose between different retrieval strategies—Vector-based, Keyword-based, or Hybrid—within the RAG pipeline. The level of support for these approaches varies across different databases; for example, ChromaDB does not fully support FTS. Given this, we felt it made more sense to integrate these options into the RAG pipeline rather than introduce a separate provider.

> I think highlighting how other popular frameworks name things is helpful for us choosing something that's explicit, elegant, and intuitive for users.

Makes sense, we will get an RFC ready and open it up for review.
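
To illustrate the idea of choosing a retrieval strategy inside the RAG pipeline rather than through a separate provider, here is a rough sketch. `QueryConfigSketch`, its `mode` field, and `retrieve` are hypothetical names invented for this example; they are not the actual RagQueryConfig fields or any function in this repository, and hybrid retrieval is deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Literal

# A search function takes (query, max_chunks) and returns matching chunk texts.
SearchFn = Callable[[str, int], list[str]]


@dataclass
class QueryConfigSketch:
    """Hypothetical query config carrying the retrieval strategy."""

    mode: Literal["vector", "keyword"] = "vector"
    max_chunks: int = 3


def retrieve(
    config: QueryConfigSketch,
    query: str,
    vector_search: SearchFn,
    keyword_search: SearchFn,
) -> list[str]:
    """Dispatch to the search function selected by config.mode."""
    if config.mode == "vector":
        return vector_search(query, config.max_chunks)
    if config.mode == "keyword":
        return keyword_search(query, config.max_chunks)
    # Unknown or not-yet-supported strategies (e.g. hybrid) fail loudly.
    raise ValueError(f"unsupported search mode: {config.mode!r}")
```

With this shape, a database that lacks full-text search simply has no usable keyword function, and the caller sees an explicit error instead of silently falling back to vector search.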

@varshaprasad96
Contributor Author

/hold - this is blocked by #1530. There will also be an RFC opened to finalise the design before merging the PR.

@leseb
Collaborator

leseb commented Apr 1, 2025

@varshaprasad96 any update on this? I see #1530 has merged, do we have an RFC somewhere? Thanks!

@varshaprasad96
Contributor Author

@leseb Yes. I'm at a conference (KubeCon) this week, but this is very much on my radar. The RFC is ready, and I'll open it for review soon with an implementation PR that can be referenced as a PoC.

@varshaprasad96
Contributor Author

#1944 -> RFC for review.

Comment thread llama_stack/providers/utils/memory/vector_store.py Outdated
Comment thread llama_stack/providers/utils/memory/vector_store.py Outdated
@hardikjshah
Contributor

Thanks for working on this and apologies I could not get to this earlier.
I had one main comment on how it might be cleaner to split the query_* methods instead of a bag method for query.

It will allow for an explicit NotImplementedError in the various providers without changing existing methods.
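
The split suggested above could look roughly like the following sketch. The class `EmbeddingIndexSketch`, the method names `query_vector` and `query_keyword`, and their signatures are assumptions made for illustration; they are not copied from the merged code.

```python
import asyncio
from abc import ABC, abstractmethod


class EmbeddingIndexSketch(ABC):
    """Hypothetical base class with one method per retrieval strategy."""

    @abstractmethod
    async def query_vector(self, embedding: list[float], k: int) -> list[str]:
        """Return the ids of the k chunks nearest to the embedding."""

    async def query_keyword(self, query_string: str, k: int) -> list[str]:
        # Default: providers without full-text search fail explicitly,
        # and their existing vector method stays untouched.
        raise NotImplementedError(
            f"{type(self).__name__} does not support keyword search"
        )


class VectorOnlyIndex(EmbeddingIndexSketch):
    """Toy provider that only implements vector search over an in-memory dict."""

    def __init__(self, chunks: dict[str, list[float]]) -> None:
        self.chunks = chunks

    async def query_vector(self, embedding: list[float], k: int) -> list[str]:
        def squared_distance(vector: list[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(vector, embedding))

        ranked = sorted(self.chunks, key=lambda cid: squared_distance(self.chunks[cid]))
        return ranked[:k]


if __name__ == "__main__":
    index = VectorOnlyIndex({"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]})
    print(asyncio.run(index.query_vector([0.9, 0.9], k=2)))
    try:
        asyncio.run(index.query_keyword("sentence", k=2))
    except NotImplementedError as err:
        print(err)
```

Compared with a single query method that takes optional `query_str` and `search_mode` arguments, each provider here only overrides the strategies it supports, and callers get a clear error for the rest.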

@varshaprasad96
Contributor Author

Just a status update - I'm following up on this, and will push the changes by EOD today / early tomorrow.

Comment thread docs/source/providers/vector_io/sqlite-vec.md Outdated
Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread tests/unit/providers/vector_io/test_sqlite_vec.py Outdated
Comment thread tests/unit/providers/vector_io/test_sqlite_vec.py Outdated
Collaborator

@franciscojavierarceo left a comment

can you update the small nits I have?

otherwise lgtm!

Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread llama_stack/providers/remote/vector_io/chroma/chroma.py Outdated
Comment thread llama_stack/providers/remote/vector_io/milvus/milvus.py Outdated
Comment thread llama_stack/providers/remote/vector_io/pgvector/pgvector.py Outdated
Comment thread llama_stack/providers/remote/vector_io/qdrant/qdrant.py Outdated
Comment thread llama_stack/providers/remote/vector_io/weaviate/weaviate.py Outdated
Comment thread llama_stack/providers/utils/memory/vector_store.py Outdated
Collaborator

@franciscojavierarceo left a comment

  1. You'll need to rebase onto my latest changes to RagQueryConfig
  2. Can you add the new parameter in RagQueryConfig to the docstring?
  3. Can you also regenerate the API documentation using uv run --with ".[dev]" ./docs/openapi_generator/run_openapi_generator.sh?

That plus some nits and then I think this lgtm!

@varshaprasad96 force-pushed the support/sqlite-fts branch 2 times, most recently from 0cc6501 to c2dd17a on May 17, 2025 15:16
Collaborator

@franciscojavierarceo left a comment

thank you for this, lgtm!

Comment thread llama_stack/apis/tools/rag_tool.py Outdated
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Contributor

@hardikjshah left a comment

small nits but not blocking, let's land this.

Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Comment thread llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py Outdated
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
@franciscojavierarceo merged commit e92301f into ogx-ai:main on May 21, 2025
24 of 25 checks passed
Labels

CLA Signed

7 participants