Feature/valkey vector store #20889

Closed
daric93 wants to merge 11 commits into run-llama:main from daric93:feature/valkey-vector-store

daric93 commented Mar 5, 2026

Description

Adds a new vector store integration for Valkey, an open-source key-value datastore that supports high-performance vector similarity search through the valkey-search module.

  • Uses valkey-glide (async) and valkey-glide-sync (sync) official clients
  • Mirrors Redis vector store API for seamless migration
  • Three initialization patterns: URL only, sync client only, async client only
  • Lazy async client creation for optimal resource usage

Fixes #20785

Dependencies

Required

  • valkey-glide>=2.2.7 - Official async Valkey client
  • valkey-glide-sync>=2.1.0 - Official sync Valkey client
  • llama-index-core>=0.13.0,<0.15 - LlamaIndex core functionality

Development

Standard LlamaIndex dev dependencies (pytest, mypy, ruff, etc.)

Motivation and Context

Valkey is gaining significant traction as an open-source alternative to Redis. Several factors make this integration timely and valuable:

  • Open Source Governance: As a Linux Foundation project, Valkey offers transparent governance and community-driven development
  • Redis Compatibility: Valkey maintains protocol compatibility for core features with Redis while adding new features, making migration straightforward
  • Performance: The valkey-search module delivers single-digit millisecond latency with 99%+ recall for vector search operations
  • Growing Adoption: Major cloud providers are offering managed Valkey services with vector search capabilities

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

It is a new integration.

Type of Change

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

I added new unit and integration tests to cover this change. Manual testing.

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

dosubot (bot) added the size:XXL label (this PR changes 1000+ lines, ignoring generated files) Mar 5, 2026
daric93 changed the title from "Feature/valkey vector store" to "draft:Feature/valkey vector store" Mar 5, 2026
daric93 changed the title from "draft:Feature/valkey vector store" to "Feature/valkey vector store" Mar 5, 2026
daric93 marked this pull request as draft March 5, 2026 17:52

Author

daric93 commented Mar 5, 2026

As discussed in #20785, this PR uses TEXT field support, which is currently in RC (Release Candidate) in valkey-search and not yet GA.

The implementation is complete and ready for review, but I'm keeping this as a draft until TEXT support reaches GA (estimated March 2026). I'll mark it ready for merge once the official release is available.

In the meantime, feel free to review the implementation and provide any feedback. Happy to address comments while we wait for the GA release.

Member

AstraBert left a comment

This looks great, thanks for doing this! I added some comments related to questions/things I would like to change :)

Member

This is auto-generated from our docs build, no need to modify it

Author

removed

Comment on lines +78 to +98
class TokenEscaper:
    """
    Escape punctuation within an input string. Taken from RedisOM Python.
    """

    # Keep for compatibility. Characters that RediSearch requires us to escape during queries.
    # Source: https://redis.io/docs/stack/search/reference/escaping/#the-rules-of-text-field-tokenization
    DEFAULT_ESCAPED_CHARS = r"[,.<>{}\[\]\\\"\':;!@#$%^&*()\-+=~\/ ]"

    def __init__(self, escape_chars_re: Optional[Pattern] = None):
        if escape_chars_re:
            self.escaped_chars_re = escape_chars_re
        else:
            self.escaped_chars_re = re.compile(self.DEFAULT_ESCAPED_CHARS)

    def escape(self, value: str) -> str:
        def escape_symbol(match: re.Match) -> str:
            value = match.group(0)
            return f"\\{value}"

        return self.escaped_chars_re.sub(escape_symbol, value)

Member

I would probably implement this as a function rather than a class

Author

removed completely. Valkey's filter implementation handles special characters. Added tests to verify.
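
For reference, the function-based form the reviewer suggested might look roughly like the sketch below. It reuses the character class from the quoted `TokenEscaper`; the names `escape_token` and `_ESCAPED_CHARS_RE` are hypothetical and do not come from the PR, which ended up dropping escaping altogether.

```python
import re
from typing import Optional, Pattern

# Same character class as DEFAULT_ESCAPED_CHARS in the quoted snippet.
_ESCAPED_CHARS_RE = re.compile(r"[,.<>{}\[\]\\\"\':;!@#$%^&*()\-+=~\/ ]")


def escape_token(value: str, escape_chars_re: Optional[Pattern] = None) -> str:
    """Prefix every character matched by the pattern with a backslash."""
    pattern = escape_chars_re or _ESCAPED_CHARS_RE
    # A callable replacement is used literally, so no extra escaping is needed.
    return pattern.sub(lambda match: "\\" + match.group(0), value)
```

A module-level function keeps the compiled pattern as shared state and avoids instantiating an object just to call one method.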

Comment on lines +197 to +199
url_parts = valkey_url.replace("valkey://", "").split(":")
host = url_parts[0] if len(url_parts) > 0 else "localhost"
port = int(url_parts[1]) if len(url_parts) > 1 else 6379

Member

I feel like using something like urllib would make the logic easier for parsing the URL here

Author

changed to use urllib
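
A minimal sketch of what URL parsing with the standard library could look like, assuming the same `localhost` and `6379` defaults as the quoted snippet. The helper name `parse_valkey_url` is hypothetical; the code actually merged into the branch is not shown in this conversation.

```python
from typing import Tuple
from urllib.parse import urlparse

DEFAULT_HOST = "localhost"
DEFAULT_PORT = 6379


def parse_valkey_url(valkey_url: str) -> Tuple[str, int]:
    """Return (host, port) from a URL such as valkey://host:port."""
    parsed = urlparse(valkey_url)
    # hostname and port are None when the URL omits them.
    host = parsed.hostname or DEFAULT_HOST
    port = parsed.port or DEFAULT_PORT
    return host, port
```

Unlike splitting on `":"`, `urlparse` also copes with URLs that carry credentials, such as `valkey://user:secret@myhost:6380`, where the naive split would treat `user` as the host.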


# Create sync client immediately (synchronous operation)
if not valkey_client:
    from glide_sync import (

Member

Imports should be placed at the top

Author

moved

Comment on lines +242 to +247
@property
def client(
    self,
) -> SyncGlideClient | SyncGlideClusterClient | GlideClient | GlideClusterClient:
    """Return the valkey client instance."""
    return self._valkey_client or self._valkey_client_async

Member

We generally use two different properties, one for the sync client (client) and one for the asynchronous one (aclient)

Author

created separate properties
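
The `client` / `aclient` split the reviewer describes could be sketched as follows. The class name `ClientHolderSketch` and the `Any` type hints are placeholders chosen for illustration; the real store would use the glide client types from the quoted signature.

```python
from typing import Any, Optional


class ClientHolderSketch:
    """Hypothetical stand-in for the vector store's client handling."""

    def __init__(
        self,
        valkey_client: Optional[Any] = None,
        valkey_client_async: Optional[Any] = None,
    ) -> None:
        self._valkey_client = valkey_client
        self._valkey_client_async = valkey_client_async

    @property
    def client(self) -> Optional[Any]:
        """Return the sync client, or None if it was not provided."""
        return self._valkey_client

    @property
    def aclient(self) -> Optional[Any]:
        """Return the async client, or None if it was not provided."""
        return self._valkey_client_async
```

With two properties, callers never receive an async client where they expected a sync one, which the single `or`-based property could not guarantee.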

Comment on lines +249 to +258
def _ensure_sync_client(self) -> None:
    """
    Ensure sync client is available.

    Raises:
        Exception: If sync client is not available.

    """
    if not self._valkey_client:
        raise ValkeyVectorStoreError("No sync client available")

Member

In the spirit of the lazy initialization comment above, I would initialize the sync client here once (pretty much as you do for the async client)

Author

done
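
One way to read the suggestion: instead of raising when no sync client exists, create it on first use and cache it. The sketch below assumes a hypothetical `client_factory` callable standing in for whatever glide call builds the client; neither that parameter nor the class name comes from the PR.

```python
from typing import Any, Callable, Optional


class LazySyncClientSketch:
    """Hypothetical illustration of create-once, reuse-afterwards."""

    def __init__(
        self,
        client_factory: Callable[[], Any],
        valkey_client: Optional[Any] = None,
    ) -> None:
        self._client_factory = client_factory
        self._valkey_client = valkey_client

    def _ensure_sync_client(self) -> Any:
        """Create the sync client on first call, then return the cached one."""
        if self._valkey_client is None:
            self._valkey_client = self._client_factory()
        return self._valkey_client
```

The factory runs at most once, so a store that is only ever used asynchronously never opens a sync connection.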

VECTOR_FIELD_NAME: str = "vector"


class ValkeyIndexInfo:

Member

This should probably be a dataclass

Author

changed to dataclass
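
The body of `ValkeyIndexInfo` is not shown in this conversation, so the following is only an illustration of the dataclass pattern. Every field name and the class name `IndexInfoSketch` are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class IndexInfoSketch:
    """Hypothetical index description; all fields are placeholders."""

    name: str
    prefix: str
    vector_dims: int
    distance_metric: str = "COSINE"
```

A dataclass generates `__init__`, `__repr__`, and `__eq__` automatically, which removes the boilerplate a hand-written container class would need.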


    GLIDE_AVAILABLE = True
except ImportError:
    GLIDE_AVAILABLE = False

Member

This should always be available, as I am guessing it is part of the valkey-glide and valkey-glide-sync required dependencies in the pyproject.toml(?)
If it is not available, I would consider adding it as a required dependency, or, if you do not want overhead for the package, you could just add it as a dev dependency to be bundled only with tests

Author

fixed

Comment on lines +69 to +70
[tool.uv.sources]
llama-index-core = {path = "../../../llama-index-core", editable = true}

Member

I feel like this might mess up some things when someone tries to install the package, I would remove it

Author

removed

pass


@pytest.mark.integration

Member

I don't see any configuration related to the integration marker here for pytest, and I am not sure our CI would automatically skip them, so I think you might need to add it in the pytest.ini config in the pyproject file

Author

added
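
The author registered the marker in the pyproject file, and that configuration is not shown here. As an alternative illustration only, a marker can also be registered from a `conftest.py` through pytest's `pytest_configure` hook; the marker description text below is an assumption.

```python
# conftest.py (sketch): register the custom "integration" marker so pytest
# does not warn about an unknown mark.


def pytest_configure(config) -> None:
    """Hook called by pytest at startup with its Config object."""
    config.addinivalue_line(
        "markers",
        "integration: tests that need a running Valkey server",
    )
```

Registering a marker does not skip anything on its own. Deselecting the tests still requires something like `pytest -m "not integration"` in the CI command.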

Author

daric93 commented Mar 13, 2026

@AstraBert Thanks for the review! All comments have been addressed — ready for a second look when you get a chance.

daric93 added 11 commits March 18, 2026 17:10
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
daric93 force-pushed the feature/valkey-vector-store branch from cdf9f14 to c6b795b March 19, 2026 00:10
daric93 marked this pull request as ready for review March 19, 2026 00:11

Author

daric93 commented Mar 19, 2026

The valkey-search module with text search support is now GA (v1.2.0).

@logan-markewich
Collaborator

Sorry, right now we are pausing contributions that contribute net-new packages. I appreciate the contribution but going to close this out.

Author

daric93 commented Mar 25, 2026

Thanks for the review and for letting me know. I understand the pause on new packages. If the policy changes in the future, I'd be happy to rebase and resubmit.

Development

Successfully merging this pull request may close these issues.

[Feature Request]: Add Valkey Vector Store support

3 participants