Conversation

@codetheweb
Contributor

@codetheweb codetheweb commented Oct 21, 2025

Description of changes

/auth/identity currently returns duplicate databases, which causes database resolution to fail.

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Migration plan

Are there any migrations, or any forwards/backwards compatibility changes needed in order to make sure this change deploys reliably?

Observability plan

What is the plan to instrument and monitor this change?

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

@codetheweb codetheweb marked this pull request as ready for review October 21, 2025 01:34
@codetheweb codetheweb enabled auto-merge (squash) October 21, 2025 01:35
@propel-code-bot
Contributor

propel-code-bot bot commented Oct 21, 2025

Fix for Duplicate Databases in Rust Client /auth/identity API Handling

This pull request addresses an issue where the /auth/identity endpoint could return duplicate database entries, causing database resolution to fail in the Rust client. The databases field of the GetUserIdentityResponse struct changes from Vec<String> to HashSet<String> in the relevant modules, so uniqueness of database names is enforced at the type level. The code that builds and reads the field is updated to work with a set, and the client now takes a database with into_iter().next() rather than Vec::first(). Because a HashSet is unordered, which database is selected is arbitrary when more than one is present.

Key Changes

• Changed databases field in GetUserIdentityResponse (file: rust/api-types/src/user_identity.rs) from Vec<String> to HashSet<String>.
• Replaced initialization of databases with a HashSet<String> in rust/frontend/src/auth/mod.rs.
• Updated code in rust/chroma/src/client/chroma_http_client.rs to iterate over the HashSet with into_iter().next() instead of relying on Vec::first().
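
The three changes above can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: the struct is a simplified stand-in for GetUserIdentityResponse (the real one may carry derives and fields not shown on this page), and `identity_from_names` and `any_database` are hypothetical helper names introduced only for this example.

```rust
use std::collections::HashSet;

/// Simplified stand-in for `GetUserIdentityResponse`. The real struct lives in
/// rust/api-types/src/user_identity.rs and may differ from this sketch.
#[derive(Debug, Clone)]
pub struct GetUserIdentityResponse {
    pub user_id: String,
    pub tenant: String,
    pub databases: HashSet<String>,
}

/// Hypothetical helper: build a response from a list of names that may
/// contain duplicates.
pub fn identity_from_names(
    user_id: &str,
    tenant: &str,
    names: Vec<String>,
) -> GetUserIdentityResponse {
    GetUserIdentityResponse {
        user_id: user_id.to_string(),
        tenant: tenant.to_string(),
        // Collecting into a HashSet drops repeated names.
        databases: names.into_iter().collect(),
    }
}

/// Hypothetical helper: take one database the way `into_iter().next()` does.
/// With more than one entry, which one comes back is unspecified.
pub fn any_database(identity: GetUserIdentityResponse) -> Option<String> {
    identity.databases.into_iter().next()
}
```

With a single distinct name in the input, `any_database` returns that name regardless of how many times it was repeated. With several distinct names the result depends on the set's iteration order, which callers should not rely on.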

Affected Areas

rust/api-types/src/user_identity.rs
rust/frontend/src/auth/mod.rs
rust/chroma/src/client/chroma_http_client.rs

This summary was automatically generated by @propel-code-bot

pub user_id: String,
pub tenant: String,
-pub databases: Vec<String>,
+pub databases: HashSet<String>,

[BestPractice]

This change from Vec<String> to HashSet<String> is indeed a breaking change for consumers of the api-types crate. While this is a good solution to deduplicate database names, it requires careful handling as a public API change.

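According to Cargo's semantic versioning guidelines, this requires a semver-incompatible release, since it changes the public API in an incompatible way: a major version bump (e.g., 1.0.0 → 2.0.0) for a crate at 1.0 or later, or a minor version bump (e.g., 0.1.0 → 0.2.0) for a 0.x crate. Consider: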

  1. Adding a migration guide in the changelog
  2. Documenting the behavioral differences (no order guarantee, automatic deduplication)
  3. Potentially providing helper methods for common Vec↔HashSet conversions

This follows RFC 1105 API Evolution guidelines for major breaking changes.
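
As a rough sketch of the helper methods suggested in item 3, the conversions below use only the standard library. The function names `to_set`, `to_sorted_vec`, and `first_sorted` are hypothetical and do not exist in the api-types crate. The sorted variants are one possible way to give callers a stable order back after the move to HashSet, assuming lexicographic order is acceptable.

```rust
use std::collections::HashSet;

/// Hypothetical helper: Vec -> HashSet, dropping duplicate names.
pub fn to_set(names: Vec<String>) -> HashSet<String> {
    names.into_iter().collect()
}

/// Hypothetical helper: HashSet -> Vec in lexicographic order, so the
/// result does not depend on the set's iteration order.
pub fn to_sorted_vec(set: &HashSet<String>) -> Vec<String> {
    let mut names: Vec<String> = set.iter().cloned().collect();
    names.sort();
    names
}

/// Hypothetical helper: a deterministic replacement for "the first database",
/// defined here as the lexicographically smallest name.
pub fn first_sorted(set: &HashSet<String>) -> Option<&String> {
    set.iter().min()
}
```

A caller that previously depended on `Vec::first()` returning the same database on every run could use something like `first_sorted` instead of `into_iter().next()`, at the cost of a linear scan over the set.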

File: rust/api-types/src/user_identity.rs
Line: 9

@codetheweb codetheweb force-pushed the feat-fix-database-resolution branch from d28918b to 0f63693 Compare October 21, 2025 17:05
@codetheweb codetheweb merged commit 45c5e06 into main Oct 21, 2025
61 checks passed
2 participants