-
Notifications
You must be signed in to change notification settings - Fork 282
Add explicit column transform so we can convert back #3434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add explicit column transform so we can convert back #3434
Conversation
Implements a new `columnMapper` API that handles both encoding (TS → DB)
and decoding (DB → TS) of column names. This solves the problem of using
camelCase column names in TypeScript while querying snake_case database
columns.
Key features:
- `snakeCamelMapper()` - Built-in snake_case ↔ camelCase converter
- `createColumnMapper(mapping)` - Custom column name mapping
- Automatic encoding of WHERE clauses and ORDER BY clauses
- Backward compatible - existing `transformer` API still works
The new API deprecates `transformer` in favor of bidirectional mapping
that works for both query results and filter parameters.
Example usage:
```typescript
import { snakeCamelMapper } from '@electric-sql/client'
const stream = new ShapeStream({
url: 'http://localhost:3000/v1/shape',
params: { table: 'todos' },
columnMapper: snakeCamelMapper()
})
// Now you can filter with camelCase column names:
// WHERE userId = $1 → automatically encoded to → WHERE user_id = $1
```
Addresses Discord feature request for reverse-transform/encode function
to complement the existing transformer prop for TanStack integration.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3434 +/- ##
==========================================
- Coverage 69.54% 67.86% -1.68%
==========================================
Files 182 184 +2
Lines 9778 9934 +156
Branches 353 404 +51
==========================================
- Hits 6800 6742 -58
- Misses 2976 3190 +214
Partials 2 2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@KyleAMathews what's the status of this PR? iirc If we keep both |
kevin-dp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should deprecate transformer and the current snake to camelcase mapper has some problems with corner cases.
| if (params.where) { | ||
| // Apply column name encoding if columnMapper is provided | ||
| const encodedWhere = this.options.columnMapper | ||
| ? encodeWhereClause(params.where, this.options.columnMapper.encode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type error here because encodeWhereClause expects the whereClause to be a string but the params.where argument is of type string | Record<string, string>.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, instead of writing a ternary here, i would rather modify encodeWhereClause such that it allows the encoder to be undefined.
If it's undefined it returns the where clause without any changes, so we essentially move that ternary logic inside the encodeWhereClause function.
So it becomes:
const encodedWhere = encodeWhereClause(params.where, this.options.columnMapper?.encode)| if (subsetParams.where) { | ||
| // Apply column name encoding if columnMapper is provided | ||
| const encodedWhere = this.options.columnMapper | ||
| ? encodeWhereClause(subsetParams.where, this.options.columnMapper.encode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would remove the ternary if we move the logic inside encodeWhereClause
| if (subsetParams.orderBy) { | ||
| // Apply column name encoding to ORDER BY clause if columnMapper is provided | ||
| const encodedOrderBy = this.options.columnMapper | ||
| ? encodeWhereClause( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here with the ternary
| * Transform a row from database format to application format (decode). | ||
| * Applied to data received from Electric. | ||
| */ | ||
| decode: (row: Row<Extensions>) => Row<Extensions> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function conflates the decoding of the column name with the old transformer function we had.
Perhaps best to keep the old transformer and have it separate from column mapping.
So decode would then be of type (dbColumnName: string) => string.
We could define some aliases to make this even more clear on the types:
type DbColumnName = string
type AppColumnName = string
decode: (dbColName: DbColumnName) => AppColumnName
encode: (appColName: AppColumnName) => DbColumnName| * snakeToCamel('project_id') // 'projectId' | ||
| * snakeToCamel('created_at') // 'createdAt' | ||
| */ | ||
| export function snakeToCamel(str: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a bunch of corner cases that are not handled by this function: leading underscores, trailing underscores, multiple underscores, segments starting with uppercase.
snakeToCamel(`user_id`)
'userId' // ok
snakeToCamel(`_user`)
'User' // wrong
snakeToCamel(`user_`)
'user_' // should we drop trailing underscores?
snakeToCamel(`user_Id`)
'user_Id' // wrong
snakeToCamel(`user__id`)
'user_Id' // wrongI asked chatgpt how these corner cases are usually handled, this is what it replied:
🐍➡️🐫 Snake Case to Camel Case — Recommended Rules
1. Preserve leading underscores
Leading underscores often carry semantic meaning (e.g., privacy indicators).
Always keep them exactly as they are.
_test → _test
__my_var → __myVar
2. Drop trailing underscores
Trailing underscores rarely encode meaning and typically cause ambiguity, so remove them during conversion.
test_ → test
test__ → test
3. Split on underscores and normalize segments
Break the string on underscores.
Ignore empty segments (caused by multiple underscores).
Normalize each segment to lowercase before camelization.
test_column → testColumn
test__column → testColumn
4. Camelize by lowercasing the first segment and capitalizing subsequent ones
After normalization:
- First segment → lowercase
- All other segments →
Capitalize
test_column_name → testColumnName
5. Normalize segments that start with uppercase
Uppercase snake segments should be treated the same as lowercase ones: normalize to lowercase first.
test_Column → testColumn
test_XML_value → testXmlValue
| * camelToSnake('projectId') // 'project_id' | ||
| * camelToSnake('createdAt') // 'created_at' | ||
| */ | ||
| export function camelToSnake(str: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A basic correctness criteria is: camelToSnake(snakeToCamel(str)) = str. We should add that to the tests.
| * Optional reverse mapping for debugging/introspection. | ||
| * Maps application column names to database column names. | ||
| */ | ||
| mapping?: Record<string, string> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mapping should not be in the ColumnMapper type. The ColumnMapper consists of the encode and decode functions.
If we want to create a column mapper from an explicit map of column names, then we can use the createColumnMapper function which will turn that map into an encoder and decoder function.
But that map itself should not be part of this interface because it is not used by the client.ts (which relies solely on the encode and decode functions).
| * | ||
| * @internal | ||
| */ | ||
| export function encodeWhereClause( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may transform literal values if they are not passed as parameters, e.g.:
encodeWhereClause("user_id = 'user_id'", snakeToCamel)
// returns "userId = 'userId'"
// instead of: "userId = 'user_id'"There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another tricky problem is if users have $ signs in their column names and provide a custom encoder to handle it. That would probably not work because we ignore matches that start with $.
| } from '../src/column-mapper' | ||
| import type { Schema } from '../src/types' | ||
|
|
||
| describe(`snakeToCamel`, () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also have a describe block that checks that camelToSnake(snakeToCamel(str)) = str
| }) | ||
|
|
||
| it(`should handle multiple capital letters`, () => { | ||
| expect(camelToSnake(`userProfileImageURL`)).toBe(`user_profile_image_u_r_l`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is clearly wrong, nobody would name their column user_profile_image_u_r_l
|
Don't forget to add a changeset. |
- Remove deprecation tag from transformer - clarify it's still useful for value transformations like encryption, while columnMapper is better for column name transformations - Document execution order when both transformer and columnMapper are used - Fix type error by handling Record<string, string> case for where clause - Update encodeWhereClause to accept undefined encoder for optional chaining - Fix snakeToCamel edge cases: - Preserve leading underscores (_user_id → _userId) - Drop trailing underscores (user_id_ → userId) - Collapse multiple underscores (user__id → userId) - Normalize mixed case to lowercase (user_Column → userColumn) - Fix camelToSnake to handle acronyms properly: - userID → user_id (not user_i_d) - userHTTPSURL → user_https_url - parseHTMLString → parse_html_string - Fix encodeWhereClause to not transform quoted string literals - Now preserves 'user_id' as-is in WHERE clauses - Handles escaped quotes correctly (O''Brien) - Remove mapping from ColumnMapper interface, keep only in createColumnMapper return type - Add roundtrip tests for snake_case ↔ camelCase conversions - Add comprehensive tests for quoted strings and edge cases - Add changeset Co-authored-by: kevin-dp
commit: |
**Bug Fix:** - Fixed bug where columnMapper and transformer couldn't be used together - Now properly chains: columnMapper.decode → transformer - Added test demonstrating the chaining behavior **Documentation Improvements:** - Clarified that ColumnMapper only transforms column NAMES, not values/types - Added prominent warnings about limitations and edge cases: - WHERE clause encoding uses regex (may not handle all complex SQL) - Acronym ambiguity in roundtrips (userID → user_id → userId) - When to use explicit mapping vs automatic - Added examples showing when to fall back to database column names - Clarified parser is for type conversion, columnMapper is for names only This addresses the question: "do we document that the automatic transformers have edge cases where it doesn't work & that you might need to do it manually?" Yes, we now clearly document: 1. WHERE clause encoding limitations 2. Acronym handling quirks 3. When to use explicit mapping 4. How to fall back to database column names in WHERE clauses 5. That columnMapper only handles names, not type conversions
- Remove "Legacy" label from transformer documentation - Emphasize transformer is for VALUE transformations (e.g., decryption, computed fields) - Clarify columnMapper is for COLUMN NAME transformations - Run eslint --fix on column-mapper.ts
✅ Deploy Preview for electric-next ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
- Format client.ts, column-mapper.ts, and test file - Fix line wrapping and parentheses for consistency
- Fix type incompatibility in chaining test - Change row parameter type from Record<string, unknown> to Record<string, string> - Add type assertion for mapper.decode result to match transformer input
The previous tests had expectations that are impossible to meet: - All-uppercase sequences (like HTTPSURL) can't be split without knowing where acronym boundaries are - the function needs lowercase letters as hints - Single-letter segments (a_b_c → aBC → a_bc) lose information in the roundtrip Changes: - Split acronym tests into three cases: 1. Typical acronyms with boundaries (userHTTPSUrl → user_https_url) 2. Additional boundary examples (HTTPSConnection → https_connection) 3. All-uppercase sequences documented as expected behavior (userHTTPSURL → user_httpsurl) - Separate roundtrip test into: 1. Cases that work (user_id, project_id, etc.) 2. Known limitation documented (a_b_c → aBC → a_bc) This matches the actual correct behavior of the camelToSnake function.
Based on external code review feedback, implemented three key improvements:
1. **Preserve trailing underscores in snakeToCamel**
- Makes the function truly invertible for round-trip safety
- `user_id_` → `userId_` → `user_id_` now works correctly
- Added test for trailing underscore round-trip
2. **Add type checks for subset.where and orderBy**
- Defensive programming to ensure we only encode string values
- Consistent with existing type check for params.where
3. **Skip double-quoted identifiers in encodeWhereClause**
- Postgres uses double quotes for case-sensitive identifiers
- `"userId"` and `"User"."createdAt"` are now preserved
- Handles escaped double quotes ("") correctly
- Added comprehensive tests for double-quoted identifiers
These changes improve correctness without adding unnecessary complexity.
Skipped more complex suggestions (dollar-quotes, qualified name handling)
as they're overkill for Electric's simpler WHERE clause use case.
|
@KyleAMathews could you summarize the changes you made in response to my review? Would be useful to know the state of this PR without having to look into all the details of the latest commits. |
|
@kevin-dp claudes asleep — but basically fixed the edge cases you identified (and a few more) & added back-and-forth tests for snake <--> camel case. |
So, do we agree on keeping |
The ColumnMapper interface was incorrectly parameterized with Extensions, even though it only renames column keys and doesn't care about value types. Changes: - Simplified ColumnMapper interface to use Record<string, unknown> - Removed Extensions type parameter from createColumnMapper and snakeCamelMapper - Removed redundant "Only renames columns" comment - the types make this obvious - Added type assertions in client.ts where needed for transformer compatibility This makes the API more accurate - columnMapper works with any value types because it only transforms keys, not values.
Per Kevin's feedback, the ColumnMapper interface should only handle column name transformations (string → string), not row transformations. Changes: - `decode: (dbColumnName: string) => string` - transforms DB column names to app column names - `encode: (appColumnName: string) => string` - transforms app column names to DB column names - Updated createColumnMapper() to return simple string transformers - Updated snakeCamelMapper() to return simple string transformers - Updated client.ts to iterate over row keys and apply decode() manually - Updated all tests to use string → string interface - Removed unnecessary `mapping` property from createColumnMapper return type This is a cleaner, more focused API that doesn't assume how the mapper will be used. The row transformation logic now lives in the client code where it's actually needed.
Based on external code review, fixed two issues:
1. **Documentation inconsistency in snakeToCamel**
- Doc said "Drops trailing underscores" but code preserves them
- Fixed doc to match implementation: "Preserves trailing underscores"
- Added example: `snakeToCamel('user_id_') // 'userId_'`
- This is the correct behavior for round-trip safety
2. **Add NULLS, FIRST, LAST keywords to skip-set**
- Handles ORDER BY clauses like: `createdAt DESC NULLS LAST`
- Also added FIRST and LAST for completeness
- Added comprehensive tests for NULLS FIRST/LAST in ORDER BY
These fixes ensure the column mapper handles standard Postgres ORDER BY
syntax correctly while maintaining accurate documentation.
| * mapper.encode('userId') // 'user_id' | ||
| * ``` | ||
| */ | ||
| export interface ColumnMapper { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@claude i would introduce type aliases here to make more clear which string we expect.
type DbColumnName = string and type AppColumnName = string and use those aliases in encode and decode below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Claude is being lazy... i did it myself
Implements a new
columnMapperAPI that handles both encoding (TS → DB) and decoding (DB → TS) of column names. This solves the problem of using camelCase column names in TypeScript while querying snake_case database columns.Key features:
snakeCamelMapper()- Built-in snake_case ↔ camelCase convertercreateColumnMapper(mapping)- Custom column name mappingtransformerAPI still worksThe new API deprecates
transformerin favor of bidirectional mapping that works for both query results and filter parameters.Example usage:
Addresses Discord feature request for reverse-transform/encode function to complement the existing transformer prop for TanStack integration.