-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Tinybird migration #2924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tinybird migration #2924
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
WalkthroughAdds a parallel Tinybird migration path: a new Tinybird client using TINYBIRD_API_KEY_NEW and duplicated/background ingest calls (via waitUntil or parallel fetch) across many ingestion helpers; introduces many Tinybird datasource/pipe/materialization files and replaces several callers to a higher-level recordLink(...) API for deletions. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Caller
participant Wrapper as recordEvent (wrapper)
participant TBNew as Tinybird tbNew
participant TB as Tinybird tb
Caller->>Wrapper: recordEvent(payload)
activate Wrapper
note right of Wrapper #eef7ff: schedule background migration write via waitUntil
Wrapper-->>TBNew: waitUntil(recordEventTBNew(payload))
Wrapper->>TB: await recordEventTB(payload)
TB-->>Wrapper: primary result
deactivate Wrapper
Wrapper-->>Caller: result
sequenceDiagram
autonumber
actor Service
participant ClickHandler as recordClick
participant TB as Tinybird (API_KEY)
participant TB2 as Tinybird (API_KEY_NEW)
Service->>ClickHandler: post NDJSON clicks
activate ClickHandler
par parallel writes
ClickHandler->>TB: POST /v0/events (NDJSON) -- primary
ClickHandler->>TB2: POST /v0/events (NDJSON) -- parallel (tbNew)
and
TB-->>ClickHandler: 2xx/err
TB2-->>ClickHandler: 2xx/err
end
deactivate ClickHandler
sequenceDiagram
autonumber
actor Client
participant Track as trackLead
participant Wrapper as recordLead (wrapper)
Client->>Track: trackLead(mode, payload)
activate Track
alt mode !== "deferred"
Track->>Wrapper: waitUntil / schedule recordLead
Wrapper-->>Track: scheduled
else deferred
Track-->>Client: skip lead recording
end
deactivate Track
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 47
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
packages/tinybird/pipes/v2_events.pipe (1)
143-159: Quote metadata keys before callingJSONExtractString.
JSONExtractString(metadata, {{ metadataKey }})expands toJSONExtractString(metadata, campaign)for a filter likemetadata.campaign. Because Tinybird templating doesn’t auto-quote strings, ClickHouse treatscampaignas an identifier and the query fails. Please wrap the key (and, as needed, the value) withString(...)so all metadata filters compile.-{% set metadataKey = item.get('operand', '').split('.')[1] %} +{% set metadataKey = item.get('operand', '').split('.', 1)[1] %} ... - AND JSONExtractString(metadata, {{ metadataKey }}) = {{ value }} + AND JSONExtractString(metadata, {{ String(metadataKey) }}) = {{ value }}Apply the same fix to every metadata filter block in this file so multi-node queries don’t break.
Also applies to: 219-236, 306-325
packages/tinybird/pipes/v2_cities.pipe (1)
39-107: Region filter no longer matches returned values.You now emit
regionasCONCAT(country, '-', region), but theAND region = {{ region }}checks (and the corresponding ones in the other nodes) still compare against the rawregioncolumn. Any caller that selects a region from the response will send back the newcountry-regionstring and get zero rows. Please either keep returning the raw region or switch the filters to compare against the same concatenated expression everywhere (and ideally accept both forms during the transition). For example:- {% if defined(region) %} AND region = {{ region }} {% end %} + {% if defined(region) %} + AND ( + region = {{ region }} + OR CONCAT(country, '-', region) = {{ region }} + ) + {% end %}The same fix needs to be applied to
cities_leads,cities_sales, andcities_sales_with_typeto restore correct filtering.packages/tinybird/pipes/v2_top_urls.pipe (1)
39-66: Restore linkIds filtering for clicks.
top_urls_clicksnever applies the newlinkIdsfilter, so any request that specifieslinkIds(the whole point of this change) will return unscoped click data. That breaks parity with the leads/sales nodes and produces incorrect analytics. Please add the samelinkIdshandling (and tighten the PREWHERE guard) here.- {% if not defined(linkId) and (defined(workspaceId) or defined(partnerId) or defined(programId)) %} + {% if not defined(linkId) and not defined(linkIds) and (defined(workspaceId) or defined(partnerId) or defined(programId)) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} WHERE url != '' - {% if defined(linkId) %} AND link_id = {{ String(linkId) }} {% end %} + {% if defined(linkIds) %} AND link_id IN {{ Array(linkIds, 'String') }} + {% elif defined(linkId) %} AND link_id = {{ String(linkId) }} + {% end %} {% if defined(qr) %} AND qr = {{ Boolean(qr) }} {% end %}
♻️ Duplicate comments (1)
packages/tinybird/endpoints/get_framer_lead_events.pipe (1)
10-11: Verify hardcoded default IDs are appropriate for production.Same concern as in
packages/tinybird/pipes/get_framer_lead_events.pipe: the default arrays contain hardcoded IDs that should be verified as test fixtures rather than production data.
🧹 Nitpick comments (20)
packages/tinybird/endpoints/get_import_error_logs.pipe (1)
7-31: Endpoint logic is correct and well-structured.The SQL query properly:
- Parameterizes both
workspaceIdandimportIdwithrequired=True- Orders by
timestamp DESCto return most recent logs first- Applies a reasonable limit of 5000 rows
The query will efficiently use the datasource's sorting key (
timestamp, workspace_id, import_id) for filtering and ordering.Optional refinement: Consider explicit column selection.
Using
SELECT *is acceptable for a logs endpoint where all columns are typically needed, but if only specific columns are required by consumers, an explicit column list would be more maintainable and slightly more efficient:- SELECT * + SELECT timestamp, workspace_id, import_id, source, entity, entity_id, code, message FROM dub_import_error_logspackages/tinybird/pipes/coordinates_all.pipe (1)
74-94: Consider adding a top-level LIMIT.The endpoint combines up to 100 rows from each of the three event types, potentially returning 300 rows total. If the consuming application expects a fixed maximum (e.g., 100 most recent events across all types), add
LIMIT 100after the finalORDER BY.packages/tinybird/pipes/dub_regular_links_metadata_pipe.pipe (2)
9-14: Consider externalizing the workspace exclusion list.The hardcoded workspace IDs reduce maintainability. If this list needs frequent updates, consider:
- Moving to a configuration table/datasource that can be joined
- Using a parameterized pipe if Tinybird supports it
- Documenting the process for adding/removing workspace IDs
10-13: Minor: Standardize comment spacing.The inline comments have inconsistent spacing after
--. Consider using consistent spacing for better readability.Apply this diff:
- 'cm05wnnpo000711ztj05wwdbu', -- buffer - 'ws_1JYEZKSFX7RMB5M9XP4G0CCAT', -- jobber - 'clawqph22020408jng5s9b2tp', -- cal - 'ws_1JQ7175XGRN2ZYE2QAPVA4A26' -- homeserve + 'cm05wnnpo000711ztj05wwdbu', -- buffer + 'ws_1JYEZKSFX7RMB5M9XP4G0CCAT', -- jobber + 'clawqph22020408jng5s9b2tp', -- cal + 'ws_1JQ7175XGRN2ZYE2QAPVA4A26' -- homeservepackages/tinybird/endpoints/get_lead_event_by_id.pipe (2)
11-23: Remove unnecessary ORDER BY clause.Since
event_idis described as "The unique event ID" (line 19), the query should return at most one row. TheORDER BY timestamp DESCclause is unnecessary for a single-row result and adds overhead.Apply this diff to remove the redundant ordering:
SELECT * FROM dub_lead_events WHERE event_id = {{ String( eventId, "nLvob3FdmsYuI1BI", description="The unique event ID.", required=True, ) }} - ORDER BY timestamp DESC
11-12: Consider explicit column selection.Using
SELECT *retrieves all columns fromdub_lead_events. For better performance and clarity, consider explicitly listing the columns needed by consumers of this endpoint.packages/tinybird/pipes/get_sale_event.pipe (1)
5-5: *Consider explicitly listing columns instead of SELECT .Using
SELECT *retrieves all columns fromdub_sale_events_id. While acceptable for simple get-by-id endpoints, explicitly listing required columns improves clarity, reduces payload size if not all fields are needed, and protects against schema changes introducing unwanted columns.If this endpoint only needs specific fields, apply a diff like:
- SELECT * + SELECT event_id, customer_id, amount, currency, created_at FROM dub_sale_events_idpackages/tinybird/endpoints/get_webhook_events.pipe (2)
16-24: Consider pagination and scalability.The hardcoded
LIMIT 100without pagination support may cause issues:
- Users cannot retrieve events beyond the first 100
- No offset or cursor mechanism for next pages
- No way to adjust page size based on use case
Consider adding pagination parameters:
= {{ String( webhookId, 'wh_fjs8YfXkgsFL7eF7LOBvbluDb', description="The ID of the webhook", required=True, ) }} ORDER BY timestamp DESC - limit 100 + LIMIT {{ Int32(limit, 100, description="Number of events to return") }} + {% if defined(offset) %} + OFFSET {{ Int32(offset, 0, description="Number of events to skip") }} + {% end %}
11-24: Optimize sorting key for webhook_id queries
Current sorting key is("timestamp", "webhook_id", "event_id"), soWHERE webhook_id = … ORDER BY timestampstill scans by timestamp first. Change the sorting key to(webhook_id, timestamp)or add a secondary index onwebhook_idto accelerate this query.packages/tinybird/pipes/v2_utms.pipe (1)
324-324: Prefertruefor consistency.Other nodes in this file use
WHERE true(lines 63, 131, 199, 272). Usingtrueinstead of1 = 1maintains consistency.- 1 = 1 + truepackages/tinybird/pipes/get_lead_event_by_id.pipe (1)
8-21: Add LIMIT 1 for single-record unique ID lookup.Since
event_idis described as "The unique event ID," this query should return at most one record. AddingLIMIT 1makes the intent explicit and guards against unexpected duplicates.Additionally,
ORDER BY timestamp DESCis unnecessary overhead for a unique ID lookup, though harmless if the timestamp ordering serves a fallback purpose.Apply this diff to add the limit:
WHERE event_id = {{ String( eventId, "nLvob3FdmsYuI1BI", description="The unique event ID.", required=True, ) }} ORDER BY timestamp DESC + LIMIT 1Optionally, remove the
ORDER BYclause if no fallback logic requires it:WHERE event_id = {{ String( eventId, "nLvob3FdmsYuI1BI", description="The unique event ID.", required=True, ) }} - ORDER BY timestamp DESC + LIMIT 1apps/web/lib/api/conversions/track-lead.ts (1)
235-238: Condition change expands lead recording to all non-deferred modes.The condition change from
if (mode === "async")toif (mode !== "deferred")means that lead recording viarecordLeadnow occurs for all modes except "deferred" (including "wait" mode). This aligns with the removal of synchronous lead recording from the "wait" path.While the logic appears correct for the migration architecture, consider adding error handling for the
recordLeadcall since failures insidewaitUntilwill fail silently and not affect the API response.Consider wrapping the
recordLeadcall with error handling and logging:if (mode !== "deferred") { - await recordLead(createLeadEventPayload(customer.id)); + try { + await recordLead(createLeadEventPayload(customer.id)); + } catch (error) { + console.error("[trackLead] Failed to record lead in Tinybird:", error); + // Optional: Send to error monitoring service + } }This ensures visibility into background ingestion failures while not blocking the API response.
packages/tinybird/endpoints/v2_triggers.pipe (2)
150-206: Consider removing redundant subquery wrapper.The subquery on lines 152-206 doesn't perform any transformation on the aggregated results—it simply selects all columns including the redundant
amount AS saleAmountalias (line 150). Unless this pattern is required for Tinybird pipe syntax or future extensibility, the subquery adds unnecessary complexity.If no transformation is needed, simplify by removing the subquery:
- SELECT trigger, sales, amount, amount AS saleAmount - FROM - ( - SELECT - CASE WHEN qr = 0 THEN 'link' WHEN qr = 1 THEN 'qr' END as trigger, - COUNT(qr) as sales, - sum(amount) as amount - FROM - dub_sale_events_mv - ... - GROUP BY qr - ORDER BY amount DESC - ) as subquery + SELECT + CASE WHEN qr = 0 THEN 'link' WHEN qr = 1 THEN 'qr' END as trigger, + COUNT(qr) as sales, + sum(amount) as amount, + sum(amount) AS saleAmount + FROM + dub_sale_events_mv + ... + GROUP BY qr + ORDER BY amount DESC
58-90: Significant filter logic duplication across nodes.The WHERE clause filtering logic (covering linkIds/linkId, qr, geography, device/browser/os, referer, UTM parameters, url, and timestamps) is duplicated across four nodes:
trigger_clicks,trigger_leads,trigger_sales, andtrigger_sales_with_type. This accounts for ~30-40 lines repeated in each node.If Tinybird supports includes, macros, or parameterized SQL fragments, consider extracting the common filter logic into a reusable component. This would improve maintainability and reduce the risk of inconsistencies (like the DateTime vs DateTime64 issue flagged above).
If Tinybird doesn't support such features for pipe definitions, you may want to document this duplication as a known limitation and ensure changes to filter logic are applied consistently across all four nodes.
Also applies to: 108-140, 164-203, 224-263
packages/tinybird/endpoints/v2_usage.pipe (2)
3-4: Improve endpoint description.The description "Timeseries data" is too generic. Consider making it more specific to help users understand what this endpoint provides.
DESCRIPTION > - Timeseries data + Daily usage metrics (events, links, revenue) aggregated by workspace
165-174: Consider documenting valid resource parameter values.The endpoint routes based on the
resourceparameter ('events','revenue', or defaults to links), but there's no visible parameter declaration or documentation for valid values. Consider adding a parameter declaration with allowed values for clarity.Add a parameter declaration before the endpoint node:
+NODE endpoint +PARAMETERS > + resource String = 'events' ( + description="The resource type: 'events', 'revenue', or 'links'", + allowed=['events', 'revenue', 'links'] + ) + NODE endpoint SQL >packages/tinybird/pipes/coordinates_sales.pipe (1)
22-22: Document the Ashburn city filter.The specific exclusion of "Ashburn" city (likely AWS us-east-1) is unexplained and may confuse future maintainers.
Add a comment explaining why Ashburn is filtered:
AND city != 'Unknown' - AND city != 'Ashburn' + AND city != 'Ashburn' -- Exclude AWS us-east-1 datacenter trafficpackages/tinybird/pipes/get_lead_events.pipe (1)
6-6: Consider using the materialized view for better performance.This pipe queries the raw
dub_lead_eventstable instead ofdub_lead_events_mv. Other similar pipes in this PR (e.g.,get_framer_lead_events.pipe) use the MV for improved query performance.Unless there's a specific reason to query raw data, consider using the MV:
SELECT * - FROM dub_lead_events + FROM dub_lead_events_mvpackages/tinybird/materializations/dub_regular_links_metadata_pipe.pipe (1)
1-19: LGTM! Consider documenting the exclusion criteria.The materialized view correctly filters out specific high-volume workspaces. The SQL logic is sound and the inline comments help identify the excluded workspaces.
Consider documenting why these specific workspaces are excluded (appears to be for performance/scale reasons) and establishing a process for maintaining this list as workspace characteristics change over time.
packages/tinybird/endpoints/all_stats.pipe (1)
8-11: LGTM! Consider documenting the baseline offset.The aggregation query correctly computes overall statistics from three materialized views. The use of
FINALondub_links_metadata_latestensures strongly consistent reads.The magic number
42036155added to the links count at line 10 appears to be a baseline offset for pre-existing links. Consider adding a comment explaining this offset, or extracting it as a named constant if it needs to be referenced elsewhere.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (88)
apps/web/app/(ee)/api/cron/framer/backfill-leads-batch/route.ts(1 hunks)apps/web/lib/api/audit-logs/record-audit-log.ts(3 hunks)apps/web/lib/api/conversions/track-lead.ts(2 hunks)apps/web/lib/tinybird/client.ts(1 hunks)apps/web/lib/tinybird/log-conversion-events.ts(2 hunks)apps/web/lib/tinybird/log-import-error.ts(1 hunks)apps/web/lib/tinybird/record-click-zod.ts(2 hunks)apps/web/lib/tinybird/record-click.ts(1 hunks)apps/web/lib/tinybird/record-lead.ts(1 hunks)apps/web/lib/tinybird/record-link.ts(3 hunks)apps/web/lib/tinybird/record-sale.ts(1 hunks)apps/web/lib/tinybird/record-webhook-event.ts(1 hunks)packages/tinybird/datasources/dub_audit_logs.datasource(1 hunks)packages/tinybird/datasources/dub_click_events.datasource(1 hunks)packages/tinybird/datasources/dub_click_events_id.datasource(1 hunks)packages/tinybird/datasources/dub_conversion_events_log.datasource(1 hunks)packages/tinybird/datasources/dub_import_error_logs.datasource(1 hunks)packages/tinybird/datasources/dub_lead_events.datasource(1 hunks)packages/tinybird/datasources/dub_links_metadata.datasource(1 hunks)packages/tinybird/datasources/dub_sale_events.datasource(1 hunks)packages/tinybird/datasources/dub_sale_events_id.datasource(1 hunks)packages/tinybird/datasources/dub_webhook_events.datasource(1 hunks)packages/tinybird/endpoints/all_stats.pipe(1 hunks)packages/tinybird/endpoints/coordinates_all.pipe(1 hunks)packages/tinybird/endpoints/coordinates_sales.pipe(1 hunks)packages/tinybird/endpoints/get_audit_logs.pipe(1 hunks)packages/tinybird/endpoints/get_click_event.pipe(1 hunks)packages/tinybird/endpoints/get_framer_lead_events.pipe(1 hunks)packages/tinybird/endpoints/get_import_error_logs.pipe(1 hunks)packages/tinybird/endpoints/get_lead_event.pipe(1 hunks)packages/tinybird/endpoints/get_lead_event_by_id.pipe(1 hunks)packages/tinybird/endpoints/get_lead_events.pipe(1 hunks)packages/tinybird/endpoints/get_sale_event.pipe(1 hunks)packages/tinybird/endpoints/get_webhook_events.pipe(1 hunks)packages/tinybird/endpoints/v2_browsers.pipe(1 hunks)packages/tinybird/endpoints/v2_cities.pipe(1 hunks)packages/tinybird/endpoints/v2_continents.pipe(1 hunks)packages/tinybird/endpoints/v2_count.pipe(1 hunks)packages/tinybird/endpoints/v2_countries.pipe(1 hunks)packages/tinybird/endpoints/v2_customer_events.pipe(1 hunks)packages/tinybird/endpoints/v2_devices.pipe(1 hunks)packages/tinybird/endpoints/v2_events.pipe(1 hunks)packages/tinybird/endpoints/v2_os.pipe(1 hunks)packages/tinybird/endpoints/v2_referer_urls.pipe(1 hunks)packages/tinybird/endpoints/v2_referers.pipe(1 hunks)packages/tinybird/endpoints/v2_regions.pipe(1 hunks)packages/tinybird/endpoints/v2_timeseries.pipe(1 hunks)packages/tinybird/endpoints/v2_top_links.pipe(1 hunks)packages/tinybird/endpoints/v2_top_partners.pipe(1 hunks)packages/tinybird/endpoints/v2_top_programs.pipe(1 hunks)packages/tinybird/endpoints/v2_top_tags.pipe(1 hunks)packages/tinybird/endpoints/v2_top_urls.pipe(1 hunks)packages/tinybird/endpoints/v2_triggers.pipe(1 hunks)packages/tinybird/endpoints/v2_usage.pipe(1 hunks)packages/tinybird/endpoints/v2_utms.pipe(1 hunks)packages/tinybird/materializations/dub_click_events_id_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_click_events_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_lead_events_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_links_metadata_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_regular_links_metadata_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_sale_events_id_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_sale_events_pipe.pipe(1 hunks)packages/tinybird/pipes/all_stats.pipe(1 hunks)packages/tinybird/pipes/coordinates_all.pipe(1 hunks)packages/tinybird/pipes/coordinates_sales.pipe(1 hunks)packages/tinybird/pipes/dub_click_events_pipe_with_domain_key.pipe(1 hunks)packages/tinybird/pipes/dub_regular_links_metadata_pipe.pipe(1 hunks)packages/tinybird/pipes/dub_sale_events_id_pipe.pipe(1 hunks)packages/tinybird/pipes/get_framer_lead_events.pipe(1 hunks)packages/tinybird/pipes/get_lead_event_by_id.pipe(1 hunks)packages/tinybird/pipes/get_lead_events.pipe(1 hunks)packages/tinybird/pipes/get_sale_event.pipe(1 hunks)packages/tinybird/pipes/v2_browsers.pipe(4 hunks)packages/tinybird/pipes/v2_cities.pipe(6 hunks)packages/tinybird/pipes/v2_continents.pipe(4 hunks)packages/tinybird/pipes/v2_count.pipe(7 hunks)packages/tinybird/pipes/v2_countries.pipe(4 hunks)packages/tinybird/pipes/v2_devices.pipe(5 hunks)packages/tinybird/pipes/v2_events.pipe(6 hunks)packages/tinybird/pipes/v2_os.pipe(4 hunks)packages/tinybird/pipes/v2_referer_urls.pipe(5 hunks)packages/tinybird/pipes/v2_referers.pipe(4 hunks)packages/tinybird/pipes/v2_regions.pipe(1 hunks)packages/tinybird/pipes/v2_timeseries.pipe(4 hunks)packages/tinybird/pipes/v2_top_links.pipe(5 hunks)packages/tinybird/pipes/v2_top_urls.pipe(4 hunks)packages/tinybird/pipes/v2_triggers.pipe(5 hunks)packages/tinybird/pipes/v2_utms.pipe(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
apps/web/lib/tinybird/record-link.ts (1)
apps/web/lib/tinybird/client.ts (1)
tbNew(9-12)
apps/web/lib/tinybird/log-conversion-events.ts (1)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)
apps/web/lib/tinybird/record-webhook-event.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/webhooks.ts (1)
webhookEventSchemaTB(37-50)
apps/web/lib/tinybird/record-lead.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/leads.ts (1)
leadEventSchemaTB(98-107)
apps/web/lib/api/audit-logs/record-audit-log.ts (2)
apps/web/lib/tinybird/client.ts (1)
tbNew(9-12)apps/web/lib/api/audit-logs/schemas.ts (1)
auditLogSchemaTB(15-29)
apps/web/lib/tinybird/log-import-error.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/import-error-log.ts (1)
importErrorLogSchema(3-23)
apps/web/lib/tinybird/record-sale.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/sales.ts (1)
saleEventSchemaTB(120-137)
apps/web/lib/tinybird/record-click-zod.ts (1)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (76)
packages/tinybird/datasources/dub_import_error_logs.datasource (1)
1-1: LGTM! Token addition aligns with migration pattern.Adding
TOKEN dub_tinybird_token APPENDis consistent with the parallel Tinybird ingestion path described in the PR objectives. TheAPPENDpermission is appropriate for a datasource receiving ingested events.Verify that
dub_tinybird_tokenis properly configured in the Tinybird workspace with the necessary permissions for this datasource.packages/tinybird/endpoints/get_import_error_logs.pipe (1)
1-1: LGTM! Correct READ permission for endpoint.Using
TOKEN dub_tinybird_token READis appropriate for an endpoint that queries the datasource without modifying data.packages/tinybird/pipes/coordinates_all.pipe (3)
17-17: Verify the 5-minute time window for clicks.The clicks query uses a 5-minute window while leads and sales use 12 hours. If clicks are more frequent and only recent events are needed for the map visualization, this is fine. Otherwise, consider whether the time windows should be consistent across event types.
24-43: LGTM!The leads data node follows the same pattern as clicks with an appropriate 12-hour window for less frequent events.
60-62: Clarify the commission calculation logic.The commission is computed using
timestamp % 3to pseudo-randomly select from rates [30%, 40%, 50%]. This appears to be placeholder logic for demo/visualization purposes rather than actual business rules.If this is temporary demo logic, add a TODO comment. If commission rates should come from product/plan data, replace this with the real calculation.
packages/tinybird/pipes/v2_referers.pipe (5)
50-57: LGTM! Filtering logic is correct.The updated filtering logic correctly handles both single and multiple link IDs:
- When
linkIdsis provided, it uses anINclause with array expansion- When only
linkIdis provided, it uses equality comparison- The PREWHERE optimization correctly skips the workspace_links subquery when specific link IDs are already provided
This pattern efficiently reduces the query scope when link IDs are known upfront.
101-108: LGTM! Consistent with referers_clicks.The filtering logic matches the pattern established in
referers_clicks, ensuring consistent behavior across different event types.
138-192: LGTM! Consistent filtering with improved structure.The node now wraps the sales aggregation in a subquery, which improves readability. The filtering logic for
linkIds/linkIdis consistent with other nodes, and all other WHERE conditions are preserved.
205-212: LGTM! Filtering correctly applied in CTE.The filtering logic is properly applied within the
salesCTE, ensuring that the downstream joins and aggregations (distinct_sales, min_timestamps) operate on the correctly filtered dataset.
1-306: Excellent implementation of multi-link filtering.The changes consistently implement support for filtering by multiple link IDs across all relevant query nodes. Key strengths:
- Consistent pattern: All nodes follow the same if/elif structure for
linkIds→linkId→ fallback- Query optimization: PREWHERE clauses correctly avoid the workspace_links subquery when specific link IDs are provided
- Backward compatibility: Existing single
linkIdbehavior is preserved- Clean implementation: The Array() expansion for IN clauses is the appropriate ClickHouse pattern
packages/tinybird/pipes/v2_timeseries.pipe (4)
133-141: LGTM! Dual-mode link filtering implemented correctly.The change correctly introduces array-based link filtering while maintaining backward compatibility with scalar linkId. The precedence logic is clear: linkIds takes priority, fallback to linkId, and PREWHERE optimization applies only when neither is defined.
209-217: LGTM! Consistent implementation across leads node.The dual-mode link filtering is correctly implemented, matching the pattern established in timeseries_clicks_data. Precedence and PREWHERE optimization logic are consistent.
278-286: LGTM! Consistent implementation across sales node.The dual-mode link filtering follows the same pattern as clicks and leads nodes. The implementation is correct and maintains consistency across all timeseries data nodes.
333-340: LGTM! Complete and consistent migration to dual-mode link filtering.The implementation is correct and consistent with the other timeseries nodes. All four data nodes (clicks, leads, sales, sales_with_type) now uniformly support array-based link filtering while maintaining backward compatibility with scalar linkId filtering.
apps/web/lib/api/audit-logs/record-audit-log.ts (2)
2-4: LGTM!The import additions are correct.
tbNewprovides the new Tinybird client for parallel ingestion, andwaitUntilfrom@vercel/functionsenables background task scheduling on Vercel's platform.
82-87: LGTM!The new endpoint definition correctly mirrors the existing
recordAuditLogTBconfiguration, using the same schema and datasource name. Thewait: trueflag ensures ingestion completes before the promise resolves, which is appropriate for background tasks that need completion guarantees.The TODO comment clearly marks this as temporary migration scaffolding.
packages/tinybird/pipes/dub_regular_links_metadata_pipe.pipe (1)
6-14: Externalize workspace filter and verify mappings
- Hardcoded workspace IDs are magic values; consider moving them to named constants or config.
- Confirm each excluded ID (buffer, jobber, cal, homeserve) maps to the intended domain(s) and that no other workspaces require exclusion.
- Standardize inline comment spacing after
--.packages/tinybird/endpoints/get_lead_event_by_id.pipe (1)
25-25: Verify endpoint visibility aligns with intent.The pipe is described as "internal" (line 4), but declaring
TYPE endpointmakes it publicly accessible via the Tinybird API. Confirm whether this endpoint should be publicly accessible or if it should use a more restrictive type for internal-only use.packages/tinybird/endpoints/coordinates_all.pipe (4)
1-98: Well-structured union endpoint with proper NULL padding.The overall structure is sound:
- Each data node is properly scoped with appropriate filters and limits
- The union correctly pads NULL values for amount/commission in clicks and leads data
- Ordering by timestamp DESC provides consistent temporal sorting across event types
- Token permissions and tags are appropriately configured
20-20: Clarify exclusion of “Ashburn” in Tinybird pipelines
The filtercity != 'Ashburn'appears across multiple.pipefiles; please confirm the business or data-quality reason for excluding Ashburn and add an inline comment if this is a temporary workaround.
62-64: Clarify commission calculation intentThe commission is computed via
arrayElement([0.3, 0.4, 0.5], 1 + (toUnixTimestamp(timestamp) % 3)), yielding pseudo-random rates (30%, 40%, or 50%). This is fine for synthetic/demo data but will produce incorrect figures in production. Add a comment if this is intentional or replace with real business logic.Is this meant solely for testing/demo purposes, or should it reflect actual commission rules?
19-19: Align or document asymmetric time windows
- Clicks use 5 minutes (line 19); Leads and Sales use 12 hours (lines 40, 67).
- Confirm if this discrepancy is intentional and document the rationale; otherwise, align all three intervals for consistent temporal context.
packages/tinybird/pipes/v2_top_links.pipe (6)
1-2: LGTM! Clear description update.The description accurately reflects the new linkIds array filtering capability.
50-58: LGTM! Consistent linkIds array filtering.The PREWHERE optimization correctly excludes the workspace_links subquery when linkIds or linkId are provided, and the WHERE clause properly handles both array filtering (
linkIds IN) and single-value filtering (linkId =) with appropriate precedence.
102-110: LGTM! Consistent implementation.The linkIds filtering logic matches the pattern established in top_links_clicks, maintaining consistency across the pipeline.
140-194: LGTM! Consistent filtering with improved structure.The linkIds/linkId filtering follows the established pattern. The subquery wrapper and
saleAmountalias improve consistency with thetop_links_sales_with_typenode and provide clearer column naming.
207-214: LGTM! Correctly positioned filtering in CTE.The linkIds/linkId filtering is properly placed in the initial
salesCTE, ensuring all downstream CTEs (distinct_sales, min_timestamps) operate on the correctly filtered dataset.
1-312: Approve linkIds array handling
TheArray(linkIds, 'String')pattern is used consistently across pipes and endpoints and maintains backward compatibility withlinkId. No changes required.packages/tinybird/datasources/dub_audit_logs.datasource (1)
1-1: Token configuration looks appropriate.The
TOKEN dub_tinybird_token APPENDdeclaration correctly grants write-only permissions for this datasource, which aligns with the migration strategy of establishing a parallel ingestion path.packages/tinybird/datasources/dub_conversion_events_log.datasource (1)
1-1: LGTM - consistent token configuration.Token declaration follows the same pattern as other datasources in this migration.
packages/tinybird/datasources/dub_webhook_events.datasource (1)
1-1: LGTM.Consistent token configuration for the webhook events datasource.
packages/tinybird/endpoints/get_webhook_events.pipe (1)
1-1: Token permission correctly scoped.Using
READpermission for query-only endpoint is appropriate.packages/tinybird/endpoints/get_audit_logs.pipe (2)
1-1: LGTM.Read-only token appropriate for query endpoint.
19-27: Sorting key covers filter and sort fields Thedub_audit_logsdatasource usesENGINE_SORTING_KEY "workspace_id, program_id, timestamp", which matches the query’s filters and sort, so no further index changes are required.packages/tinybird/pipes/v2_utms.pipe (2)
58-67: LGTM: Clean dual-path link filtering.The refactor correctly handles both
linkIds(array) andlinkId(scalar) with proper PREWHERE optimization when neither is defined. TheWHERE truepattern simplifies conditional filter composition.
314-319: Verify scope of first_sale_ts filter. Themin_timestampsCTE queriesdub_sale_events_mvwithout UTM/date/geography filters, sofirst_sale_tsis the absolute first sale. Should this CTE apply the same filters as thesalesCTE to compute the first sale within the filtered dataset?packages/tinybird/endpoints/get_lead_event.pipe (1)
1-5: LGTM!The token configuration and node setup are appropriate for a read-only endpoint.
packages/tinybird/endpoints/coordinates_sales.pipe (4)
1-1: LGTM: Appropriate READ-only token.The READ-only token permission is correctly scoped for a query endpoint.
25-26: LGTM: Appropriate ordering and limit for visualization.Ordering by timestamp DESC with a 500-record limit is well-suited for a real-time sales coordinate map.
8-14: Commission rotation logic mismatchThe code uses
toUnixTimestamp(timestamp) % 3(rotates per second) but the summary claims yearly rotation. For true yearly rotation:- arrayElement([0.3, 0.4, 0.5], 1 + (toUnixTimestamp(timestamp) % 3)) + arrayElement([0.3, 0.4, 0.5], 1 + (toYear(timestamp) % 3))Confirm the intended business requirement and adjust accordingly.
19-24: Verifydub_sale_events_mvschema
I couldn’t locate its definition in the repo—confirm it exposestimestamp,amount,country,city,latitude, andlongitude. Thecity != 'Ashburn'filter excludes AWS data center traffic.packages/tinybird/endpoints/get_sale_event.pipe (3)
1-1: LGTM!The READ-only token scope is appropriate for a GET endpoint that retrieves sale event data.
20-20: LGTM!The TYPE endpoint declaration is correct for a Tinybird endpoint pipe.
3-18: Clarify%usage and enforce event_id uniqueness
- The lone
%prefix is standard Tinybird SQL DSL (seen in other endpoint pipes) and can remain.- The
dub_sale_events_iddatasource definesevent_id(String) and sorts on it but doesn’t enforce uniqueness; confirm it’s unique or addLIMIT 1to the query.packages/tinybird/endpoints/v2_triggers.pipe (1)
283-286: Verify 'new' sale filter logic.Line 284 filters new sales with
timestamp <= first_sale_ts. Iffirst_sale_tsrepresents the timestamp of the first sale, shouldn't new sales be filtered withtimestamp = first_sale_ts(ortimestamp == first_sale_ts) instead of<=? The current logic would include any sales that occurred before or at the first sale timestamp, which seems semantically incorrect.Please verify the intended semantics:
- If "new sale" means "the first sale event", use
timestamp = first_sale_ts- If "new sale" means "sale at or before the first recorded sale" (which seems unusual), the current logic is correct
{% if defined(saleType) %} - {% if saleType == 'new' %} AND timestamp <= first_sale_ts + {% if saleType == 'new' %} AND timestamp = first_sale_ts {% elif saleType == 'recurring' %} AND timestamp > first_sale_ts {% end %} {% end %}packages/tinybird/endpoints/v2_os.pipe (7)
9-34: LGTM!The workspace_links node correctly implements metadata-based link filtering with appropriate conditional logic for mega folders and various filter parameters. The use of FINAL and Array() helpers is appropriate.
37-93: LGTM!The os_clicks node correctly aggregates click events by operating system with appropriate PREWHERE optimizations and comprehensive filtering options. The support for both linkIds (array) and linkId (single value) parameters aligns with the v2 pattern improvements.
146-204: LGTM with timestamp consistency caveat.The os_sales node correctly aggregates sales by operating system. The
amount AS saleAmountalias on line 150 appears to be for API response consistency across different aggregation types.Note: The same DateTime vs DateTime64 inconsistency mentioned in the os_leads review applies here (lines 199-200).
288-301: LGTM!The os_composite node correctly combines click, lead, and sales data with appropriate LEFT JOINs to preserve all operating systems from the clicks dataset. The conditional selection of os_sales vs os_sales_with_type based on the saleType parameter is implemented correctly.
304-316: LGTM!The endpoint node correctly implements the routing logic to return data from the appropriate upstream node based on the eventType parameter. The nested conditional for sales events to use os_sales_with_type when saleType is defined is correct. Defaulting to os_composite provides a comprehensive view when no specific event type is requested.
277-281: No change needed for 'new' sale timestamp comparison. The<= first_sale_tsfilter is intentionally used across all v2 endpoints to capture simultaneous first‐sale events.
139-140: Standardize timestamp precision in os_leads filtersos_leads uses DateTime(start)/DateTime(end) (lines 139-140) while os_clicks uses DateTime64(start)/DateTime64(end); verify the
timestampcolumn type indub_lead_events_mvand align to DateTime64 or document why DateTime is required.packages/tinybird/endpoints/v2_usage.pipe (4)
52-66: LGTM!The click aggregation logic is sound. Using
PREWHEREfor the link_id filter is a good performance optimization for ClickHouse, and the daily grouping with timezone support is correctly implemented.
70-84: LGTM!The lead aggregation follows the same correct pattern as
usage_clicks_datawith appropriate filtering and grouping.
137-151: LGTM!The revenue aggregation follows the established pattern with proper filtering and grouping.
26-48: Verify daily interval boundaries.
Both v2_usage.pipe (line 44) and v2_timeseries.pipe userange(toUInt32(start + 86400), toUInt32(end + 86400), 86400), which excludes the initial start date and includes the end date. Confirm this off-by-one behavior is intentional.packages/tinybird/materializations/dub_sale_events_pipe.pipe (1)
1-37: LGTM!The materialization pipe follows the established pattern for MV pipes in this migration. The SQL query explicitly selects all relevant fields from
dub_sale_eventsand the datasource naming convention (dub_sale_events_mv) is consistent with other materialization pipes in the PR.packages/tinybird/datasources/dub_sale_events.datasource (1)
1-1: LGTM!The token declaration is consistent with the migration pattern applied across all datasources in this PR. The
APPENDpermission is appropriate for ingestion operations.packages/tinybird/datasources/dub_lead_events.datasource (1)
1-1: LGTM!Consistent token declaration matching the migration pattern.
packages/tinybird/datasources/dub_click_events.datasource (1)
1-1: LGTM!Token declaration follows the established migration pattern.
packages/tinybird/materializations/dub_sale_events_id_pipe.pipe (1)
1-12: Clarify the distinction fromdub_sale_events_pipe.pipe.This pipe materializes
dub_sale_eventsintodub_sale_events_idusingSELECT *, whiledub_sale_events_pipe.pipematerializes the same source intodub_sale_events_mvwith explicit field selection. The differing datasource suffixes (_idvs_mv) and query styles suggest different purposes, but this distinction isn't documented.Consider adding a comment explaining:
- Why two materializations of the same source are needed
- The semantic difference between
_idand_mvsuffixes- Whether
SELECT *vs explicit fields is intentionalapps/web/app/(ee)/api/cron/framer/backfill-leads-batch/route.ts (1)
303-314: Verify error handling strategy for parallel writes.The duplicate write to
TINYBIRD_API_KEY_NEWis inside the samePromise.allblock as the original write and other critical operations (Prisma, Redis). If the new Tinybird write fails, the entire batch operation fails, including database writes.For a migration path, consider whether:
- The current approach (fail-fast on any error) is desired to maintain strict consistency
- The new write should use
waitUntilto fail silently and not block the backfill- Error handling should differentiate between old/new write failures
The current approach ensures data consistency across both Tinybird instances, which is appropriate for a backfill operation where you want confirmation of success. If silent migration is preferred, the new write could be moved to a separate
waitUntilblock.apps/web/lib/tinybird/client.ts (1)
8-12: VerifyTINYBIRD_API_KEY_NEWenvironment variable is configured.The new client uses a type assertion without validation. If
TINYBIRD_API_KEY_NEWis undefined, runtime errors will occur on first API call. Ensure this environment variable is:
- Documented in the migration guide/README
- Configured in all deployment environments
- Added to
.env.exampleor environment variable documentationThe pattern is consistent with the existing
tbclient initialization, so this is not introducing a new issue.packages/tinybird/endpoints/get_click_event.pipe (1)
1-22: LGTM, but verify theORDER BY timestamp DESC LIMIT 1pattern.The endpoint correctly filters by
click_idwith proper parameter validation. However, theORDER BY timestamp DESC LIMIT 1suggests handling multiple rows for the sameclick_id, which is unusual ifclick_idis meant to be unique per event.Verify whether:
click_idcan have multiple entries (e.g., from duplicate ingestion during migration)- The intent is to deduplicate by taking the most recent timestamp
- This pattern is consistent with other
get_*_eventendpointsIf
click_idis unique, consider removing theORDER BYandLIMITclauses for better performance.packages/tinybird/materializations/dub_click_events_id_pipe.pipe (1)
34-36: LGTM!The trigger field derivation logic correctly handles the empty-string case by deriving the trigger type from the
qrflag, with a sensible fallback to 'link' for non-QR clicks.packages/tinybird/pipes/dub_click_events_pipe_with_domain_key.pipe (1)
6-36: LGTM!The pipe correctly enriches click events with domain-key metadata using a LEFT JOIN with FINAL, applies sensible null coalescing for string fields, and maintains consistent trigger derivation logic.
packages/tinybird/materializations/dub_lead_events_pipe.pipe (1)
6-33: LGTM!The materialization pipe correctly defines the lead events MV with all relevant fields, maintaining data consistency for downstream queries.
packages/tinybird/pipes/get_lead_events.pipe (1)
8-9: LGTM!The conditional filter pattern using
{% if defined(customerIds) %}with theWHERE trueclause is clean and allows for flexible querying with optional customer filtering.packages/tinybird/pipes/coordinates_sales.pipe (1)
10-12: Clarify synthetic commission logicThe timestamp-based pseudo-random selection of rates is typical for demo, testing, or prototyping data. Confirm this endpoint isn’t used for production analytics requiring real commission values.
packages/tinybird/datasources/dub_links_metadata.datasource (1)
1-1: LGTM! Token declaration aligns with migration pattern.The TOKEN declaration with APPEND permission correctly enables authenticated writes to this datasource, consistent with the broader Tinybird migration strategy across the codebase.
packages/tinybird/endpoints/get_lead_events.pipe (1)
1-14: LGTM! Endpoint correctly implements filtered lead events query.The endpoint is well-structured with:
- Appropriate READ token for query operations
- Correct Tinybird templating syntax for optional customerIds filtering
- Proper type conversion using Array(customerIds, 'String')
- Logical ordering by timestamp descending
packages/tinybird/materializations/dub_click_events_pipe.pipe (2)
20-27: LGTM! Trigger normalization logic is sound.The CASE expression correctly normalizes the trigger field by:
- Defaulting to 'qr' when trigger is empty and qr flag is true
- Defaulting to 'link' when trigger is empty and qr flag is false
- Preserving existing non-empty trigger values
This ensures consistent trigger values in the materialized view.
30-30: LGTM! Null handling for identity_hash.Using
coalesce(identity_hash, '')ensures the field is never null, which simplifies downstream queries and prevents null-related issues.apps/web/lib/tinybird/record-link.ts (2)
53-58: LGTM! New Tinybird endpoint for migration.The
recordLinkTBNewendpoint is correctly configured with the new Tinybird client (tbNew) using the same schema and wait settings as the original endpoint.
81-89: LGTM! Migration pattern correctly implemented.The function properly implements the dual-ingestion pattern:
waitUntilschedules background submission to the new endpoint without blocking- Original endpoint (
recordLinkTB) continues to execute synchronously- Both array and single payload code paths are updated consistently
This ensures a smooth migration path while maintaining existing behavior.
apps/web/lib/tinybird/record-click-zod.ts (1)
37-53: LGTM! Migration pattern correctly implemented.The changes properly implement the dual-ingestion strategy:
recordClickZodTBandrecordClickZodTBNeware both exported for flexibilityrecordClickZodwrapper schedules background ingestion to new endpoint viawaitUntil- Original endpoint continues synchronous execution
- Schema remains unchanged across both endpoints
This maintains backward compatibility while enabling the migration path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 26
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
packages/tinybird/pipes/v2_events.pipe (1)
140-159: Quote metadata keys/values before injecting into SQL.
JSONExtractString(metadata, {{ metadataKey }})emits the raw key (e.g.firstName), which ClickHouse treats as an identifier and crashes with “Unknown identifier.” Wrap both operands inString(...)(e.g.JSONExtractString(metadata, {{ String(metadataKey) }}) = {{ String(value) }}) so they render as literals. Without that change, any metadata filter immediately breaks the query.packages/tinybird/pipes/v2_devices.pipe (2)
1-4: Fix the devices pipe description.Line 2 continues to read “Top countries,” but this pipe reports device stats. Please update the DESCRIPTION accordingly (e.g., “Top devices”).
209-283: Apply the ORDER BY fix to the sale-type variant too.
devices_sales_with_typehas the identical structure and still omits an outerORDER BY. AddORDER BY amount DESCafter the subquery to keep output stable.ORDER BY amount DESC - ) subquery + ) subquery + ORDER BY amount DESC
♻️ Duplicate comments (38)
packages/tinybird/pipes/v2_triggers.pipe (1)
50-53: Still missing the tenantId guard — leaks cross‑tenant dataTenant-only requests still skip the
workspace_linksPREWHERE, sotrigger_*reads return every tenant’s links. Please adddefined(tenantId)to the guard (here and in the other nodes) so tenant-scoped calls get filtered.-{% elif not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) -) %} PREWHERE link_id in (SELECT link_id from workspace_links) +{% elif not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) +) %} PREWHERE link_id in (SELECT link_id from workspace_links)Also applies to: 101-103, 157-159, 217-219
packages/tinybird/endpoints/v2_top_tags.pipe (1)
253-256: Fix PREWHERE source to reference an existing nodeThe
PREWHEREsubquery selects fromworkspace_links, but that node doesn't exist in this pipe, so compilation will fail. Reuseworkspace_links_with_tags, which already materializes the relevant link_ids.- {% if not defined(linkId) and (defined(workspaceId) or defined(partnerId) or defined(programId)) %} - PREWHERE link_id in (SELECT link_id from workspace_links) + {% if not defined(linkId) and (defined(workspaceId) or defined(partnerId) or defined(programId)) %} + PREWHERE link_id IN ( + SELECT link_id + FROM workspace_links_with_tags + )packages/tinybird/pipes/v2_count.pipe (1)
127-145: Quote metadata operands before injecting into JSONExtractString.
metadataKeyandvaluerender without quotes, so the SQL becomesJSONExtractString(metadata, plan) = premium. ClickHouse interprets both tokens as identifiers, causing every query using the new metadata filters to fail. Wrap both operands withString(...)(or otherwise emit quoted literals) in all three metadata filter blocks.- AND JSONExtractString(metadata, {{ metadataKey }}) = {{ value }} + AND JSONExtractString(metadata, {{ String(metadataKey) }}) = {{ String(value) }}Apply the same fix to the other comparison branches (notEquals, greaterThan, etc.) and to the duplicate filter blocks below.
Also applies to: 208-226, 289-307
packages/tinybird/pipes/v2_referer_urls.pipe (1)
42-44: Restore tenant-scoped PREWHERE guardLine 42 still omits the
defined(tenantId)check, so tenant-only queries bypass theworkspace_linksfilter and leak other tenants’ data. Please reintroduce the tenant guard (and mirror it for the leads / sales / sales_with_type nodes).- {% if not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} + {% if not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}Also applies to: 93-95, 151-153, 217-218
packages/tinybird/pipes/v2_os.pipe (1)
50-52: Restore PREWHERE guards for metadata-only filtersBecause these PREWHERE blocks only trigger when
workspaceId,partnerId, orprogramIdare present, requests that usetenantId,folderId(s),domain,tagIds, orrootwithout alinkId(s)skip theworkspace_linksscope entirely and leak cross-workspace data. Please expand the conditional in every PREWHERE (clicks, leads, sales, sales_with_type) to include all parameters supported byworkspace_links.- {% elif not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) + {% elif not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or + defined(tenantId) or defined(folderId) or defined(folderIds) or + defined(domain) or defined(tagIds) or defined(root) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links)Repeat the same expansion in the corresponding blocks for
os_leads,os_sales, andos_sales_with_typeso every metadata filter path keeps the query scoped.Also applies to: 101-103, 155-156, 215-216
packages/tinybird/endpoints/v2_usage.pipe (5)
7-22: AddisMegaFolderparameter for consistency with other v2 endpoints.This node still lacks the
isMegaFolderconditional that other v2 pipes use to choose betweendub_links_metadata_latest(when true) anddub_regular_links_metadata_latest(when false). This inconsistency makes the endpoint behave differently from similar v2 endpoints.
88-95: Fix NULL handling in arithmetic operations.Line 92 still uses
(clicks + leads)without NULL coalescing. When eitherclicksorleadsisNULL(no events on that day), the entire expression evaluates toNULLinstead of the expected sum.Apply this diff to fix the NULL handling:
- SELECT - formatDateTime(di.interval, '%FT%T.000%z') as date, clicks, leads, (clicks + leads) as value + SELECT + formatDateTime(di.interval, '%FT%T.000%z') as date, + coalesce(clicks, 0) as clicks, + coalesce(leads, 0) as leads, + coalesce(clicks, 0) + coalesce(leads, 0) as value FROM day_intervals as di
99-123: Resolve table inconsistency between nodes.This node queries
dub_links_metadata_latest(all link types) whileworkspace_linksqueriesdub_regular_links_metadata_latest(regular links only). This means:
- Link count metrics include all link types
- Click/lead/revenue metrics are filtered to regular links only
This inconsistency can produce confusing analytics where link counts don't match the links generating events.
127-133: Add NULL handling for missing link counts.Line 131 returns
NULLfor days with no links created instead of0.Apply this diff to fix NULL handling:
- SELECT formatDateTime(interval, '%FT%T.000%z') as date, links as value + SELECT formatDateTime(interval, '%FT%T.000%z') as date, coalesce(links, 0) as value FROM day_intervals
155-161: Add NULL handling for missing revenue.Line 159 returns
NULLfor days with no sales instead of0.Apply this diff to fix NULL handling:
- SELECT formatDateTime(interval, '%FT%T.000%z') as date, revenue as value + SELECT formatDateTime(interval, '%FT%T.000%z') as date, coalesce(revenue, 0) as value FROM day_intervalspackages/tinybird/endpoints/v2_utms.pipe (1)
316-321: Fix invalidINreference to the CTE.The syntax
WHERE (customer_id, link_id) IN distinct_salesis invalid in ClickHouse/Tinybird. TheINoperator requires a subquery, not a direct CTE reference. This will fail at runtime when the pipe is deployed.Apply this diff to wrap the CTE reference in a subquery:
min_timestamps AS ( SELECT customer_id, link_id, min(timestamp) AS first_sale_ts FROM dub_sale_events_mv - WHERE (customer_id, link_id) IN distinct_sales + WHERE (customer_id, link_id) IN ( + SELECT customer_id, link_id + FROM distinct_sales + ) GROUP BY customer_id, link_id )packages/tinybird/endpoints/v2_referer_urls.pipe (4)
95-97: Critical: Tenant-scoped queries bypass workspace_links filter.Identical issue as
referer_urls_clicks: the PREWHERE guard omitsdefined(tenantId), allowing tenant-scoped queries to skip theworkspace_linksfilter and return data across all tenants.Apply this diff to include
tenantIdin the guard condition:- {% if not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} + {% if not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}
44-46: Critical: Tenant-scoped queries bypass workspace_links filter.The PREWHERE guard condition omits
defined(tenantId), so queries scoped only bytenantId(withoutworkspaceId/partnerId/programId) skip theworkspace_linksfilter entirely. This allows tenant-scoped queries to return data across all tenants—a critical access control violation.Apply this diff to include
tenantIdin the guard condition:- {% if not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} + {% if not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}
218-220: Critical: Tenant-scoped queries bypass workspace_links filter.Identical issue as previous nodes: the PREWHERE guard omits
defined(tenantId), allowing tenant-scoped queries to skip theworkspace_linksfilter and return data across all tenants.Apply this diff to include
tenantIdin the guard condition:- {% if not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} + {% if not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}
152-154: Critical: Tenant-scoped queries bypass workspace_links filter.Identical issue as previous nodes: the PREWHERE guard omits
defined(tenantId), allowing tenant-scoped queries to skip theworkspace_linksfilter and return data across all tenants.Apply this diff to include
tenantIdin the guard condition:- {% if not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} + {% if not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}packages/tinybird/pipes/get_framer_lead_events.pipe (1)
4-9: Remove hard-coded link/customer defaults to avoid unintended data exposure.Leaving real link and customer IDs as default arrays means anyone hitting this endpoint without parameters still receives production data for those identifiers. Please drop the inline defaults and require callers to supply explicit filters (or substitute safe placeholders).
packages/tinybird/pipes/v2_continents.pipe (1)
50-58: Still missing tenant-scoped PREWHERE guardWhen a request supplies only
tenantId, we still skip thelink_id IN (SELECT … FROM workspace_links)filter, so the query scans every tenant’s data. That regresses isolation and reintroduces the cross-tenant leak we called out earlier. Please adddefined(tenantId)(and any other tenant-level identifier you expect) to this guard in every node (continents_clicks,continents_leads,continents_sales,continents_sales_with_type). Example fix:-{% elif not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) -) %} PREWHERE link_id in (SELECT link_id from workspace_links) +{% elif not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) + or defined(partnerId) + or defined(programId) + or defined(tenantId) +) %} PREWHERE link_id in (SELECT link_id from workspace_links)Be sure to mirror the same guard in the other nodes so behaviour stays consistent.
Also applies to: 101-109, 148-199, 214-259
packages/tinybird/endpoints/v2_os.pipe (1)
3-4: Update description to match OS endpoint.Line 4 still says “Top countries,” but this pipe surfaces operating-system metrics. Please align the DESCRIPTION with the actual content (e.g., “Top operating systems”).
packages/tinybird/pipes/v2_devices.pipe (1)
148-204: Bring back the outer ORDER BY for device sales.Same as the earlier review: once the aggregation moved into a subquery we lost deterministic ordering. Append
ORDER BY amount DESCafter the outer SELECT so TTL consumers keep receiving sales sorted by amount.ORDER BY amount DESC - ) subquery + ) subquery + ORDER BY amount DESCapps/web/lib/tinybird/record-click.ts (1)
180-190: Update operations array for accurate error logging.The duplicate Tinybird ingestion for migration is correctly implemented. However, the
operationsarray (lines 215-220) still has only 4 entries whilePromise.allSettlednow includes 5 promises. When the new tbNew ingestion fails, the error index will be misattributed in logs.Add an entry for the new ingestion:
const operations = [ "Tinybird click event ingestion", + "Tinybird NEW click event ingestion", "recordClickCache set", "Link clicks increment", "Workspace usage increment", ];Also applies to: 215-220
packages/tinybird/pipes/get_sale_event.pipe (1)
10-16: Resolve parameter contradiction.The
eventIdparameter has bothrequired=True(line 14) and a default value"O9Q8fKbfY809HXBp"(line 12), which is contradictory. A required parameter should not have a default.Choose one approach:
- Remove the default to make it truly required:
= {{ String( eventId, - "O9Q8fKbfY809HXBp", description="The unique event ID.", required=True, ) }}
- Remove required=True to allow the default:
= {{ String( eventId, "O9Q8fKbfY809HXBp", description="The unique event ID.", - required=True, ) }}packages/tinybird/endpoints/get_lead_event_by_id.pipe (1)
3-4: Fix inconsistent description.The DESCRIPTION states "internal pipe for updating lead event" but the SQL query performs a SELECT operation (reading data), not an update. The pipe name
get_lead_event_by_idalso confirms this is a read operation.Apply this diff to correct the description:
DESCRIPTION > - internal pipe for updating lead event + internal pipe for retrieving a lead event by IDapps/web/lib/api/conversions/track-lead.ts (1)
7-7: Breaking change: synchronous lead recording removed formode === "wait".The removal of
recordLeadSyncand the change tomode !== "deferred"means thatmode === "wait"no longer waits for Tinybird ingestion to complete before returning. Lead recording now happens asynchronously viawaitUntilfor all non-deferred modes.This breaks the contract for callers expecting
mode: "wait"to confirm Tinybird recording before the response is sent. Callers now only get confirmation of Redis caching.Consider:
- Documenting this breaking change
- Restoring synchronous behavior for
mode === "wait"- Or introducing a new
mode === "sync"if synchronous recording is still neededAlso applies to: 235-237
packages/tinybird/endpoints/get_lead_event.pipe (1)
6-21: Past review concern remains unaddressed.The missing
LIMITclause issue flagged in the previous review has not been resolved. The query still lacks bounds and can return an unbounded result set when multiple lead events exist for a customer, despite the singular endpoint name suggesting a single record.packages/tinybird/endpoints/get_audit_logs.pipe (1)
7-28: Past review concern remains unaddressed.The missing
LIMITclause issue flagged in the previous review has not been resolved. Without a limit, this endpoint can return an unbounded number of audit log records, causing performance issues and excessive data transfer as the table grows.apps/web/lib/tinybird/record-webhook-event.ts (1)
16-19: Critical issue from previous review remains unresolved.The
waitUntilcall still lacks error handling. IfrecordWebhookEventTBNewrejects, the exception will propagate and fail the request even though the primarytbingest succeeds—exactly the reliability regression flagged previously.packages/tinybird/pipes/v2_countries.pipe (1)
50-58: Reapply the tenant/metadata PREWHERE guard across nodesAs previously flagged, the PREWHERE short-circuit only fires for
workspaceId|partnerId|programId. Tenant-only requests (and folder/domain/tag/root scopes) still bypass theworkspace_linksfilter, leaking cross-tenant data from the event tables that lack those columns.Please update the guard in
countries_clicks,countries_leads,countries_sales, andcountries_sales_with_typeto includetenantIdand the other metadata filters you pipe through (folderIds/folderId, domain, tagIds, root) so every metadata-only scope reuses the metadata-subquery.- {% elif not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} + {% elif not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) + or defined(partnerId) + or defined(programId) + or defined(tenantId) + or defined(folderIds) + or defined(folderId) + or defined(domain) + or defined(tagIds) + or defined(root) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}Same change needs to be applied to every node listed above to keep tenant data isolated.
Also applies to: 90-98, 142-150, 202-210
packages/tinybird/pipes/v2_browsers.pipe (1)
45-53: Reinstate full metadata PREWHERE guardLimiting the PREWHERE to workspaceId/partnerId/programId means tenant/folder/tag/domain/root-only filters no longer constrain results, exposing unrelated data. Add those flags to the condition, consistent with
workspace_links.- {% elif not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) + {% elif not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) or defined(tenantId) + or defined(folderId) or defined(folderIds) or defined(tagIds) or defined(domain) or defined(root) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links)Replicate across clicks/leads/sales/sales_with_type nodes.
Also applies to: 101-103, 154-156, 214-216
apps/web/lib/tinybird/log-import-error.ts (1)
16-18: GuardwaitUntilagainst background failures
waitUntil(logImportErrorTBNew(payload))will reject the invocation whenever the tbNew write fails, breaking the legacy ingest path we’re trying to keep authoritative. Wrap the promise with a.catchso failures are logged but don’t bubble up.export const logImportError = async (payload: any) => { - waitUntil(logImportErrorTBNew(payload)); + waitUntil( + logImportErrorTBNew(payload).catch((error) => { + console.error("tbNew import-error ingest failed", error); + }), + ); return await logImportErrorTB(payload); };packages/tinybird/endpoints/v2_regions.pipe (1)
52-54: QuotelinkIdparameters
AND link_id = {{ linkId }}renders without quotes, so ClickHouse interprets the value as an identifier and errors (or hits the wrong column). Wrap the parameter withString()everywhere it appears.- {% elif defined(linkId) %} AND link_id = {{ linkId }} + {% elif defined(linkId) %} AND link_id = {{ String(linkId) }}Repeat this fix for every node in the file.
Also applies to: 136-138, 220-222, 304-306
packages/tinybird/endpoints/v2_cities.pipe (1)
52-55: Restore tenant guard in every PREWHERE branch.When only
tenantIdis provided, the PREWHERE never runs, so the query skips theworkspace_linksrestriction and leaks cross-tenant data. Please adddefined(tenantId)to the guard here and in the other nodes that use the same predicate (cities_leads,cities_sales,cities_sales_with_type).packages/tinybird/endpoints/v2_top_programs.pipe (1)
158-164: Still excluding programs without clicks
top_programs_compositeis keyed offtop_programs_clicks, so any program that only has leads or sales is still dropped. Please union the program IDs from all metric nodes and coalesce null aggregates before ordering so every qualifying program shows up.NODE top_programs_composite SQL > - SELECT c.programId as programId, clicks, leads, sales, saleAmount - FROM top_programs_clicks AS c - LEFT JOIN top_programs_leads AS l ON c.programId = l.programId - LEFT JOIN top_programs_sales AS s ON c.programId = s.programId - ORDER BY saleAmount DESC + WITH program_ids AS ( + SELECT programId FROM top_programs_clicks + UNION DISTINCT + SELECT programId FROM top_programs_leads + UNION DISTINCT + SELECT programId FROM top_programs_sales + ) + SELECT + p.programId, + coalesce(c.clicks, 0) AS clicks, + coalesce(l.leads, 0) AS leads, + coalesce(s.sales, 0) AS sales, + coalesce(s.saleAmount, 0) AS saleAmount + FROM program_ids AS p + LEFT JOIN top_programs_clicks AS c ON p.programId = c.programId + LEFT JOIN top_programs_leads AS l ON p.programId = l.programId + LEFT JOIN top_programs_sales AS s ON p.programId = s.programId + ORDER BY saleAmount DESCpackages/tinybird/pipes/v2_regions.pipe (1)
39-40: Drop the trailing commas to keep the SQL validEach of these SELECT clauses still ends with a trailing comma, so the pipe won’t compile. Remove the commas at the end of the projection list across the affected nodes.
- SELECT CONCAT(country, '-', region) as region, country, clicks, + SELECT CONCAT(country, '-', region) as region, country, clicks … - SELECT CONCAT(country, '-', region) as region, country, leads, + SELECT CONCAT(country, '-', region) as region, country, leads … - SELECT CONCAT(country, '-', region) as region, country, sales, amount, amount AS saleAmount + SELECT CONCAT(country, '-', region) as region, country, sales, amount, amount AS saleAmount … - SELECT CONCAT(country, '-', region) as region, country, sales, amount, amount AS saleAmount + SELECT CONCAT(country, '-', region) as region, country, sales, amount, amount AS saleAmountAlso applies to: 123-124, 207-208, 371-372
packages/tinybird/endpoints/v2_continents.pipe (1)
46-54: Critical: Cross-workspace data leak when customerId is providedWhen
customerIdis set without explicitlinkId(s), the PREWHERE switches to the lead-backedclick_idlookup but drops theworkspace_linksfilter. This returns all clicks for that customer across every workspace/program/partner, violating workspace isolation.Apply the fix from the previous review:
- {% if defined(customerId) %} - PREWHERE click_id IN ( - SELECT DISTINCT click_id - FROM dub_lead_events_mv - WHERE customer_id = {{ String(customerId) }} - ) - {% elif not defined(linkId) and not defined(linkIds) and ( - defined(workspaceId) or defined(partnerId) or defined(programId) - ) %} PREWHERE link_id in (SELECT link_id from workspace_links) + {% if defined(customerId) %} + PREWHERE + click_id IN ( + SELECT DISTINCT click_id + FROM dub_lead_events_mv + WHERE customer_id = {{ String(customerId) }} + ) + {% if not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) + ) %} AND link_id in (SELECT link_id from workspace_links) {% end %} + {% elif not defined(linkId) and not defined(linkIds) and ( + defined(workspaceId) or defined(partnerId) or defined(programId) + ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %}packages/tinybird/endpoints/v2_triggers.pipe (4)
3-4: Past issue remains unaddressed: Incorrect pipe description.The DESCRIPTION still states "Top countries" but this pipe aggregates trigger metrics (clicks, leads, sales), not geographic data.
Update the description:
DESCRIPTION > - Top countries + Trigger metrics aggregation for clicks, leads, and sales
139-140: Past issue remains unaddressed: Inconsistent timestamp casting.Lines 139-140 still use
DateTime()whiletrigger_clicks(lines 89-90) usesDateTime64(), causing type inconsistency.Standardize on
DateTime64():- {% if defined(start) %} AND timestamp >= {{ DateTime(start) }} {% end %} - {% if defined(end) %} AND timestamp <= {{ DateTime(end) }} {% end %} + {% if defined(start) %} AND timestamp >= {{ DateTime64(start) }} {% end %} + {% if defined(end) %} AND timestamp <= {{ DateTime64(end) }} {% end %}
202-203: Past issue remains unaddressed: Inconsistent timestamp casting.Lines 202-203 still use
DateTime()instead ofDateTime64(), inconsistent withtrigger_clicks.Apply the same fix:
- {% if defined(start) %} AND timestamp >= {{ DateTime(start) }} {% end %} - {% if defined(end) %} AND timestamp <= {{ DateTime(end) }} {% end %} + {% if defined(start) %} AND timestamp >= {{ DateTime64(start) }} {% end %} + {% if defined(end) %} AND timestamp <= {{ DateTime64(end) }} {% end %}
262-263: Past issue remains unaddressed: Inconsistent timestamp casting.Lines 262-263 still use
DateTime()instead ofDateTime64().Standardize the timestamp casting:
- {% if defined(start) %} AND timestamp >= {{ DateTime(start) }} {% end %} - {% if defined(end) %} AND timestamp <= {{ DateTime(end) }} {% end %} + {% if defined(start) %} AND timestamp >= {{ DateTime64(start) }} {% end %} + {% if defined(end) %} AND timestamp <= {{ DateTime64(end) }} {% end %}
🧹 Nitpick comments (12)
packages/tinybird/pipes/get_lead_event_by_id.pipe (1)
9-21: Consider addingLIMIT 1for clarity and performance.Since
event_idis described as "The unique event ID," the query should return at most one row. AddingLIMIT 1clarifies this intent and may improve performance. TheORDER BY timestamp DESCclause is also unnecessary for a unique identifier lookup unless there's a specific reason to expect multiple rows.Apply this diff to add the LIMIT clause:
SELECT * FROM dub_lead_events WHERE event_id = {{ String( eventId, "nLvob3FdmsYuI1BI", description="The unique event ID.", required=True, ) }} - ORDER BY timestamp DESC + ORDER BY timestamp DESC + LIMIT 1Alternatively, if
event_idis guaranteed to be unique, you can remove theORDER BYclause entirely:SELECT * FROM dub_lead_events WHERE event_id = {{ String( eventId, "nLvob3FdmsYuI1BI", description="The unique event ID.", required=True, ) }} - ORDER BY timestamp DESC + LIMIT 1packages/tinybird/endpoints/get_import_error_logs.pipe (2)
3-4: Consider a more specific description.The description says "Get import logs by ID" but the endpoint retrieves import error logs specifically. Consider updating to "Get import error logs by ID" for clarity.
DESCRIPTION > - Get import logs by ID + Get import error logs by ID
10-31: Consider pagination and explicit column selection.The query has two potential improvements:
High row limit (5000): Without pagination parameters (offset/limit), clients cannot fetch logs beyond the first 5000 rows. For imports with many errors, this could be problematic.
SELECT *: Explicitly selecting required columns rather than using
SELECT *can improve query performance and reduce data transfer, especially if the table grows to include additional metadata columns.Consider adding pagination support:
% - SELECT * + SELECT + timestamp, + workspace_id, + import_id, + error_message, + row_data FROM dub_import_error_logs WHERE workspace_id = {{ String( workspaceId, 'wh_fjs8YfXkgsFL7eF7LOBvbluDb', required=True, ) }} AND import_id = {{ String( importId, 'uC2DOqy', required=True, ) }} ORDER BY timestamp DESC - limit 5000 + LIMIT {{ Int32(limit, 100) }} + OFFSET {{ Int32(offset, 0) }}Note: Adjust the column list to match the actual schema of
dub_import_error_logs.packages/tinybird/pipes/v2_referers.pipe (1)
138-192: Consider simplifying the subquery structure.The outer subquery (lines 138-142, 192) only aliases
amount AS saleAmount, which could be done directly in the inner query's SELECT clause. The extra nesting adds complexity without clear benefit.Consider this flatter structure:
- SELECT referer, sales, amount, amount AS saleAmount - FROM - ( - SELECT referer, COUNT(referer) as sales, sum(amount) as amount + SELECT referer, COUNT(referer) as sales, sum(amount) as amount, sum(amount) AS saleAmount FROM dub_sale_events_mv {% if not defined(linkId) and not defined(linkIds) and ( defined(workspaceId) or defined(partnerId) or defined(programId) ) %} PREWHERE link_id in (SELECT link_id from workspace_links) {% end %} WHERE true {% if defined(linkIds) %} AND link_id IN {{ Array(linkIds, 'String') }} {% elif defined(linkId) %} AND link_id = {{ linkId }} {% end %} {% if defined(customerId) %} AND customer_id = {{ String(customerId) }} {% end %} {% if defined(qr) %} AND qr = {{ Boolean(qr) }} {% end %} {% if defined(continent) %} AND continent = {{ continent }} {% end %} {% if defined(country) %} AND country = {{ country }} {% end %} {% if defined(region) %} AND region = {{ region }} {% end %} {% if defined(city) %} AND city = {{ city }} {% end %} {% if defined(device) %} AND device = {{ device }} {% end %} {% if defined(browser) %} AND browser = {{ browser }} {% end %} {% if defined(os) %} AND os = {{ os }} {% end %} {% if defined(referer) %} AND referer = {{ referer }} {% end %} {% if defined(refererUrl) %} AND splitByString('?', referer_url)[1] = {{ refererUrl }} {% end %} {% if defined(utm_source) %} AND url LIKE concat('%utm_source=', encodeURLFormComponent({{ String(utm_source) }}), '%') {% end %} {% if defined(utm_medium) %} AND url LIKE concat('%utm_medium=', encodeURLFormComponent({{ String(utm_medium) }}), '%') {% end %} {% if defined(utm_campaign) %} AND url LIKE concat( '%utm_campaign=', encodeURLFormComponent({{ String(utm_campaign) }}), '%' ) {% end %} {% if defined(utm_term) %} AND url LIKE concat('%utm_term=', encodeURLFormComponent({{ String(utm_term) }}), '%') {% end %} {% if defined(utm_content) %} AND url LIKE concat('%utm_content=', encodeURLFormComponent({{ String(utm_content) }}), '%') {% end %} {% if defined(url) %} AND url = {{ url }} {% end %} {% if defined(start) %} AND timestamp >= {{ DateTime(start) }} {% end %} {% if defined(end) %} AND timestamp <= {{ DateTime(end) }} {% end %} GROUP BY referer ORDER BY amount DESC LIMIT 1000 - ) as subqueryThat said, the current structure works correctly and the dual-mode linkIds filtering is properly implemented throughout.
packages/tinybird/pipes/coordinates_sales.pipe (1)
22-22: Document the rationale for excluding "Ashburn".The query specifically excludes the city "Ashburn" alongside "Unknown" values. This hardcoded exclusion suggests a business rule or data quality issue.
Please add a comment explaining why Ashburn is excluded (e.g., test data, data center location, known data quality issue).
AND city != 'Unknown' + -- Exclude Ashburn (data center / test traffic location) AND city != 'Ashburn'packages/tinybird/endpoints/get_webhook_events.pipe (2)
23-24: Consider pagination for webhook events beyond 100.The endpoint returns a maximum of 100 webhook events ordered by
timestamp DESC. While this may be sufficient for many use cases, consider:
- Ensure an index exists on
timestamp(ideally a composite index onwebhook_id, timestamp) for optimal query performance.- If clients need to retrieve older events beyond the 100-result limit, you may want to add cursor-based or offset-based pagination parameters (e.g.,
beforeTimestamporoffset).
7-22: Confirm datasource schema and index; use explicit column selection.
Thedub_webhook_eventsdatasource defineswebhook_idin its schema, and its MergeTree sorting key includeswebhook_id, so indexing is already in place. ReplaceSELECT *with an explicit column list to limit data transfer and avoid exposing unused or sensitive fields.packages/tinybird/endpoints/get_lead_events.pipe (1)
1-14: Consider adding a trailing newline.The endpoint structure is correct and follows Tinybird patterns. However, the file lacks a trailing newline, which is a common convention in many codebases.
packages/tinybird/endpoints/get_sale_event.pipe (1)
1-20: LGTM with minor formatting suggestion.The endpoint structure is correct and follows Tinybird conventions. The
eventIdparameter validation withrequired=Trueis appropriate.Consider adding a trailing newline for consistency with common formatting conventions.
apps/web/lib/tinybird/record-link.ts (1)
86-97: Deduplicate transformLinkTB work before ingestingWe run
transformLinkTBtwice per payload (once for tbNew, again for tb), which redoes the decode/mapping logic on the same objects. Cache the transformed array/object once and feed both endpoints to cut this hot-path work in half.if (Array.isArray(payload)) { - waitUntil( - recordLinkTBNew( - payload.map(transformLinkTB).map((p) => ({ ...p, deleted })), - ), - ); - return await recordLinkTB( - payload.map(transformLinkTB).map((p) => ({ ...p, deleted })), - ); + const records = payload.map((link) => ({ + ...transformLinkTB(link), + deleted, + })); + waitUntil(recordLinkTBNew(records)); + return await recordLinkTB(records); } else { - waitUntil(recordLinkTBNew({ ...transformLinkTB(payload), deleted })); - return await recordLinkTB({ ...transformLinkTB(payload), deleted }); + const record = { ...transformLinkTB(payload), deleted }; + waitUntil(recordLinkTBNew(record)); + return await recordLinkTB(record); }packages/tinybird/endpoints/v2_triggers.pipe (2)
19-19: Consider using standard SQL equality operator.Line 19 uses
deleted == 0with double equals. While ClickHouse supports both==and=, the single=is more standard SQL and used elsewhere in the codebase for consistency.- deleted == 0 + deleted = 0
150-150: Redundant column aliasing.Line 150 returns both
amountandsaleAmountwith identical values. This creates duplicate columns in the result set. Consider returning onlysaleAmountor document why both are needed.If only one column is needed:
- SELECT trigger, sales, amount, amount AS saleAmount + SELECT trigger, sales, amount AS saleAmount
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (93)
apps/web/app/(ee)/api/cron/cleanup/link-retention/route.ts(2 hunks)apps/web/app/(ee)/api/cron/framer/backfill-leads-batch/route.ts(1 hunks)apps/web/lib/actions/partners/revoke-program-invite.ts(2 hunks)apps/web/lib/api/audit-logs/record-audit-log.ts(3 hunks)apps/web/lib/api/conversions/track-lead.ts(2 hunks)apps/web/lib/api/links/bulk-delete-links.ts(2 hunks)apps/web/lib/api/links/delete-link.ts(2 hunks)apps/web/lib/tinybird/client.ts(1 hunks)apps/web/lib/tinybird/log-conversion-events.ts(2 hunks)apps/web/lib/tinybird/log-import-error.ts(1 hunks)apps/web/lib/tinybird/record-click-zod.ts(2 hunks)apps/web/lib/tinybird/record-click.ts(1 hunks)apps/web/lib/tinybird/record-lead.ts(1 hunks)apps/web/lib/tinybird/record-link.ts(3 hunks)apps/web/lib/tinybird/record-sale.ts(1 hunks)apps/web/lib/tinybird/record-webhook-event.ts(1 hunks)apps/web/scripts/bulk-delete-links.ts(2 hunks)packages/tinybird/datasources/dub_audit_logs.datasource(1 hunks)packages/tinybird/datasources/dub_click_events.datasource(1 hunks)packages/tinybird/datasources/dub_click_events_id.datasource(1 hunks)packages/tinybird/datasources/dub_conversion_events_log.datasource(1 hunks)packages/tinybird/datasources/dub_import_error_logs.datasource(1 hunks)packages/tinybird/datasources/dub_lead_events.datasource(1 hunks)packages/tinybird/datasources/dub_links_metadata.datasource(1 hunks)packages/tinybird/datasources/dub_sale_events.datasource(1 hunks)packages/tinybird/datasources/dub_sale_events_id.datasource(1 hunks)packages/tinybird/datasources/dub_webhook_events.datasource(1 hunks)packages/tinybird/endpoints/all_stats.pipe(1 hunks)packages/tinybird/endpoints/coordinates_all.pipe(1 hunks)packages/tinybird/endpoints/coordinates_sales.pipe(1 hunks)packages/tinybird/endpoints/get_audit_logs.pipe(1 hunks)packages/tinybird/endpoints/get_click_event.pipe(1 hunks)packages/tinybird/endpoints/get_framer_lead_events.pipe(1 hunks)packages/tinybird/endpoints/get_import_error_logs.pipe(1 hunks)packages/tinybird/endpoints/get_lead_event.pipe(1 hunks)packages/tinybird/endpoints/get_lead_event_by_id.pipe(1 hunks)packages/tinybird/endpoints/get_lead_events.pipe(1 hunks)packages/tinybird/endpoints/get_sale_event.pipe(1 hunks)packages/tinybird/endpoints/get_webhook_events.pipe(1 hunks)packages/tinybird/endpoints/v2_browsers.pipe(1 hunks)packages/tinybird/endpoints/v2_cities.pipe(1 hunks)packages/tinybird/endpoints/v2_continents.pipe(1 hunks)packages/tinybird/endpoints/v2_count.pipe(1 hunks)packages/tinybird/endpoints/v2_countries.pipe(1 hunks)packages/tinybird/endpoints/v2_customer_events.pipe(1 hunks)packages/tinybird/endpoints/v2_devices.pipe(1 hunks)packages/tinybird/endpoints/v2_events.pipe(1 hunks)packages/tinybird/endpoints/v2_os.pipe(1 hunks)packages/tinybird/endpoints/v2_referer_urls.pipe(1 hunks)packages/tinybird/endpoints/v2_referers.pipe(1 hunks)packages/tinybird/endpoints/v2_regions.pipe(1 hunks)packages/tinybird/endpoints/v2_timeseries.pipe(1 hunks)packages/tinybird/endpoints/v2_top_links.pipe(1 hunks)packages/tinybird/endpoints/v2_top_partners.pipe(1 hunks)packages/tinybird/endpoints/v2_top_programs.pipe(1 hunks)packages/tinybird/endpoints/v2_top_tags.pipe(1 hunks)packages/tinybird/endpoints/v2_top_urls.pipe(1 hunks)packages/tinybird/endpoints/v2_triggers.pipe(1 hunks)packages/tinybird/endpoints/v2_usage.pipe(1 hunks)packages/tinybird/endpoints/v2_utms.pipe(1 hunks)packages/tinybird/materializations/dub_click_events_id_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_click_events_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_lead_events_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_links_metadata_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_regular_links_metadata_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_sale_events_id_pipe.pipe(1 hunks)packages/tinybird/materializations/dub_sale_events_pipe.pipe(1 hunks)packages/tinybird/pipes/all_stats.pipe(1 hunks)packages/tinybird/pipes/coordinates_all.pipe(1 hunks)packages/tinybird/pipes/coordinates_sales.pipe(1 hunks)packages/tinybird/pipes/dub_click_events_pipe_with_domain_key.pipe(1 hunks)packages/tinybird/pipes/dub_regular_links_metadata_pipe.pipe(1 hunks)packages/tinybird/pipes/dub_sale_events_id_pipe.pipe(1 hunks)packages/tinybird/pipes/get_framer_lead_events.pipe(1 hunks)packages/tinybird/pipes/get_lead_event_by_id.pipe(1 hunks)packages/tinybird/pipes/get_lead_events.pipe(1 hunks)packages/tinybird/pipes/get_sale_event.pipe(1 hunks)packages/tinybird/pipes/v2_browsers.pipe(4 hunks)packages/tinybird/pipes/v2_cities.pipe(6 hunks)packages/tinybird/pipes/v2_continents.pipe(4 hunks)packages/tinybird/pipes/v2_count.pipe(7 hunks)packages/tinybird/pipes/v2_countries.pipe(4 hunks)packages/tinybird/pipes/v2_devices.pipe(5 hunks)packages/tinybird/pipes/v2_events.pipe(6 hunks)packages/tinybird/pipes/v2_os.pipe(4 hunks)packages/tinybird/pipes/v2_referer_urls.pipe(5 hunks)packages/tinybird/pipes/v2_referers.pipe(4 hunks)packages/tinybird/pipes/v2_regions.pipe(1 hunks)packages/tinybird/pipes/v2_timeseries.pipe(4 hunks)packages/tinybird/pipes/v2_top_links.pipe(5 hunks)packages/tinybird/pipes/v2_top_urls.pipe(4 hunks)packages/tinybird/pipes/v2_triggers.pipe(5 hunks)packages/tinybird/pipes/v2_utms.pipe(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-06T07:59:03.120Z
Learnt from: devkiran
PR: dubinc/dub#2177
File: apps/web/lib/api/links/bulk-create-links.ts:66-84
Timestamp: 2025-06-06T07:59:03.120Z
Learning: In apps/web/lib/api/links/bulk-create-links.ts, the team accepts the risk of potential undefined results from links.find() operations when building invalidLinks arrays, because existing links are fetched from the database based on the input links, so matches are expected to always exist.
Applied to files:
apps/web/lib/api/links/bulk-delete-links.ts
🧬 Code graph analysis (13)
apps/web/app/(ee)/api/cron/cleanup/link-retention/route.ts (1)
apps/web/lib/tinybird/record-link.ts (1)
recordLink(81-98)
apps/web/lib/tinybird/record-link.ts (2)
apps/web/lib/tinybird/client.ts (1)
tbNew(9-12)apps/web/lib/api/links/utils/transform-link.ts (1)
ExpandedLink(8-17)
apps/web/lib/tinybird/record-lead.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/leads.ts (1)
leadEventSchemaTB(98-107)
apps/web/lib/tinybird/log-import-error.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/import-error-log.ts (1)
importErrorLogSchema(3-23)
apps/web/lib/tinybird/record-click-zod.ts (1)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)
apps/web/scripts/bulk-delete-links.ts (1)
apps/web/lib/tinybird/record-link.ts (1)
recordLink(81-98)
apps/web/lib/actions/partners/revoke-program-invite.ts (1)
apps/web/lib/tinybird/record-link.ts (1)
recordLink(81-98)
apps/web/lib/tinybird/record-sale.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/sales.ts (1)
saleEventSchemaTB(120-137)
apps/web/lib/api/links/delete-link.ts (1)
apps/web/lib/tinybird/record-link.ts (1)
recordLink(81-98)
apps/web/lib/tinybird/record-webhook-event.ts (2)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)apps/web/lib/zod/schemas/webhooks.ts (1)
webhookEventSchemaTB(37-50)
apps/web/lib/tinybird/log-conversion-events.ts (1)
apps/web/lib/tinybird/client.ts (2)
tb(3-6)tbNew(9-12)
apps/web/lib/api/audit-logs/record-audit-log.ts (2)
apps/web/lib/tinybird/client.ts (1)
tbNew(9-12)apps/web/lib/api/audit-logs/schemas.ts (1)
auditLogSchemaTB(15-29)
apps/web/lib/api/links/bulk-delete-links.ts (1)
apps/web/lib/tinybird/record-link.ts (1)
recordLink(81-98)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
|
@coderabbitai addressed some of your feedback, can you do a final review, only focusing on the main codebase (ignore packages/tinybird changes) |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 74202 @steven-tey, I've completed the final review of the main codebase changes. The migration looks well-structured overall! Here's my assessment: ✅ Strengths
📝 Migration TODOs to TrackYou have 13 TODO comments marked "Remove after Tinybird migration" that should be tracked. Consider creating a follow-up issue to clean these up: Core migration files:
💡 Minor Observations
The migration implementation looks solid and ready to go! 🚀 |
Summary by CodeRabbit
New Features
Performance
Chores