feat: Update API Endpoints for Tracer Sessions, Datasets, and Examples #144
Conversation
Walkthrough

The changes in this pull request involve significant updates to the API endpoints within the LangSmith platform, specifically focusing on tracer sessions, datasets, and examples. New methods have been introduced for creating, updating, and deleting tracer sessions, alongside enhancements to dataset functionalities such as cloning and format downloads. Additionally, the example API has been improved to support bulk uploads and validations, thereby expanding the platform's capabilities in managing and utilizing datasets and examples.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
⛔ Files ignored due to path filters (6)
- `src/libs/LangSmith/Generated/LangSmith.DatasetsClient.DownloadDatasetJsonl.g.cs` is excluded by `!**/generated/**`
- `src/libs/LangSmith/Generated/LangSmith.IDatasetsClient.DownloadDatasetJsonl.g.cs` is excluded by `!**/generated/**`
- `src/libs/LangSmith/Generated/LangSmith.IPublicClient.ReadSharedDatasetTracerSessionsBulk.g.cs` is excluded by `!**/generated/**`
- `src/libs/LangSmith/Generated/LangSmith.Models.DownloadDatasetJsonlApiV1DatasetsDatasetIdJsonlGetResponse.Json.g.cs` is excluded by `!**/generated/**`
- `src/libs/LangSmith/Generated/LangSmith.Models.DownloadDatasetJsonlApiV1DatasetsDatasetIdJsonlGetResponse.g.cs` is excluded by `!**/generated/**`
- `src/libs/LangSmith/Generated/LangSmith.PublicClient.ReadSharedDatasetTracerSessionsBulk.g.cs` is excluded by `!**/generated/**`
📒 Files selected for processing (1)
- `src/libs/LangSmith/openapi.yaml` (2 hunks)
```yaml
  /api/v1/public/datasets/sessions-bulk:
    get:
      tags:
        - public
      summary: Read Shared Dataset Tracer Sessions Bulk
      description: Get sessions from multiple datasets using share tokens.
      operationId: read_shared_dataset_tracer_sessions_bulk_api_v1_public_datasets_sessions_bulk_get
      parameters:
        - name: share_tokens
          in: query
          required: true
          schema:
            title: Share Tokens
            type: array
            items:
              type: string
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                title: Response Read Shared Dataset Tracer Sessions Bulk Api V1 Public Datasets Sessions Bulk Get
                type: array
                items:
                  $ref: '#/components/schemas/TracerSession'
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
```
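As a sanity check of how the `share_tokens` array parameter above is serialized, here is a minimal Python sketch. With OpenAPI's default query serialization for arrays (`style: form`, `explode: true`), the parameter name is repeated once per value. The host name below is a placeholder, not taken from the spec.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.smith.langchain.com"  # placeholder host, not from the spec

def bulk_sessions_url(share_tokens):
    """Build the GET URL for the sessions-bulk endpoint, repeating
    share_tokens once per value (OpenAPI's default form/explode style)."""
    query = urlencode([("share_tokens", t) for t in share_tokens])
    return f"{BASE_URL}/api/v1/public/datasets/sessions-bulk?{query}"

print(bulk_sessions_url(["token-a", "token-b"]))
# → .../sessions-bulk?share_tokens=token-a&share_tokens=token-b
```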
💡 Codebase verification
🛠️ Refactor suggestion
Based on the search results showing array size limits and bulk endpoints, I can now provide the final response:
Add size limits to share_tokens parameter in bulk sessions endpoint
The codebase shows consistent use of array size limits across different endpoints:
- Most array parameters have defined limits ranging from 2 to 50 items
- No bulk endpoints accept unlimited array inputs
Add the following limit to align with existing patterns:
```diff
           schema:
             title: Share Tokens
             type: array
+            maxItems: 50 # Aligned with similar bulk endpoints
             items:
               type: string
```

🔗 Analysis chain
Review bulk dataset tracer sessions endpoint implementation
The bulk sessions endpoint is well-structured but could benefit from additional documentation and validation:
- The description explains the purpose but doesn't specify the response format
- The share_tokens array has no size limits which could impact performance
Consider adding:
- Response schema details in the description
- Maximum array size limit for share_tokens parameter:
```diff
           schema:
             title: Share Tokens
             type: array
+            maxItems: 100 # Or appropriate limit
             items:
               type: string
```

Run the following script to check for consistent array size limits across similar bulk endpoints:
Let me search for more context about bulk endpoints to make a well-informed decision.
Let me try a different search pattern to find bulk endpoints.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:

```shell
#!/bin/bash
# Check array size limits in bulk endpoints
rg -A 5 "maxItems:"
```

Length of output: 1380

Script:

```shell
#!/bin/bash
# Search for bulk endpoints and their array parameters
rg -B 2 -A 10 "bulk.*get:" src/libs/LangSmith/openapi.yaml
```

Length of output: 60

Script:

```shell
#!/bin/bash
# Search for endpoints with 'bulk' in their path
rg -B 2 -A 10 "/.*bulk" src/libs/LangSmith/openapi.yaml
```

Length of output: 3367
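If a `maxItems` cap is adopted, callers holding more tokens than the limit would need to batch their requests. A minimal sketch of client-side batching, assuming the reviewer's proposed limit of 50 (a suggestion in this review, not a documented server limit):

```python
def chunk_share_tokens(tokens, max_items=50):
    """Split a token list into batches no larger than max_items, so each
    request to the bulk endpoint stays within the proposed array limit.
    50 is the reviewer's suggested cap, not a documented server value."""
    return [tokens[i:i + max_items] for i in range(0, len(tokens), max_items)]

batches = chunk_share_tokens([f"tok-{n}" for n in range(120)])
print([len(b) for b in batches])  # → [50, 50, 20]
```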
```yaml
  '/api/v1/datasets/{dataset_id}/jsonl':
    get:
      tags:
        - datasets
      summary: Download Dataset Jsonl
      description: Download a dataset as CSV format.
      operationId: download_dataset_jsonl_api_v1_datasets__dataset_id__jsonl_get
      parameters:
        - name: dataset_id
          in: path
          required: true
          schema:
            title: Dataset Id
            type: string
            format: uuid
        - name: as_of
          in: query
          description: 'Only modifications made on or before this time are included. If None, the latest version of the dataset is used.'
          schema:
            title: As Of
            type: string
            description: 'Only modifications made on or before this time are included. If None, the latest version of the dataset is used.'
            format: date-time
            nullable: true
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: { }
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
      security:
        - ApiKey: [ ]
        - TenantId: [ ]
        - BearerAuth: [ ]
```
Review JSONL download endpoint implementation
The JSONL download endpoint appears to have a documentation inconsistency:
- The description states "Download a dataset as CSV format" but the endpoint is for JSONL format
- The endpoint path indicates JSONL format ('/jsonl')
Update the endpoint description to correctly reflect JSONL format:
```diff
-      description: Download a dataset as CSV format.
+      description: Download a dataset as JSONL format.
```

Additionally, consider:
- Adding response content type specification for JSONL
- Documenting the JSONL file structure in the description
Committable suggestion skipped: line range outside the PR's diff.
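Since the 200 response leaves the schema empty (`schema: { }`), consumers have to know out-of-band that the body is JSON Lines: one independent JSON object per line. A sketch of parsing such a body in Python; the field names here are illustrative, not taken from the spec:

```python
import json

# Illustrative JSONL payload; real example fields are not specified in the schema.
body = '{"input": "2+2", "output": "4"}\n{"input": "3+3", "output": "6"}\n'

# Each non-empty line is parsed as its own JSON document.
examples = [json.loads(line) for line in body.splitlines() if line.strip()]
print(len(examples))  # → 2
```

This is also why documenting the JSONL structure in the endpoint description matters: nothing in the OpenAPI document currently tells a client how to decode the stream.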
Summary by CodeRabbit
These updates significantly enhance the functionality and usability of the LangSmith platform for managing datasets and examples.