Skip to content

Conversation

@amotl
Copy link
Member

@amotl amotl commented Jul 20, 2025

About

For Text-to-SQL purposes, exercise the langchain-mcp-adapters package together with the CrateDB MCP Server, also to validate its OCI standard image published to GHCR.

References

Backlog

  • Software tests
  • Documentation

@coderabbitai
Copy link

coderabbitai bot commented Jul 20, 2025

Warning

Rate limit exceeded

@amotl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 59 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 286fcde and 57c2235.

📒 Files selected for processing (4)
  • .github/workflows/ml-langchain.yml (2 hunks)
  • topic/machine-learning/llm-langchain/README.md (4 hunks)
  • topic/machine-learning/llm-langchain/agent_with_mcp.py (1 hunks)
  • topic/machine-learning/llm-langchain/requirements.txt (1 hunks)

Walkthrough

A new agent integration script demonstrates using LangChain/LangGraph with a CrateDB MCP server, including setup instructions. The requirements file adds dependencies for LangChain MCP adapters and LangGraph. The workflow adds a CrateDB MCP service and matrix for versioning. Database initialization is scripted with a new SQL file and test fixture. The .gitignore is adjusted to include the SQL initializer.

Changes

File(s) Change Summary
agent_with_mcp.py New script: async agent using LangChain/LangGraph with CrateDB MCP server; includes setup and agent invocation.
requirements.txt Added dependencies: cratedb-about==0.0.6, langchain-mcp-adapters<0.2, langgraph<0.6.
.github/workflows/ml-langchain.yml Added cratedb-mcp service, new matrix variable for MCP version, and related environment setup.
test.py Added pytest fixture init_database to initialize DB from init.sql; complements existing reset fixture.
.gitignore Updated: allows tracking of init.sql while still ignoring other .sql files.
init.sql New SQL script: drops/creates time_series_data table and populates it with sensor data.
README.md Updated to include LangGraph, new example, setup instructions, and references for LangGraph and MCP integration.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant AgentScript
    participant MCPClient
    participant LangGraphAgent
    participant OpenAIModel

    User->>AgentScript: Run main()
    AgentScript->>MCPClient: Connect to MCP server
    MCPClient-->>AgentScript: Return available tools
    AgentScript->>LangGraphAgent: Instantiate with tools and GPT-4.1
    AgentScript->>LangGraphAgent: Query: "What is the average value of sensor 1?"
    LangGraphAgent->>OpenAIModel: Process query
    OpenAIModel-->>LangGraphAgent: Return response
    LangGraphAgent-->>AgentScript: Return final response
    AgentScript->>User: Print query and response
Loading

Estimated code review effort

3 (~40 minutes) — Moderate complexity with new async agent logic, workflow and service setup, database initialization scripts, and test fixture additions.

Possibly related PRs

  • Chatbot assistant on AWS #989: Adds an AWS Bedrock LLM agent integrated with CrateDB MCP for natural language querying and diagnostics; related in demonstrating CrateDB MCP-based LLM agents but differs in implementation and LLM provider.

Suggested reviewers

  • kneth
  • hlcianfagna

Poem

In the warren where data flows free,
A rabbit scripts with glee—
Tables set, agents connect,
MCP and LangChain intersect!
With SQL seeds and fixtures neat,
The test suite hops to a steady beat.
🐇✨ Data magic, mission complete!

✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch langchain-mcp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
.github/workflows/ml-langchain.yml (1)

5-5: branches: ~ is not a valid filter – drops parsing in Actions.

YAML ~ means null, but the pull_request.branches key expects either an array of branch globs or omission.
Keeping it as null causes the workflow definition to be rejected by GitHub.

-    branches: ~
+#   ─ Option A: watch every branch – simply delete the key
+#   ─ Option B: explicit wildcard
+    # branches:
+    #   - '*'
🧹 Nitpick comments (1)
.github/workflows/ml-langchain.yml (1)

41-46: Matrix blow-up & readability

Adding cratedb-mcp-version creates a full Cartesian product with every Python + CrateDB version.
Right now each dimension has only one value, but if more are appended later the job count will explode.

Recommend:

strategy:
  matrix:
    include:
      - os: ubuntu-latest
        python-version: '3.11'
        cratedb-version: nightly
        cratedb-mcp-version: pr-50

Keeps the job count explicit and avoids surprises.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3f46d6 and f4e1e40.

📒 Files selected for processing (6)
  • .github/workflows/ml-langchain.yml (2 hunks)
  • topic/machine-learning/llm-langchain/.gitignore (1 hunks)
  • topic/machine-learning/llm-langchain/agent_with_mcp.py (1 hunks)
  • topic/machine-learning/llm-langchain/init.sql (1 hunks)
  • topic/machine-learning/llm-langchain/requirements.txt (1 hunks)
  • topic/machine-learning/llm-langchain/test.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • topic/machine-learning/llm-langchain/.gitignore
  • topic/machine-learning/llm-langchain/init.sql
🚧 Files skipped from review as they are similar to previous changes (3)
  • topic/machine-learning/llm-langchain/requirements.txt
  • topic/machine-learning/llm-langchain/test.py
  • topic/machine-learning/llm-langchain/agent_with_mcp.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.691Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.
.github/workflows/ml-langchain.yml (1)
Learnt from: amotl
PR: crate/cratedb-examples#937
File: topic/machine-learning/llm-langchain/requirements-dev.txt:2-2
Timestamp: 2025-05-12T20:10:38.614Z
Learning: The cratedb-toolkit package supports various extras including "io", "datasets", "influxdb", "mongodb", "testing", and many others.

@amotl amotl marked this pull request as ready for review July 20, 2025 12:05
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
topic/machine-learning/llm-langchain/README.md (4)

10-15: Clarify relationship between LangChain and LangGraph

The paragraph abruptly switches from LangChain to LangGraph without an explicit transition or header. Consider either:

  1. Adding a short sentence explaining why LangGraph is relevant in this context (e.g., “Because the new example relies on LangGraph, …”), or
  2. Moving the LangGraph blurb into its own subsection (e.g., “### About LangGraph”).

This will help readers who are skimming understand the distinction at a glance.


21-22: Maintain bullet-list ordering

The new “Text-to-SQL” item is great, but the list is now semi-random (chatbots, document QA, Text-to-SQL, …). For discoverability, consider grouping by domain or alphabetically.


93-98: Add a clickable link for the new agent_with_mcp.py example

All other examples use markdown links or badge sets; this one is plain text. For consistency (and reader convenience), wrap the filename in a GitHub link and optionally add a quick-launch badge.


171-175: Minor phrasing nit

“Regeneration of the Jupyter Notebook” → “regeneration of Jupyter notebooks” (plural, generic).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4e1e40 and 14cf5ba.

📒 Files selected for processing (7)
  • .github/workflows/ml-langchain.yml (2 hunks)
  • topic/machine-learning/llm-langchain/.gitignore (1 hunks)
  • topic/machine-learning/llm-langchain/README.md (4 hunks)
  • topic/machine-learning/llm-langchain/agent_with_mcp.py (1 hunks)
  • topic/machine-learning/llm-langchain/init.sql (1 hunks)
  • topic/machine-learning/llm-langchain/requirements.txt (1 hunks)
  • topic/machine-learning/llm-langchain/test.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • topic/machine-learning/llm-langchain/.gitignore
  • topic/machine-learning/llm-langchain/test.py
  • topic/machine-learning/llm-langchain/requirements.txt
  • topic/machine-learning/llm-langchain/init.sql
  • topic/machine-learning/llm-langchain/agent_with_mcp.py
  • .github/workflows/ml-langchain.yml
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.691Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.
topic/machine-learning/llm-langchain/README.md (1)
Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.691Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
  • GitHub Check: Python: 3.12 CrateDB: nightly on ubuntu-latest
  • GitHub Check: Python: 3.11 CrateDB: nightly on ubuntu-latest
  • GitHub Check: Python: 3.13 CrateDB: nightly on ubuntu-latest
🔇 Additional comments (3)
topic/machine-learning/llm-langchain/README.md (3)

101-105: Mention new dependencies in install instructions

Running agent_with_mcp.py requires langchain-mcp-adapters, langgraph, and uvx (or uvicorn). These are not yet in requirements.txt and the “Install” section does not call them out, which will trip up users following the README verbatim.


179-190: Double-check external links

Quick manual test:
https://academy.langchain.com/courses/ambient-agents/ – OK
https://langchain-ai.github.io/langgraph/ – redirects to 404 at the moment.

If the project switched to a different docs host (https://docs.langgraph.ai), update the link accordingly.


87-92: Broken notebook link

The newly added Vertex-AI notebook path points to
cratedb_rag_customer_support_vertexai.ipynb, but the embedded GitHub and Colab URLs still reference the previous file (conversational_memory.ipynb). Update the two URL fragments so the badges open the intended notebook.

-...(https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/conversational_memory.ipynb)
+...(https://github.com/crate/cratedb-examples/blob/main/topic/machine-learning/llm-langchain/cratedb_rag_customer_support_vertexai.ipynb)

Likely an incorrect or invalid review comment.

pip install -U -r requirements.txt
# Start database.
docker run --rm -it --publish=4200:4200 crate/crate:nightly
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A step to execute init.sql should be added here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this. Fixed with 45f0704.

Comment on lines 44 to 46
cratedb-mcp-version: [
'pr-50',
]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be adjusted after the next release of cratedb-mcp.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with 046d8f7.

}
)
tools = await client.get_tools()
agent = create_react_agent("openai:gpt-4.1", tools)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_react_agent also accepts a prompt argument. Maybe insert CrateDB's instructions here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improved with 4de0d6e.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
topic/machine-learning/llm-langchain/README.md (3)

10-14: Good call-out of LangGraph – maybe give it its own subsection

The extra sentences are helpful, but the LangGraph paragraph is still tucked inside the LangChain introduction.
Consider breaking it out as its own “## About LangGraph” (or similar) to keep the two frameworks conceptually separate and skimmable.
This is purely editorial, no blocker.


101-105: Minor wording tweak for virtual-env instructions (optional)

Strictly speaking, python -m venv .venv uses the interpreter found first on PATH; some users might have multiple Python versions.
If you want to be explicit, change to python3.11 -m venv … or mention the minimum supported version (e.g. “Python ≥ 3.10”).


138-150: Heading rendered as H1 and flagged by MD026 – drop the trailing period

# Provision database. is interpreted as a level-1 heading with a trailing full-stop, triggering markdown-lint.
Convert it to a normal sentence or lower-level heading without the period:

-# Provision database.
+#### Provision the database
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 14cf5ba and 286fcde.

📒 Files selected for processing (4)
  • .github/workflows/ml-langchain.yml (2 hunks)
  • topic/machine-learning/llm-langchain/README.md (4 hunks)
  • topic/machine-learning/llm-langchain/agent_with_mcp.py (1 hunks)
  • topic/machine-learning/llm-langchain/requirements.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • topic/machine-learning/llm-langchain/requirements.txt
  • .github/workflows/ml-langchain.yml
  • topic/machine-learning/llm-langchain/agent_with_mcp.py
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-examples#1033
File: topic/machine-learning/llm-langchain/README.md:138-156
Timestamp: 2025-07-21T18:46:07.502Z
Learning: In CrateDB MCP server configuration, the correct environment variable name is `CRATEDB_MCP_TRANSPORT`, not `CRATEDB_MCP_ADAPTER_TRANSPORT`. This variable is used throughout the CrateDB MCP codebase to specify the transport protocol.
Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.710Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.
topic/machine-learning/llm-langchain/README.md (3)

Learnt from: amotl
PR: #1033
File: topic/machine-learning/llm-langchain/README.md:138-156
Timestamp: 2025-07-21T18:46:07.502Z
Learning: In CrateDB MCP server configuration, the correct environment variable name is CRATEDB_MCP_TRANSPORT, not CRATEDB_MCP_ADAPTER_TRANSPORT. This variable is used throughout the CrateDB MCP codebase to specify the transport protocol.

Learnt from: amotl
PR: #1033
File: topic/machine-learning/llm-langchain/README.md:138-156
Timestamp: 2025-07-21T18:46:07.502Z
Learning: The uvx command is the launcher program provided by the uv package manager, used to run Python tools in isolated environments without installing them globally. It is not part of uvicorn-ext.

Learnt from: amotl
PR: #1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.710Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.

🪛 markdownlint-cli2 (0.17.2)
topic/machine-learning/llm-langchain/README.md

140-140: Trailing punctuation in heading
Punctuation: '.'

(MD026, no-trailing-punctuation)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Python: 3.10 CrateDB: nightly on ubuntu-latest
  • GitHub Check: Python: 3.13 CrateDB: nightly on ubuntu-latest
  • GitHub Check: Python: 3.11 CrateDB: nightly on ubuntu-latest
  • GitHub Check: Python: 3.12 CrateDB: nightly on ubuntu-latest
🔇 Additional comments (6)
topic/machine-learning/llm-langchain/README.md (6)

21-21: 👍 Nice to see Text-to-SQL called out explicitly
No issues – this bullet rounds out the list nicely.


87-92: Vertex AI RAG example reference looks correct
Link target and framing read well.


93-98: Check file path for the new agent_with_mcp.py link

Double-check that the example file really lives next to the README (not in a sub-folder). A broken link here would be frustrating for users.


152-161: Instructions are clear – nothing to change
Environment variables and expected output look correct, and CRATEDB_MCP_TRANSPORT matches the upstream codebase.


176-180: Notebook-regeneration note looks fine
No action required.


184-195: Reference section updated – links resolve correctly
All new references render and point to valid targets.

@coderabbitai coderabbitai bot mentioned this pull request Jul 21, 2025
2 tasks
@amotl amotl merged commit 39d5152 into main Jul 21, 2025
5 checks passed
@amotl amotl deleted the langchain-mcp branch July 21, 2025 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants