feat(caretaker): implement Cloud Run webhook ingestion service #28015

Open
chadd28 wants to merge 7 commits into google-gemini:main from chadd28:feature/caretaker-agent

chadd28 commented Jun 18, 2026

Summary

Implements the Cloud Run Webhook Ingestion Service for the Caretaker Agent. The service acts as an entry point for GitHub webhooks, verifies incoming payload signatures, stores new issue entries using Firestore transactions, and publishes sanitized issue metadata to a GCP Pub/Sub topic for downstream processing.

Details

  • Server (server.ts): The main Express server (to be hosted on Cloud Run) that receives GitHub issues.opened events, validates their signature, adds the issue to Firestore, and publishes the issue details to Pub/Sub.
  • Auth Verification (auth/github.ts): HMAC SHA-256 signature verification helper using node:crypto and a secure timing-safe equality check.
  • Store (db/issuesStore.ts): Initializes new issues in Firestore using a transaction.
  • Tests: Added issuesStore.test.ts (Firestore mock tests) and github.test.ts (HMAC signature verification tests).
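
A minimal sketch of the kind of check the Auth Verification bullet describes, not the PR's actual auth/github.ts. The function name `verifyGitHubSignature` and its parameters are assumptions made for illustration; the only facts relied on are that GitHub sends the digest in the X-Hub-Signature-256 header as `sha256=<hex>` and that `timingSafeEqual` from node:crypto throws when its inputs differ in length.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

/**
 * Hypothetical helper (name and signature are assumptions, not the PR's code).
 * Returns true when `signatureHeader` ("sha256=<hex digest>") matches the
 * HMAC SHA-256 of the raw request body under the shared webhook secret.
 */
function verifyGitHubSignature(
  rawBody: Buffer,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  // Reject missing or non-string headers before touching any crypto.
  if (typeof signatureHeader !== 'string') return false;

  const expected = Buffer.from(
    'sha256=' + createHmac('sha256', secret).update(rawBody).digest('hex'),
    'utf8',
  );
  const received = Buffer.from(signatureHeader, 'utf8');

  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  if (received.length !== expected.length) return false;
  return timingSafeEqual(received, expected);
}
```

The HMAC has to be computed over the raw bytes of the request, which is why the body is taken as a Buffer rather than as parsed JSON.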

How to Validate

Run unit tests in the service directory:

cd tools/caretaker-agent/cloudrun/ingestion-service
npm install
npx vitest run

Pre-Merge Checklist

  • Updated relevant documentation and README (if needed)
  • Added/updated tests (if needed)
  • Noted breaking changes (if any)
  • Validated on required platforms/methods:
    • MacOS
      • npm run
      • npx
      • Docker
      • Podman
      • Seatbelt
    • Windows
      • npm run
      • npx
      • Docker
    • Linux
      • npm run
      • npx
      • Docker

@galz10

github-actions Bot added the size/l label ("A large sized PR") Jun 18, 2026

github-actions Bot commented Jun 18, 2026

📊 PR Size: size/L

  • Lines changed: 441
  • Additions: +441
  • Deletions: -0
  • Files changed: 9

chadd28 marked this pull request as ready for review June 18, 2026 19:38
chadd28 requested a review from a team as a code owner June 18, 2026 19:38
@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new ingestion service for the Caretaker Agent, designed to run on Cloud Run. The service acts as a secure gateway for GitHub webhooks, handling event validation, deduplication via Firestore transactions, and message queuing through Pub/Sub. This infrastructure enables automated downstream triage workflows for incoming GitHub issues.

Highlights

  • Webhook Ingestion Service: Implemented a new Express-based service to receive and process GitHub 'issues.opened' webhooks.
  • Security & Verification: Added HMAC SHA-256 signature verification using timing-safe equality checks to ensure payload authenticity.
  • Data Persistence: Integrated Firestore transactions to reliably store issue metadata and prevent duplicate entries.
  • Downstream Integration: Configured Pub/Sub publishing to forward sanitized issue data for further processing.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a new Express-based ingestion service for a triage worker, featuring GitHub webhook signature verification, Firestore storage for tracking issues, and Pub/Sub integration. The review feedback highlights three critical security and reliability improvements: validating the signature length and payload type in the GitHub auth module to prevent crashes and DoS attacks, escaping the issue body to mitigate prompt injection vulnerabilities, and checking the return value of the issue creation transaction to prevent duplicate Pub/Sub messages.

Comment thread tools/caretaker-agent/cloudrun/ingestion-service/auth/github.ts Outdated
}

// Payload preprocessing
const sanitizedBody = `<untrusted_context>\n${payload.issue?.body || ''}\n</untrusted_context>`;

Contributor

high

Wrapping untrusted user input directly in <untrusted_context> tags without escaping or sanitizing the input makes the system vulnerable to prompt injection. An attacker could include </untrusted_context> in their issue body to escape the context block and inject malicious instructions.

Sanitize or escape any occurrences of </untrusted_context> in the issue body before wrapping it.

  const rawBody = payload.issue?.body || '';
  const escapedBody = rawBody.replace(/<\/untrusted_context>/g, '[escaped_untrusted_context_tag]');
  const sanitizedBody = `<untrusted_context>\n${escapedBody}\n</untrusted_context>`;

Comment thread tools/caretaker-agent/cloudrun/ingestion-service/server.ts Outdated
gemini-cli Bot added the status/need-issue label ("Pull requests that need to have an associated issue.") Jun 18, 2026
chadd28 force-pushed the feature/caretaker-agent branch from ef27d32 to 5650049 June 22, 2026 21:51
COPY . .
EXPOSE 8080
CMD ["npx", "tsx", "server.ts"]

Collaborator

[P1] Production anti-pattern.
Running tsx (or ts-node) in a production container introduces significant memory overhead and startup latency. Add a "build": "tsc" script to package.json and run the compiled JavaScript here instead.

Suggested change
RUN npm run build
CMD ["node", "dist/server.js"]


// Publish to Pub/Sub
const dataBuffer = Buffer.from(JSON.stringify(processedData));
const messageId = await topic.publishMessage({ data: dataBuffer });

Collaborator

[P0] Dual-write failure state permanently drops retried webhooks.
If the Firestore write succeeds but the Pub/Sub publish fails, the server returns a 500 error, and the delivery may be retried (GitHub does not redeliver failed deliveries automatically, but they can be redelivered manually or through the REST API). On the retry, createIssue returns false (since the document already exists), causing the endpoint to return 200 early and skip the Pub/Sub publish entirely. The issue will never be processed downstream. You should publish to Pub/Sub even if the DB document exists (assuming downstream is idempotent) or rely entirely on Pub/Sub and let the downstream worker deduplicate.
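
One way the first option could look, as a sketch only. `IssueStore`, `Publisher`, `IngestResult`, and `ingestIssue` are hypothetical names standing in for the PR's IssuesStore and Pub/Sub topic; the sketch assumes the downstream worker is idempotent, as the comment above requires.

```typescript
// Hypothetical minimal interfaces; the real PR uses IssuesStore and a Pub/Sub topic.
interface IssueStore {
  /** Resolves to true if the document was created, false if it already existed. */
  createIssue(id: string, data: object): Promise<boolean>;
}

interface Publisher {
  /** Resolves to the published message ID. */
  publish(data: Buffer): Promise<string>;
}

interface IngestResult {
  status: number;
  created: boolean;
  messageId?: string;
}

async function ingestIssue(
  store: IssueStore,
  publisher: Publisher,
  id: string,
  data: object,
): Promise<IngestResult> {
  // Record the issue; `created` is informational and does not gate publishing.
  const created = await store.createIssue(id, data);
  try {
    // Publish even when the document already exists, so a redelivered webhook
    // can repair an earlier attempt whose publish failed.
    const messageId = await publisher.publish(Buffer.from(JSON.stringify(data)));
    return { status: 200, created, messageId };
  } catch {
    // Signal failure so the delivery can be retried later.
    return { status: 500, created };
  }
}
```

The trade-off is that a redelivery after a fully successful attempt publishes a second message, so the consumer has to deduplicate on the issue identifier.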

});

app.post('/webhook', async (req, res) => {
const signature = req.headers['x-hub-signature-256'] as string | undefined;

Collaborator

[P2] Unsafe header cast.
Express headers can be an array (string[]) if multiple headers with the same name are sent. If an array is passed, downstream checks like signature.length will evaluate the array length rather than string length.

Suggested change
- const signature = req.headers['x-hub-signature-256'] as string | undefined;
+ const header = req.headers['x-hub-signature-256'];
+ const signature = Array.isArray(header) ? header[0] : header;

} from '@google-cloud/firestore';

export class IssuesStore {
private db: Firestore;

Collaborator

[P3] Missing readonly modifier.
These properties are only set in the constructor. Mark them as readonly to enforce immutability.

Suggested change
- private db: Firestore;
+ private readonly db: Firestore;
+ private readonly collectionName: string;

const issuesStore = new IssuesStore(db, collectionName);

// Middleware: read incoming JSON payloads as raw Buffer bytes
app.use(express.raw({ type: 'application/json' }));

Collaborator

[P3] Missing payload limit.
While GitHub webhooks are generally small, setting an explicit limit is best practice for documenting expectations and preventing payload-based denial of service.

Suggested change
- app.use(express.raw({ type: 'application/json' }));
+ app.use(express.raw({ type: 'application/json', limit: '1mb' }));

let payload: GitHubWebhookPayload;
try {
payload = JSON.parse(req.body.toString()) as GitHubWebhookPayload;
} catch {

Collaborator

[P2] Blind type casting.
The explicit cast as GitHubWebhookPayload bypasses runtime safety. At minimum, we should validate the presence of payload.issue.number and payload.repository.full_name before assuming the structure exists to avoid unexpected undefined errors downstream.
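
A sketch of the minimal runtime check suggested above, written as a type guard. `IssueOpenedPayload`, `isIssuePayload`, and `parseIssuePayload` are hypothetical names; the PR's own GitHubWebhookPayload type may have a different shape, and only the two fields named in this comment are validated.

```typescript
// Hypothetical payload shape covering only the fields the handler relies on.
interface IssueOpenedPayload {
  action: string;
  issue: { number: number; body?: string | null };
  repository: { full_name: string };
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

/** Narrows an unknown JSON value to IssueOpenedPayload without a blind cast. */
function isIssuePayload(value: unknown): value is IssueOpenedPayload {
  if (!isRecord(value)) return false;
  const { action, issue, repository } = value;
  if (typeof action !== 'string' || !isRecord(issue) || !isRecord(repository)) {
    return false;
  }
  const issueNumber = issue['number'];
  const fullName = repository['full_name'];
  return (
    Number.isInteger(issueNumber) &&
    typeof fullName === 'string' &&
    fullName.length > 0
  );
}

/** Returns the validated payload, or undefined for malformed JSON or shape. */
function parseIssuePayload(raw: Buffer | string): IssueOpenedPayload | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw.toString());
  } catch {
    return undefined;
  }
  return isIssuePayload(parsed) ? parsed : undefined;
}
```

With a guard like this, the handler could answer 400 when `parseIssuePayload` returns undefined instead of failing later on a missing field.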


// Payload preprocessing
const sanitizedBody = `<untrusted_context>\n${payload.issue?.body || ''}\n</untrusted_context>`;
const processedData = {

Collaborator

[P2] Misleading variable name / Missing sanitization.
This is not actually sanitized. If a user maliciously includes </untrusted_context> in their GitHub issue description, they will break out of the LLM context wrapper downstream. Rename this to wrappedBody to be accurate, or implement actual tag escaping/stripping.
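
A sketch of the tag-escaping option, assuming the wrapper stays `<untrusted_context>`. `wrapUntrusted`, `TAG_PATTERN`, and the placeholder string are hypothetical choices, not code from the PR. Unlike an exact-match replace, the pattern also covers opening tags, mixed case, and stray whitespace inside the tag.

```typescript
// Matches opening or closing untrusted_context tags, case-insensitively,
// tolerating whitespace or attributes inside the angle brackets.
const TAG_PATTERN = /<\s*\/?\s*untrusted_context\b[^>]*>/gi;

// Hypothetical placeholder; any non-empty marker that cannot form a tag works.
const PLACEHOLDER = '[removed_untrusted_context_tag]';

/** Neutralizes wrapper look-alikes in the body, then wraps it. */
function wrapUntrusted(body: string | null | undefined): string {
  const neutralized = (body ?? '').replace(TAG_PATTERN, PLACEHOLDER);
  return `<untrusted_context>\n${neutralized}\n</untrusted_context>`;
}
```

Replacing with a non-empty placeholder rather than deleting the match means fragments on either side of a removed tag cannot join up into a new tag. This only keeps the wrapper intact; it does not by itself stop a downstream model from following instructions inside the block.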

galz10 left a comment

Collaborator

[P0] Missing core unit tests.
The PR introduces complex routing, orchestrating validation, DB writes, and Pub/Sub publishing in server.ts without a corresponding server.test.ts. This violates the testing standard criteria. Please add tests that mock the Firestore and Pub/Sub clients to verify the endpoint behavior.
