fix(security): validate GitHub owner/repo format before gh CLI call #198

Merged: BillChirico merged 5 commits into main from fix/issue-160 on Mar 2, 2026

Conversation

@BillChirico

Summary

Fixes #160 — GitHub API path traversal via malformed owner/repo config values.

Problem

src/modules/githubFeed.js interpolated owner and repo directly into the gh api path without validation. While execFile (not exec) prevents shell injection, a malformed string like ../../users/admin stored in config could make the bot call arbitrary GitHub API endpoints using its own gh credentials.
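
To see why an unvalidated segment matters, the sketch below builds a path from a hypothetical template and resolves it with Node's built-in WHATWG `URL` class. The `repos/${owner}/${repo}/events` template and the names `buildEventsPath` and `resolveApiPath` are assumptions for illustration (the PR text only says the values were interpolated into the gh api path), and whether gh or the API server normalizes dot segments in exactly this way is also assumed.

```javascript
// Hypothetical path template; the real module's template is not shown in this PR.
function buildEventsPath(owner, repo) {
  return `repos/${owner}/${repo}/events`;
}

// Resolve a relative API path against the GitHub API origin.
// The WHATWG URL parser collapses ".." segments during resolution.
function resolveApiPath(path) {
  return new URL(path, 'https://api.github.com/').pathname;
}

const benign = resolveApiPath(buildEventsPath('octocat', 'hello-world'));
const hostile = resolveApiPath(buildEventsPath('octocat', '../../users/admin'));

console.log(benign);  // /repos/octocat/hello-world/events
console.log(hostile); // /users/admin/events
```

In this illustration the hostile value escapes the `repos/` prefix entirely, which is the class of request the validation is meant to rule out.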

Changes

src/modules/githubFeed.js

  • Added VALID_GH_NAME regex (/^[a-zA-Z0-9._-]+$/) — strict allowlist for GitHub name segments
  • Added isValidGhRepo(owner, repo) helper — validates both segments, rejects non-strings, empty, slashes, metacharacters
  • Guard in fetchRepoEvents() — returns [] and logs a warning instead of calling gh if validation fails
  • Strengthened guard in pollGuildFeed() — uses isValidGhRepo instead of the weaker !owner || !repo check
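
A minimal sketch of how such a guard could look, not the module's actual code. It uses the stricter pattern quoted in the review comments further down this page (first character must be alphanumeric). `fetchRepoEventsSketch`, `runGh`, and `logWarn` are hypothetical names: the last two are injected stand-ins for the real `execFile` call and logger, and the `repos/.../events` path is an assumption.

```javascript
// Pattern as quoted in the review of this PR.
const VALID_GH_NAME = /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/;

// Both segments must be strings that match the allowlist.
// The typeof checks matter: RegExp.prototype.test coerces its argument,
// so a number such as 123 would otherwise pass as "123".
function isValidGhRepo(owner, repo) {
  return (
    typeof owner === 'string' &&
    typeof repo === 'string' &&
    VALID_GH_NAME.test(owner) &&
    VALID_GH_NAME.test(repo)
  );
}

// Hypothetical guard: runGh and logWarn are injected for illustration only.
async function fetchRepoEventsSketch(owner, repo, runGh, logWarn) {
  if (!isValidGhRepo(owner, repo)) {
    // Log the rejected values, then bail out before any CLI call is made.
    logWarn('Invalid GitHub owner/repo, skipping fetch', { owner, repo });
    return [];
  }
  // Assumed endpoint shape; the real module may differ.
  return runGh(['api', `repos/${owner}/${repo}/events`]);
}
```

Injecting the runner keeps the sketch free of any real process spawning, so the rejection path can be exercised without gh installed.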

tests/modules/githubFeed.test.js

  • 19 new tests covering: valid names, path traversal, slashes, empty strings, non-strings, spaces, shell metacharacters, CLI guard behavior, and warn() audit trail

Test Results

✓ tests/modules/githubFeed.test.js (55 tests) — all pass
Lint: 0 errors, 1 pre-existing warning (unrelated React key)

Prevents API path traversal by validating owner/repo segments against
a strict allowlist regex before interpolating them into the gh CLI
invocation.

Adds:
- VALID_GH_NAME regex (/^[a-zA-Z0-9._-]+$/)
- isValidGhRepo() helper (exported for testing)
- Guard in fetchRepoEvents() — returns [] and warns on invalid input
- Strengthened guard in pollGuildFeed() split logic

Fixes #160

Covers isValidGhRepo(), VALID_GH_NAME regex, and fetchRepoEvents()
validation guard introduced in fix for #160.

19 new tests verify:
- Valid alphanumeric/dot/hyphen/underscore names pass
- Path traversal (../../etc/passwd) is rejected at both entry points
- Slashes, empty strings, non-strings, spaces all rejected
- Shell metacharacters (; && $()) blocked
- gh CLI is NOT invoked when validation fails
- warn() fires with the invalid values (observable audit trail)
- Valid owner/repo still reach gh CLI unchanged
Copilot AI review requested due to automatic review settings March 2, 2026 04:08
Copilot AI left a comment

Pull request overview

This PR addresses GitHub issue #160 by adding input validation for the owner and repo fields before they are interpolated into a gh api CLI path in src/modules/githubFeed.js. It introduces a regex constant, a validation helper, and guards in both fetchRepoEvents and pollGuildFeed, along with 19 new tests.

Changes:

  • Added VALID_GH_NAME regex and isValidGhRepo() helper to validate GitHub name segments before CLI use
  • Added validation guards in fetchRepoEvents() and pollGuildFeed() that reject and log invalid inputs
  • Added 19 test cases in the test suite covering valid names, traversal patterns, edge types, and CLI guard behavior

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/modules/githubFeed.js | Adds VALID_GH_NAME regex, isValidGhRepo() helper, and validation guards in fetchRepoEvents and pollGuildFeed |
| tests/modules/githubFeed.test.js | Adds warn import and 19 new test cases covering the new validation logic |

@greptile-apps
greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR successfully fixes the GitHub API path traversal vulnerability (issue #160) by adding strict input validation before calling the gh CLI. The fix prevents malicious config values like ../../users/admin from being used in API paths.

Key changes:

  • Added VALID_GH_NAME regex requiring names start with alphanumeric characters
  • Created isValidGhRepo() helper that validates both owner and repo segments
  • Added guards in fetchRepoEvents() and pollGuildFeed() that reject invalid inputs with warning logs
  • Comprehensive test suite with 19 new tests covering path traversal, shell metacharacters, and edge cases

Security validation: The regex correctly blocks path traversal (../), slashes, and shell metacharacters while allowing legitimate GitHub repo names. Using execFile (not exec) already prevents shell injection, so this adds defense-in-depth against API path manipulation.

Note: src/commands/github.js uses a weaker validation pattern that should be updated for consistency (see inline comment).

Confidence Score: 4/5

  • Safe to merge - the path traversal vulnerability is effectively blocked by the new validation logic
  • Strong security fix with excellent test coverage. Score of 4 (not 5) due to validation inconsistency between githubFeed.js and github.js that could cause UX confusion when repos are accepted via command but fail silently during polling. The security issue itself is fully resolved.
  • src/commands/github.js should be updated to use the same stricter validation regex for consistency (non-blocking)

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/modules/githubFeed.js | Added robust input validation to prevent path traversal attacks via GitHub CLI. Validation regex correctly rejects malformed owner/repo names including path traversal attempts. |
| tests/modules/githubFeed.test.js | Comprehensive test coverage for new validation logic with 19 new security-focused tests covering path traversal, type validation, and edge cases. |

Last reviewed commit: 14a934b

Copilot AI review requested due to automatic review settings March 2, 2026 11:38
@BillChirico BillChirico merged commit 76c5a35 into main Mar 2, 2026
9 of 13 checks passed
@BillChirico BillChirico deleted the fix/issue-160 branch March 2, 2026 11:38
BillChirico added a commit that referenced this pull request Mar 2, 2026
…198)

* fix(security): validate GitHub owner/repo format before gh CLI call

Prevents API path traversal by validating owner/repo segments against
a strict allowlist regex before interpolating them into the gh CLI
invocation.

Adds:
- VALID_GH_NAME regex (/^[a-zA-Z0-9._-]+$/)
- isValidGhRepo() helper (exported for testing)
- Guard in fetchRepoEvents() — returns [] and warns on invalid input
- Strengthened guard in pollGuildFeed() split logic

Fixes #160

* test(security): add validation tests for GitHub owner/repo format

Covers isValidGhRepo(), VALID_GH_NAME regex, and fetchRepoEvents()
validation guard introduced in fix for #160.

19 new tests verify:
- Valid alphanumeric/dot/hyphen/underscore names pass
- Path traversal (../../etc/passwd) is rejected at both entry points
- Slashes, empty strings, non-strings, spaces all rejected
- Shell metacharacters (; && $()) blocked
- gh CLI is NOT invoked when validation fails
- warn() fires with the invalid values (observable audit trail)
- Valid owner/repo still reach gh CLI unchanged

* fix(security): reject pure-dot owner/repo names to prevent path traversal

* test(githubFeed): add tests for pure-dot path traversal bypass

---------

Co-authored-by: Bill Chirico <[email protected]>
BillChirico added a commit that referenced this pull request Mar 2, 2026
* security: escape user content in triage prompt delimiters (#164)

Add escapePromptDelimiters() to HTML-encode < and > in user-supplied
message content before it is inserted between XML-style section tags
in the LLM prompt.

Without escaping, a crafted message containing the literal text
`</messages-to-evaluate>` could break out of the user-content section
and inject attacker-controlled instructions into the prompt structure.

Changes:
- Add escapePromptDelimiters(text) utility exported from triage-prompt.js
- Apply escape to m.content and m.replyTo.content in buildConversationText()
- Add 13 new tests covering the escape function and injection scenarios

Closes #164

* security: escape & chars and author fields in prompt delimiters

* fix(security): escape & in prompt delimiters and escape author fields

- Add & → &amp; escape first in escapePromptDelimiters() to prevent
  HTML entity bypass attacks (e.g. &lt;/messages-to-evaluate&gt;)
- Also escape m.author and m.replyTo.author since Discord display
  names are user-controlled and can contain < / > characters

Addresses review feedback on PR #204.

* fix: guard replyTo.content before .slice() to handle null/undefined

* perf: SQL-based conversation pagination + missing DB indexes (#221)

Fixes three performance bottlenecks identified in code review of
recently merged features (PR #121 conversations viewer, PR #190 AI feedback).

## Changes

### migrations/004_performance_indexes.cjs (new)
Four new indexes targeting hot query paths:

- idx_ai_feedback_guild_created (guild_id, created_at DESC)
  getFeedbackTrend() and getRecentFeedback() filtered by guild_id
  AND created_at but only had a single-column guild_id index, forcing
  a full guild scan + sort on every trend/recent call.

- idx_conversations_content_trgm (GIN, pg_trgm)
  content ILIKE '%...%' search was a sequential scan. GIN/trgm index
  reduces this from O(n) to O(log n * trigram matches).
  Requires pg_trgm extension (added idempotently).

- idx_conversations_guild_created (guild_id, created_at DESC)
  Default 30-day listing query filters guild_id + created_at. The
  existing 3-column (guild_id, channel_id, created_at) composite is
  suboptimal when channel_id is not in the predicate.

- idx_flagged_messages_guild_message (guild_id, message_id)
  Conversation detail + flag endpoints query flagged_messages by
  guild_id AND message_id = ANY(...). Existing index only covers
  (guild_id, status).

### src/api/routes/conversations.js
**GET / — Replace in-memory pagination with SQL CTE grouping**

Before: fetched up to 10,000 message rows into Node memory, grouped
them in JavaScript (O(n) time + memory), then sliced for pagination.
Every page request loaded the full 10k row dataset.

After: single SQL query using window functions (LAG + SUM OVER) to
identify conversation boundaries and aggregate summaries directly.
COUNT(*) OVER() provides total count without a second query.
Pagination happens at the DB with LIMIT/OFFSET on summary rows.
Memory overhead is now proportional to page size (default 25), not
total conversation volume.

Removed now-unused buildConversationSummary() helper (logic inlined
into the SQL-side aggregation).

**POST /:conversationId/flag — Parallel verification queries**

Before: msgCheck and anchorCheck ran sequentially (~2× RTT).
After: both run in parallel via Promise.all (1× RTT for verification).

### tests/api/routes/conversations.test.js
Updated 'should return paginated conversations' test to mock the new
SQL CTE response shape (pre-aggregated summary rows) instead of raw
message rows. All 41 conversation tests pass.

* feat: channel-level quiet mode via bot mention (#173) (#213)

* feat: quiet mode per-channel via bot mention (#173)

- Add quietMode.js module with Redis+memory storage
- Parse duration from natural language (30m, 1 hour, etc.)
- Permission gated via config.quietMode.allowedRoles
- Commands: quiet, unquiet, status
- Suppress AI responses during quiet mode in events.js
- Add quietMode section to config.json (disabled by default)
- Add quietMode to configAllowlist.js for dashboard editing

* test: add quiet mode tests (41 tests, all passing)

* style: fix biome formatting in quietMode.js, events.js, and test

* fix(web): fix ai-feedback-stats TypeScript and formatting errors

* fix: gate quiet mode checks on enabled flag, validate TTL, honor maxDurationMinutes config

- events.js: Wrap isQuietMode() calls in guildConfig.quietMode?.enabled check
  to avoid unnecessary Redis lookups and prevent stale records from suppressing
  AI responses when the feature is disabled (PRRT_kwDORICdSM5xdbmp, PRRT_kwDORICdSM5xdbmx)

- quietMode.js: Add TTL validation in setQuiet() to guard against 0, negative,
  or NaN values that would error in Redis (PRRT_kwDORICdSM5xdbm3)

- quietMode.js: Update parseDurationFromContent() to accept config parameter
  and honor guildConfig.quietMode.maxDurationMinutes. Also clamp defaultSeconds
  to the effective max (PRRT_kwDORICdSM5xdbm_)

- configValidation.js: Add quietMode schema entry with enabled, maxDurationMinutes,
  and allowedRoles properties (PRRT_kwDORICdSM5xdbnH)

* style: fix biome formatting in quietMode.js and ai-feedback-stats.tsx

* feat: audit log improvements — CSV/JSON export and real-time WebSocket stream (#215)

* feat: audit log improvements — CSV/JSON export, real-time WebSocket stream

- Add GET /:id/audit-log/export endpoint (CSV and JSON, up to 10k rows)
- Add /ws/audit-log WebSocket server for real-time audit entry broadcast
- Refactor buildFilters() shared helper to eliminate duplication
- Hook broadcastAuditEntry() into insertAuditEntry (RETURNING id+created_at)
- Wire setupAuditStream/stopAuditStream into startServer/stopServer lifecycle
- Add escapeCsvValue/rowsToCsv helpers with full test coverage
- 30 route tests + 17 WebSocket stream tests, all green

Closes #136

* fix: PR #215 review feedback - audit stream fixes

- ws.ping() crash: guard with readyState check + try/catch to avoid
  crashing heartbeat interval when socket not OPEN
- stopAuditStream race: make setupAuditStream async and await
  stopAuditStream() to prevent concurrent WebSocketServer creation
- Query param array coercion: add typeof === 'string' checks for
  startDate/endDate to handle Express string|string[]|undefined
- CSV CRLF quoting: add \r to RFC 4180 special-char check for proper
  Windows line ending handling
- Test timeouts: make AUTH_TIMEOUT_MS configurable via
  AUDIT_STREAM_AUTH_TIMEOUT_MS env var, use 100ms in tests

* feat: voice channel activity tracking — join/leave/move, leaderboard, export (#212)

* feat: add voice_sessions migration (#135)

* feat: add voice tracking module — join/leave/move/flush/leaderboard (#135)

* feat: wire voiceStateUpdate handler into event registration (#135)

* feat: add /voice command — leaderboard, stats, export subcommands (#135)

* feat: add voice config defaults to config.json (#135)

* feat: wire voice flush start/stop into bot lifecycle (#135)

* feat: add voice to config API allowlist (#135)

* fix: SQL UPDATE subquery for closeSession, fix import order (#135)

* fix(voice): resolve race conditions and missing config schema

- Fix openSession: update in-memory state only AFTER DB INSERT succeeds
- Fix closeSession: delete from in-memory state only AFTER DB UPDATE succeeds
- Fix: allow closeSession on leave/move even when feature is disabled
- Fix migration: add UNIQUE constraint to partial index to prevent duplicates
- Fix: move 'Voice join' log to after openSession succeeds
- Add voice config to CONFIG_SCHEMA for validation

---------

Co-authored-by: Bill <[email protected]>

* feat(dashboard): auto-save config with 500ms debounce (#199)

* feat(dashboard): replace manual save with auto-save (500ms debounce)

- Remove 'Save Changes' button; saving now fires automatically 500ms
  after the last config change (no changes → no network call)
- Add saveStatus state ('idle' | 'saving' | 'saved' | 'error') with
  AutoSaveStatus component showing spinner, check, or error+retry
- Add isLoadingConfigRef guard so the initial fetchConfig load never
  triggers a spurious PATCH
- Ctrl+S still works: clears debounce timer and saves immediately
- Keep 'beforeunload' warning for validation errors that block save
- Replace yellow unsaved-changes banner with a destructive validation
  error banner (only shown when save is actually blocked)
- Error state shows 'Save failed' + 'Retry' button for user recovery

Closes #189

* test(dashboard): add auto-save tests for ConfigEditor

- No PATCH on initial config load
- Validation error banner suppresses auto-save
- 'Saving...' spinner visible while PATCH in-flight
- 'Save failed' + Retry button on PATCH error

* fix(dashboard): prevent fetchConfig from overwriting saveStatus after successful save

Add skipSaveStatusReset parameter to fetchConfig so that post-save reloads
preserve the 'saved' status indicator instead of immediately resetting to 'idle'.

* test(dashboard): use fake timers, restore vi.stubGlobal, fix assertions, add idle/saved coverage

- Replace real setTimeout delays with vi.useFakeTimers() + vi.advanceTimersByTimeAsync()
  for deterministic, fast debounce tests
- Add afterEach cleanup: vi.unstubAllGlobals() + vi.useRealTimers()
- Replace toBeTruthy() with toBeInTheDocument() for Testing Library queries
- Add idle state test (no status indicator shown after load)
- Add saved state test (shows 'Saved' after successful save)
- Update file-level comment to list all four states

---------

Co-authored-by: Bill Chirico <[email protected]>

* feat: Reaction role menus (#162) (#205)

* feat: reaction role menus - core module, command, event hooks, migration

Implements issue #162: reaction role menus.

- Add migration 004 creating reaction_role_menus and reaction_role_entries tables
- Add src/modules/reactionRoles.js with DB helpers, embed builder, event handlers
- Add src/commands/reactionrole.js with /reactionrole create|add|remove|delete|list
- Wire handleReactionRoleAdd/Remove into registerReactionHandlers in events.js

Roles are granted on reaction add and revoked on reaction remove.
All mappings persist in PostgreSQL across bot restarts.

* test: reaction role menus - 40 tests covering module and command

- tests/modules/reactionRoles.test.js: resolveEmojiString, buildReactionRoleEmbed,
  all DB helpers, handleReactionRoleAdd, handleReactionRoleRemove
- tests/commands/reactionrole.test.js: all 5 subcommands (create, add, remove,
  delete, list) including error paths and guild ownership checks
- Fix biome lint: import sort order + unused import removal

* fix: remove unused import in reactionrole command

---------

Co-authored-by: Bill Chirico <[email protected]>

* fix(security): validate GitHub owner/repo format before gh CLI call (#198)

---------

Co-authored-by: Bill <[email protected]>
Co-authored-by: Bill Chirico <[email protected]>
Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


Comment on lines +39 to +40
    owner.length > 0 &&
    repo.length > 0 &&
Copilot AI Mar 2, 2026

The owner.length > 0 and repo.length > 0 checks are redundant because VALID_GH_NAME (/^[a-zA-Z0-9][a-zA-Z0-9._-]*$/) already requires at least one character and returns false for empty strings. The length guards can be safely removed to simplify the condition.
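
The redundancy can be confirmed in isolation. The sketch below uses the pattern exactly as quoted in this comment; the names `withLengthGuard` and `regexOnly` are hypothetical and only contrast the two forms of the condition.

```javascript
const VALID_GH_NAME = /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/;

// Condition with the explicit length guard, as in the reviewed lines.
const withLengthGuard = (s) => s.length > 0 && VALID_GH_NAME.test(s);

// Condition relying on the regex alone.
const regexOnly = (s) => VALID_GH_NAME.test(s);

// The first character class must match one character, so '' never passes.
const samples = ['', 'a', 'volvox-bot', '..', 'a/b'];
const agree = samples.every((s) => withLengthGuard(s) === regexOnly(s));

console.log(agree);         // true
console.log(regexOnly('')); // false
```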

Comment on lines +771 to +775
    it('returns [] and warns for path traversal in repo', async () => {
      const result = await fetchRepoEvents('owner', '../../users/admin');
      expect(result).toEqual([]);
      expect(execFile).not.toHaveBeenCalled();
    });
Copilot AI Mar 2, 2026

The test name says "returns [] and warns for path traversal in repo" but the test body does not verify that warn was called (unlike the analogous test for path traversal in owner at line 761, which asserts expect(warn).toHaveBeenCalledWith(...)). This makes the test name misleading — either the assertion for warn should be added to make it consistent with the test at line 761, or the test name should be changed to remove "and warns".

     *
     * @see https://github.com/VolvoxLLC/volvox-bot/issues/160
     */
    export const VALID_GH_NAME = /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/;
Copilot AI Mar 2, 2026

The PR description states the VALID_GH_NAME regex is /^[a-zA-Z0-9._-]+$/, but the actual implementation at line 26 is /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/. The implementation is stronger (and correct, as it requires the first character to be alphanumeric, which the PR description's regex would not), but the PR description is inaccurate and could cause confusion when reviewing the change intent.
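
The practical difference between the two patterns can be checked directly. In the sketch below, `LOOSE` and `STRICT` are hypothetical names for the pattern in the PR description and the pattern quoted from the implementation. The sample names are illustrative and are not taken from the PR's test suite.

```javascript
// Pattern as written in the PR description.
const LOOSE = /^[a-zA-Z0-9._-]+$/;
// Pattern as quoted from the implementation.
const STRICT = /^[a-zA-Z0-9][a-zA-Z0-9._-]*$/;

// Illustrative inputs: one ordinary name, names that start with a
// non-alphanumeric character, and one containing a slash.
const samples = ['volvox-bot', '..', '.', '-rf', '.github', 'a/b'];

const results = samples.map((name) => ({
  name,
  loose: LOOSE.test(name),
  strict: STRICT.test(name),
}));

for (const r of results) {
  console.log(`${r.name}\tloose=${r.loose}\tstrict=${r.strict}`);
}
```

With these inputs, `..`, `.`, `-rf`, and `.github` pass the description's pattern but fail the implementation's, while `a/b` fails both. The pure-dot cases appear to be what the later "reject pure-dot owner/repo names" commit in this PR targets. The stricter pattern also rejects any name that begins with a dot.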


Development

Successfully merging this pull request may close these issues.

security: validate GitHub feed owner/repo format before CLI execution

2 participants