feat: server config backup and restore #209
Conversation
- Export all config sections to timestamped JSON files (secrets redacted)
- Import config from JSON payload (skips [REDACTED] values)
- Scheduled automatic backups every 24h with retention pruning
- Backup history/versioning: list, download, restore by ID
- Path-traversal protection on backup ID inputs
- API routes: GET /backups, POST /backups, GET /export, POST /import, GET /:id/download, POST /:id/restore, POST /prune
- Closes #129
Pull request overview
Adds server config backup/export/import functionality, including an admin-only REST API surface and a scheduled backup job, to address recovery needs described in #129.
Changes:
- Introduces a new `backup` module for config export/import, backup file management, pruning, and scheduling.
- Adds `/api/v1/backups/*` endpoints for export/import, listing, download, restore, and pruning.
- Starts/stops scheduled backups from the main bot lifecycle and adds unit/integration tests for the new behavior.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/modules/backup.js | Implements config export/import and filesystem-backed backup history, pruning, and scheduling. |
| src/api/routes/backup.js | Adds admin-guarded API routes for backup/export/import and retention pruning. |
| src/api/index.js | Mounts the new backup router under /api/v1/backups. |
| src/index.js | Wires scheduled backups into startup/shutdown. |
| tests/modules/backup.test.js | Unit tests for the backup module behaviors. |
| tests/api/routes/backup.test.js | Integration tests for backup API auth and route behavior (with module mocks). |
| Filename | Overview |
|---|---|
| src/modules/backup.js | Core backup module with export/import, scheduling, and retention - uses synchronous fs operations (acknowledged in TODO) |
| src/api/routes/backup.js | API routes for backup operations - properly secured with requireGlobalAdmin middleware |
| src/api/middleware/requireGlobalAdmin.js | Authorization middleware restricting access to API-secret or bot-owner users - supports both 3 and 4 argument calling styles |
| src/utils/flattenToLeafPaths.js | Utility to flatten nested objects to dot-paths - includes prototype pollution protection |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Bot Startup] --> B[startScheduledBackups]
    B --> C{Every 24h}
    C --> D[createBackup]
    D --> E[exportConfig]
    E --> F[Get config from getConfig]
    F --> G[Filter SAFE_CONFIG_KEYS]
    G --> H[sanitizeConfig - redact secrets]
    H --> I[Write to data/backups/*.json]
    I --> J[pruneBackups]
    J --> K{Apply retention policy}
    K --> L[Keep 7 daily + 4 weekly]
    L --> M[Delete old backups]
    N[API: GET /backups/export] --> E
    O[API: POST /backups/import] --> P[validateImportPayload]
    P --> Q[importConfig]
    Q --> R{For each config path}
    R --> S{Is REDACTED?}
    S -->|Yes| T[Skip - preserve live secret]
    S -->|No| U[setConfigValue]
    V[API: POST /backups/:id/restore] --> W[readBackup - validate ID pattern]
    W --> X{Path traversal check}
    X -->|Valid| Q
    X -->|Invalid| Y[Error: Invalid backup ID]
    Z[All API routes] --> AA[requireAuth]
    AA --> AB[requireGlobalAdmin]
    AB --> AC{authMethod?}
    AC -->|api-secret| AD[Allow]
    AC -->|oauth + bot-owner| AD
    AC -->|Other| AE[403 Forbidden]
```
Last reviewed commit: 1d98179
```js
const filename = `${id}.json`;
const filePath = path.join(dir, filename);

if (!existsSync(filePath)) {
```
Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High)
Copilot Autofix
AI 12 days ago
To fix the issue in a robust, generally recommended way, the path derived from user input should be normalized and checked to ensure it stays within a designated root directory. In this context, the root is the backup directory returned by `getBackupDir(backupDir)`. Even though `id` is already strongly validated, we can still apply the standard pattern: resolve `filePath` relative to `dir`, then verify that the resolved path starts with `dir` (after both are normalized/absolute).
The best minimal fix without changing existing behavior is:
- In `readBackup` (in `src/modules/backup.js`), keep the existing strict `BACKUP_ID_PATTERN` validation.
- After constructing `filePath = path.join(dir, filename)`, resolve it to an absolute path and ensure it is inside `dir`: `const resolvedDir = path.resolve(dir); const resolvedFilePath = path.resolve(resolvedDir, filename);`
- If `!resolvedFilePath.startsWith(resolvedDir + path.sep)` and `resolvedFilePath !== resolvedDir`, throw an error (or treat it as "not found").
- Use `resolvedFilePath` for `existsSync` and `readFileSync` instead of `filePath`.
This adds a strong defense-in-depth containment check and will address all the path-based variants CodeQL is flagging, while keeping the external behavior (input format, errors, etc.) effectively the same.
```diff
@@ -290,8 +290,14 @@

   const dir = getBackupDir(backupDir);
   const filename = `${id}.json`;
-  const filePath = path.join(dir, filename);
+  const resolvedDir = path.resolve(dir);
+  const filePath = path.resolve(resolvedDir, filename);
+
+  // Ensure the resolved file path is contained within the backup directory
+  if (filePath !== resolvedDir && !filePath.startsWith(resolvedDir + path.sep)) {
+    throw new Error('Invalid backup path');
+  }

   if (!existsSync(filePath)) {
     throw new Error(`Backup not found: ${id}`);
   }
```
```js
  throw new Error(`Backup not found: ${id}`);
}

const raw = readFileSync(filePath, 'utf8');
```
Check failure — Code scanning / CodeQL: Uncontrolled data used in path expression (High)
Copilot Autofix
AI 12 days ago
In general, to fix uncontrolled path usage you either (a) strictly constrain the allowed filenames/IDs with a whitelist, or (b) normalize and then verify that the resulting path remains under a trusted root directory. This code already does (a) via a strict regex for id. To both satisfy CodeQL and further harden the logic, we can additionally enforce (b): after constructing the path from the validated ID, resolve it and ensure it is still under the backup directory before reading.
Concretely, in `src/modules/backup.js` inside `readBackup`:
- Keep the existing strict `BACKUP_ID_PATTERN` check, since it is good defense-in-depth.
- After computing `const dir = getBackupDir(backupDir);` and ``const filename = `${id}.json`;``, compute a normalized absolute path, and ensure it starts with the absolute backup directory path.
- Use that safe, normalized path when checking existence and when reading.
- Use only existing, already-imported modules (`path` and `fs`); no new dependencies are required.
This means modifying lines 291–299 of readBackup to resolve dir and filePath and add a containment check. No changes are needed to src/api/routes/backup.js because all sanitization happens in the module.
```diff
@@ -289,9 +289,15 @@
 }

 const dir = getBackupDir(backupDir);
+const safeDir = path.resolve(dir);
 const filename = `${id}.json`;
-const filePath = path.join(dir, filename);
+const filePath = path.resolve(safeDir, filename);
+
+// Ensure the resolved file path is within the backup directory
+if (!filePath.startsWith(safeDir + path.sep)) {
+  throw new Error('Invalid backup ID');
+}

 if (!existsSync(filePath)) {
   throw new Error(`Backup not found: ${id}`);
 }
```
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 12 comments.
```js
const DEFAULT_BACKUP_DIR = path.join(__dirname, '..', '..', 'data', 'backups');

/** Backup file naming pattern */
const BACKUP_FILENAME_PATTERN = /^backup-(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-\d{3})-\d{4}\.json$/;
```
`backupSeq` grows monotonically and, once it reaches 10000, generated filenames/IDs will no longer match `BACKUP_FILENAME_PATTERN` / `BACKUP_ID_PATTERN` (both require exactly 4 digits). At that point backups become unlistable and unreadable. Consider changing the patterns to allow variable-length sequence digits (e.g., `\d+`) or switching from a global counter to a bounded scheme (e.g., per-timestamp counter) or a random/UUID suffix, and keep parsing/sorting consistent with the new format.
```diff
- const BACKUP_FILENAME_PATTERN = /^backup-(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-\d{3})-\d{4}\.json$/;
+ const BACKUP_FILENAME_PATTERN = /^backup-(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-\d{3})-\d+\.json$/;
```
```js
// Include milliseconds for precision; append an incrementing sequence to guarantee uniqueness
// within the same millisecond (e.g. rapid test runs or burst backup triggers).
const iso = date.toISOString().replace(/:/g, '-').replace(/Z$/, '').replace(/\./, '-');
const seq = String(backupSeq++).padStart(4, '0');
```
```diff
- const seq = String(backupSeq++).padStart(4, '0');
+ // Bound the sequence to 0–9999 so it always fits the 4-digit filename/ID pattern.
+ const seq = String(backupSeq++ % 10000).padStart(4, '0');
```
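The overflow the reviewer describes can be reproduced directly with the two regex variants (copied from the patterns shown above):

```javascript
// Exactly-4-digit sequence suffix vs. variable-length suffix.
const STRICT  = /^backup-(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-\d{3})-\d{4}\.json$/;
const RELAXED = /^backup-(\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}-\d{3})-\d+\.json$/;

const ok   = 'backup-2026-03-01T12-00-00-000-0042.json';  // seq 42, padded to 4 digits
const over = 'backup-2026-03-01T12-00-00-000-10000.json'; // seq 10000: 5 digits

console.log(STRICT.test(ok));    // matches while seq fits in 4 digits
console.log(STRICT.test(over));  // fails: the backup becomes invisible to listing
console.log(RELAXED.test(over)); // still matches with \d+
```

Either relaxing the pattern or bounding the counter resolves the mismatch; the important part is that generation and parsing agree on one format.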
```js
function parseBackupMeta(filename, dir) {
  const match = BACKUP_FILENAME_PATTERN.exec(filename);
  if (!match) return null;

  const filePath = path.join(dir, filename);
  let size = 0;
  try {
    size = statSync(filePath).size;
  } catch {
    return null;
  }

  // Convert "2026-03-01T12-00-00-000" → "2026-03-01T12:00:00.000Z"
  const isoStr = match[1].replace(/T(\d{2})-(\d{2})-(\d{2})-(\d{3})$/, 'T$1:$2:$3.$4Z');

  return {
    id: filename.replace('.json', ''),
    filename,
    createdAt: isoStr,
    size,
  };
}
```
The sort key only uses `createdAt`, derived from the timestamp portion of the filename, and ignores the sequence suffix. If multiple backups are created in the same millisecond, they will receive identical `createdAt` values and ordering becomes unstable; retention pruning may delete a "newer" backup arbitrarily. Consider including the sequence in metadata (capture it in the filename regex) and sorting by `(createdAt, seq)`, or simply sort by filename descending, since it embeds both timestamp and sequence.
```js
const backups = files
  .map((filename) => parseBackupMeta(filename, dir))
  .filter(Boolean)
  .sort((a, b) => new Date(b.createdAt) - new Date(a.createdAt));
```
```diff
-   .sort((a, b) => new Date(b.createdAt) - new Date(a.createdAt));
+   .sort((a, b) => b.filename.localeCompare(a.filename));
```
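A quick check of the filename-descending ordering suggested above, with illustrative filenames (not taken from the PR's test fixtures): because the zero-padded sequence sorts lexicographically, same-millisecond backups keep a stable order.

```javascript
// Filenames embed timestamp + zero-padded sequence, so a plain descending
// string sort orders same-millisecond backups newest-first.
const files = [
  'backup-2026-03-01T12-00-00-000-0002.json',
  'backup-2026-03-01T12-00-00-000-0010.json', // same ms, later sequence
  'backup-2026-03-01T11-59-59-999-0001.json',
];
const newestFirst = [...files].sort((a, b) => b.localeCompare(a));
console.log(newestFirst[0]); // 'backup-2026-03-01T12-00-00-000-0010.json'
```

Note this only stays correct while the sequence field has a fixed width (or the pattern/sort are updated together, per the earlier comment about the 4-digit limit).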
```js
export function flattenToLeafPaths(obj, prefix) {
  const results = [];

  for (const [key, value] of Object.entries(obj)) {
    if (DANGEROUS_KEYS.has(key)) continue;
    const path = `${prefix}.${key}`;

    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      results.push(...flattenToLeafPaths(value, path));
    } else {
      results.push([path, value]);
    }
  }
```
The JSDoc says arrays are treated as leaves, but if `obj` itself is an array, `Object.entries(obj)` will iterate indices and flatten elements into `prefix.0`, `prefix.1`, etc. This also conflicts with `validateImportPayload` allowing section values to be arrays. Either (a) treat non-plain objects (including arrays) as a leaf at the start of the function (return `[[prefix, obj]]`), or (b) tighten validation to disallow arrays at the section root so imports don't produce index-based dot paths.
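A minimal repro of the array-at-root behavior, plus a sketch of fix (a). These simplified helpers omit the `DANGEROUS_KEYS` prototype-pollution guard for brevity and are not the PR's actual implementation:

```javascript
// Repro: Object.entries on an array iterates indices, leaking paths like 'roles.0'.
function flattenNaive(obj, prefix) {
  const out = [];
  for (const [key, value] of Object.entries(obj)) {
    const p = `${prefix}.${key}`;
    if (typeof value === 'object' && value !== null && !Array.isArray(value)) {
      out.push(...flattenNaive(value, p));
    } else {
      out.push([p, value]);
    }
  }
  return out;
}

// Fix (a): check the value itself first, so arrays (and primitives) are leaves.
function flattenFixed(obj, prefix) {
  if (typeof obj !== 'object' || obj === null || Array.isArray(obj)) {
    return [[prefix, obj]]; // array at the root stays a single leaf
  }
  const out = [];
  for (const [key, value] of Object.entries(obj)) {
    out.push(...flattenFixed(value, `${prefix}.${key}`));
  }
  return out;
}

console.log(flattenNaive(['a', 'b'], 'roles')); // index paths leak out
console.log(flattenFixed(['a', 'b'], 'roles')); // one leaf entry
```

With the leaf check hoisted to the top, nested arrays are also preserved as whole values rather than exploded into index paths.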
```js
const badFile = join(tmpDir, 'backup-2020-01-01T00-00-00.json');
writeFileSync(badFile, JSON.stringify({ bad: 'payload' }), 'utf8');
expect(restoreBackup('backup-2020-01-01T00-00-00', tmpDir)).rejects.toThrow(
```
These tests use backup IDs/filenames that don’t match the module’s strict ID format (backup-YYYY-MM-DDTHH-mm-ss-SSS-NNNN). As written, readBackup/restoreBackup should throw Invalid backup ID before it can reach “not found”, “corrupted”, or “Invalid backup format” paths. Update the test IDs/filenames to include milliseconds + sequence, or relax the production ID validation if older formats are intended to be supported.
```diff
- const badFile = join(tmpDir, 'backup-2020-01-01T00-00-00.json');
+ const badFile = join(tmpDir, 'backup-2020-01-01T00-00-00-000-0001.json');
  writeFileSync(badFile, JSON.stringify({ bad: 'payload' }), 'utf8');
- expect(restoreBackup('backup-2020-01-01T00-00-00', tmpDir)).rejects.toThrow(
+ expect(restoreBackup('backup-2020-01-01T00-00-00-000-0001', tmpDir)).rejects.toThrow(
```
```js
export function requireGlobalAdmin(forResource, req, res, next) {
  // Support both requireGlobalAdmin(req, res, next) and requireGlobalAdmin('Resource', req, res, next)
  if (arguments.length === 3) {
    // Called as requireGlobalAdmin(req, res, next)
    // Parameters are shifted: forResource=req, req=res, res=next, next=undefined
    next = res; // res parameter is actually the next function
    res = req; // req parameter is actually the res object
    req = forResource; // forResource is the actual req object
    forResource = 'Global admin access';
  } else {
    forResource = forResource || 'Global admin access';
```
The “argument shifting” overload makes the middleware hard to reason about and easy to misuse (especially if future callers pass unexpected arity). A clearer pattern is to export a standard (req, res, next) middleware and optionally provide a small factory/wrapper (e.g., requireGlobalAdminFor('Backup access')) that returns (req,res,next)=>..., avoiding reliance on arguments.length and parameter reassignment.
```diff
- export function requireGlobalAdmin(forResource, req, res, next) {
-   // Support both requireGlobalAdmin(req, res, next) and requireGlobalAdmin('Resource', req, res, next)
-   if (arguments.length === 3) {
-     // Called as requireGlobalAdmin(req, res, next)
-     // Parameters are shifted: forResource=req, req=res, res=next, next=undefined
-     next = res; // res parameter is actually the next function
-     res = req; // req parameter is actually the res object
-     req = forResource; // forResource is the actual req object
-     forResource = 'Global admin access';
-   } else {
-     forResource = forResource || 'Global admin access';
+ export function requireGlobalAdmin(forResourceOrReq, resOrReq, nextOrRes, maybeNext) {
+   // Support both requireGlobalAdmin(req, res, next) and requireGlobalAdmin('Resource', req, res, next)
+   let forResource = 'Global admin access';
+   let req;
+   let res;
+   let next;
+   if (typeof forResourceOrReq === 'string') {
+     // Called as requireGlobalAdmin('Resource', req, res, next)
+     forResource = forResourceOrReq || 'Global admin access';
+     req = resOrReq;
+     res = nextOrRes;
+     next = maybeNext;
+   } else {
+     // Called as requireGlobalAdmin(req, res, next)
+     req = forResourceOrReq;
+     res = resOrReq;
+     next = nextOrRes;
```
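The factory pattern the reviewer mentions can be sketched as below. `requireGlobalAdminFor` is an illustrative name, and the `req.auth` shape (`method`, `isBotOwner`) is an assumption based on the flowchart's auth branches, not the PR's exact request object:

```javascript
// Factory returning a standard (req, res, next) middleware, avoiding the
// arguments.length overload and parameter reassignment entirely.
function requireGlobalAdminFor(forResource = 'Global admin access') {
  return (req, res, next) => {
    const method = req.auth && req.auth.method;
    // Mirror the flowchart: api-secret allowed, oauth allowed only for bot owner.
    if (method === 'api-secret' || (method === 'oauth' && req.auth.isBotOwner)) {
      return next();
    }
    return res.status(403).json({ error: `${forResource} requires a global admin` });
  };
}
```

Routes would then mount `requireGlobalAdminFor('Backup access')` directly, and the no-argument call `requireGlobalAdminFor()` covers the generic case, so both calling styles survive without arity tricks.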
```js
(_req, res) => {
  const backups = listBackups();
  res.json(backups);
```
These route handlers call synchronous filesystem-based operations (listBackups, createBackup, readBackup), which will block the Node.js event loop while reading/writing files—especially noticeable on slower disks or larger backup payloads. Consider moving the backup module to fs.promises + async APIs and awaiting them in the handlers so concurrent requests remain responsive.
```diff
- (_req, res) => {
-   const backups = listBackups();
-   res.json(backups);
+ async (_req, res) => {
+   try {
+     const backups = await listBackups();
+     return res.json(backups);
+   } catch (err) {
+     return res.status(500).json({ error: 'Failed to list backups', details: err.message });
+   }
```
```js
try {
  const meta = createBackup();
  return res.status(201).json({ id: meta.id, size: meta.size, createdAt: meta.createdAt });
} catch (err) {
  return res.status(500).json({ error: 'Failed to create backup', details: err.message });
}
},
```
```js
(req, res) => {
  const { id } = req.params;

  try {
    const payload = readBackup(id);
    const filename = `${id}.json`;
    res.setHeader('Content-Disposition', `attachment; filename="${filename}"`);
    res.setHeader('Content-Type', 'application/json');
    return res.json(payload);
  } catch (err) {
```
Summary

Implements config backup and restore functionality as described in #129.

Features
- Config Export
- Config Import
- Backup History/Versioning
- Scheduled Automatic Backups

Tests

Closes #129