Add Database Target and SSL Debug Logs for Scheduler Startup #3361 by mkf-sysangels · Pull Request #3362 · bunkerity/bunkerweb

mkf-sysangels · 2026-03-19T09:22:48Z

Add Database Target and SSL Debug Logs for Scheduler Startup #3361

…y#3361 Add Database Target and SSL Debug Logs for Scheduler Startup bunkerity#3361

mkf-sysangels · 2026-03-19T09:30:34Z

Single db-tls-validate.sh helper script so we can call it from linux scheduler and docker

mkf-sysangels · 2026-03-19T16:02:32Z

Hello @TheophileDiot

the PR is ready for code audit and public preview

TheophileDiot · 2026-04-10T08:08:28Z

@coderabbitai full review

coderabbitai · 2026-04-10T08:08:34Z

✅ Actions performed

Full review triggered.

coderabbitai · 2026-04-10T08:08:46Z

Summary by Component

src/common

Database.py: Enhanced connection failure logging with URI masking and SSL/TLS diagnostics. Added _mask_db_uri() helper to redact passwords before logging DATABASE_URI, preventing credential leaks. Introduced SSL/TLS detection (_ssl_expectations()) to identify when SSL is enabled but no root CA is configured, emitting targeted diagnostic messages during initial connection retries. Connection retry and fatal error logs now include masked URI instead of raw exception strings. (+83 lines)

db-tls-validate.sh: New shared shell helper script providing database URI parsing and TLS certificate validation utilities for both Linux scheduler and Docker environments. Exports three functions:

parse_database_uri(): Parses DATABASE_URI to extract type, database, host, port, user, ssl_enabled, and pem_path
validate_db_tls(): Conditional TLS validation with automatic system CA bundle detection
validate_db_tls_certificate(): Performs certificate and connection validation via openssl s_client, supporting protocol-specific STARTTLS modes for PostgreSQL and MySQL/MariaDB
(+337 lines)

src/linux

bunkerweb-scheduler.sh: Added early DATABASE_URI parsing via new helper, deriving db target fields with fallback defaults on parse failure. Introduced conditional DB TLS validation for non-sqlite targets (continues execution if validation fails). Added logging of parsed scheduler DB target details. (+29 lines)

src/scheduler

entrypoint.sh: Mirrors Linux scheduler changes—sources db-tls-validate.sh, parses DATABASE_URI, performs pre-connection TLS validation for non-sqlite targets, and logs parsed DB target fields. (+29 lines)

main.py: Added _db_target_details() helper function that parses SQLAlchemy URL to log formatted "type/database/host/user" descriptor on startup; falls back to "unknown/none/local" on parsing failure. Refactored to store _db_uri separately for clarity. (+22 lines)

Impact Assessment

Security: Improves SSL/TLS diagnostics at scheduler startup; URI masking prevents credential exposure in logs.

User-visible Changes: Enhanced startup logging now reports parsed database target details and TLS validation outcomes; no functional behaviour changes.

Deployment: Adds shell helper script; relies on standard OpenSSL, Python URL parsing, and system CA bundle paths.

Configuration/Schema: None.

Documentation/Tests: Not addressed in this PR.

Walkthrough

Database URI masking and SSL/TLS diagnostics are now integrated across database initialisation pathways. A new shell utility provides URI parsing and certificate validation, called during scheduler startup for pre-connection verification of database TLS configurations.

Changes

Cohort / File(s)	Summary
Database URI masking and SSL diagnostics `src/common/db/Database.py`	Added `_mask_db_uri()` function with SQLAlchemy fallback to redact passwords from error logs. Implemented `_ssl_expectations()` helper to detect SSL enablement and root CA configuration. Enhanced connection retry loop with specialised TLS-related diagnostics and masked URI inclusion in timeout, retry, and fatal error messages.
TLS validation helper script `src/common/helpers/db-tls-validate.sh`	New shared utility defining three functions: `parse_database_uri` (parses `DATABASE_URI` to extract `type`, `database`, `host`, `port`, `user`, `ssl_enabled`, `pem_path`), `validate_db_tls` (conditionally validates TLS configuration with system CA bundle fallback), and `validate_db_tls_certificate` (performs OpenSSL connection and X.509 validation checks).
Scheduler and entrypoint TLS integration `src/linux/scripts/bunkerweb-scheduler.sh`, `src/scheduler/entrypoint.sh`	Each script sources `db-tls-validate.sh`, parses `DATABASE_URI`, sets `db_ssl_enabled` and `db_pem_path` variables, invokes pre-connection TLS validation for non-SQLite targets, and logs parsed database target details.
Scheduler main refactoring `src/scheduler/main.py`	Extracted `_db_uri` into a local variable within `backup_failover()`. Added `_db_target_details(uri)` helper to format and log database connection target as "type/database/host/user" string, defaulting to "unknown/none/local" on parsing failure.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Poem

🔐 URIs masked with care and grace,
SSL whispers in their proper place,
Certificates dance through the shell,
Scheduler logs what's safe to tell,
Security checks before the bell! 🔔

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarises the main changeset objective: adding database target and SSL debug logging for scheduler startup, with the issue reference included.
Description check	✅ Passed	The description directly mirrors the title and relates to the changeset's objective of adding database target and SSL debug logging, though it lacks implementation detail.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/common/db/Database.py`:
- Around line 352-357: The error logs in Database (self.logger.error calls that
currently include DATABASE_URI) must stop emitting the full connection URI and
must not prefer self.database_uri over the readonly URI when reporting which
target failed; update each logger.error (the ones after retry failures using
DATABASE_RETRY_TIMEOUT and in the dual-endpoint failure paths) to omit the URI
and instead log a non-secret target label plus the backend name (use a clear
label like "primary" or "readonly" derived from whether the attempt was against
self.database_uri or self.database_uri_readonly, and include
self.database_backend or similar backend identifier). Ensure the message uses
DATABASE_RETRY_TIMEOUT and the caught exception e, but replace any
_mask_db_uri(...) or direct URI interpolation with the non-secret target label
and backend name.
- Around line 333-341: The TLS hint logs inside the retry loop currently use
self.logger.error and should be lowered to avoid false-positive alerts; change
the two self.logger.error calls in the retrying connection path (the block that
logs "Database connection failed with SSL/TLS enabled..." and the subsequent
instruction about setting ssl_ca/sslrootcert) to self.logger.warning or
self.logger.debug (prefer warning for actionable hints, debug if very verbose)
and ensure only the final failure after exhausting retries still logs as error;
locate these calls in the Database class connection/retry method (the method
that performs the retry loop and currently calls self.logger.error with the
exception variable e) and update them accordingly.
- Around line 214-223: The helper functions _mask_db_uri and _ssl_expectations
currently catch broad Exception around sqlalchemy.engine.url.make_url(uri);
change those handlers to catch SQLAlchemy's specific ArgumentError instead
(import ArgumentError from sqlalchemy.exc) so only invalid-URI parsing falls
back to the existing masking logic and other unexpected errors propagate; keep
the existing fallback masking behavior unchanged.

In `@src/common/helpers/db-tls-validate.sh`:
- Around line 76-82: The printed assignments for variables (db_type, db_name,
host, port, user, ssl_enabled, pem_path) are emitted raw but are later consumed
via eval; update the print statements to emit shell-escaped values so untrusted
content cannot inject commands—use a shell-quoting helper such as Python's
shlex.quote (or an equivalent escape routine) to wrap each value before printing
(e.g., replace the current print(...) calls with prints that output
safely-quoted assignments using shlex.quote on db_type, db_name, host, port,
user, ssl_enabled, and pem_path).
- Around line 203-224: The Python expiry-check block (started in
cert_check=$(... python3 - << 'PY') only calls
x509.load_pem_x509_certificates(data) which fails for DER input; update the code
inside that heredoc to first try loading PEM via
x509.load_pem_x509_certificates(Path(pem_file).read_bytes()) and on ValueError
fall back to loading DER (use x509.load_der_x509_certificate for a single cert
or decode multiple DER certs if needed), then proceed to compute expiries as
before; ensure exceptions are still caught and that the PEM_FILE/ pem_path
variable names (PEM_FILE, pem_path) and the cert_check variable remain unchanged
so the surrounding shell logic continues to work.
- Around line 273-281: The openssl invocations build unsafe unquoted command
strings and do not handle IPv6 literals; update the code that constructs
direct_cmd and the later ip-based command to (1) add a helper (e.g.,
format_connect_target) that accepts host/ip and port and returns a properly
bracketed target for IPv6 like "[2001:db8::1]:5432" and plain "host:port" for
IPv4/hostnames, (2) stop concatenating into a single command string and instead
build an array for the command (e.g., cmd=(openssl s_client -connect "$target")
and append -CAfile "$pem_path" and -starttls options as separate array
elements), and (3) use the array form with timeout (or exec) so variables are
quoted and paths with spaces are safe; apply this change for both the direct_cmd
construction around direct_cmd/direc_cmd usage and the ip-based openssl
invocation in the later block. Ensure references: direct_cmd, $host/$ip/$port,
and the -starttls branches (postgres, mysql|mariadb) are updated to use the
helper and array-based command execution.

In `@src/linux/scripts/bunkerweb-scheduler.sh`:
- Around line 141-154: The parse_database_uri fallback branch currently leaves
port unset and does not re-run parsing after DATABASE_URI is replaced later;
update the error path in parse_database_uri handling so it initializes port
(e.g., set port="(unknown)" or a default) and ensure that after any code that
replaces DATABASE_URI (the block that updates DATABASE_URI around the later
lines) you call parse_database_uri again (or re-eval parse_output) so variables
produced by parse_database_uri—type, database, host, user, ssl_enabled,
pem_path, port—are refreshed before assigning db_ssl_enabled/db_pem_path/db_port
and before logging; locate the parse_database_uri invocation and the db_*
assignments (db_ssl_enabled/db_pem_path/db_port) to apply these fixes.

In `@src/scheduler/entrypoint.sh`:
- Around line 57-70: The fallback branch for parse_database_uri doesn't
initialise port (so db_port can be empty) and the script never re-parses after
DATABASE_URI is replaced; update the error path in parse_database_uri handling
to set port alongside type/database/host/user/ssl_enabled/pem_path (e.g.
port="(unknown)" or default port), and after you modify/replace DATABASE_URI
re-run parse_database_uri (call parse_database_uri and eval its output) before
assigning db_ssl_enabled/db_pem_path/db_port so that db_* variables reflect the
latest parse results; reference parse_database_uri, port, ssl_enabled, pem_path,
type, database, host, user, db_port, db_ssl_enabled and db_pem_path when making
the changes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e05637c7-68c1-43c3-928f-6b2995f9dd92

📥 Commits

Reviewing files that changed from the base of the PR and between 24ed72b and bf358b3.

📒 Files selected for processing (5)

src/common/db/Database.py
src/common/helpers/db-tls-validate.sh
src/linux/scripts/bunkerweb-scheduler.sh
src/scheduler/entrypoint.sh
src/scheduler/main.py

📜 Review details

🧰 Additional context used

📓 Path-based instructions (2)

**/*.py

⚙️ CodeRabbit configuration file

**/*.py: Follow BunkerWeb's Python standards and security posture:

Use snake_case for functions and variables, PascalCase for classes, and provide concise, accurate docstrings for public classes, functions, and methods.

Respect Black formatting with a 160-character line limit and the existing pre-commit conventions. Do not insist on adding type annotations to previously untyped code, but accept them when added consistently.

Catch specific exceptions; never use bare except:. Catching Exception is acceptable only at explicit process boundaries (for example scheduler loops, outer job runners, worker entrypoints, or graceful-shutdown boundaries) when the code logs enough context and either re-raises, returns an error status, or terminates safely.

Never use os.system, subprocess.*(..., shell=True), eval, or exec. Pass subprocess arguments as a list and prefer explicit binary paths for privileged operations.

Do not use unsafe deserialisers (pickle, marshal, shelve, jsonpickle, dill) for untrusted data. Use yaml.safe_load() rather than unsafe YAML loading.

Open files with an explicit encoding (normally utf-8) and use with statements for files, sockets, database sessions, and temporary resources.

Use secrets for token generation and hmac.compare_digest for token, HMAC, or signature comparisons.

For HTTP clients (requests, httpx): always set an explicit timeout, validate destination URLs to block RFC1918/loopback/link-local ranges (SSRF), disable automatic redirects to internal hosts, and be careful with proxy settings.

For filesystem operations: resolve paths with Path.resolve() and verify they remain under the intended base directory before reading or writing (path traversal).

Use defusedxml rather than stdlib XML parsers for untrusted XML.

For SQLAlchemy, use bound parameters and safe query construction. Never call text() with f-strings or .format(), and flag .execute() calls built...

Files:

src/scheduler/main.py
src/common/db/Database.py

**/*.sh

⚙️ CodeRabbit configuration file

**/*.sh: Shell scripts must match BunkerWeb's portability expectations:

If the script is POSIX shell, prefer set -eu; if it explicitly requires Bash, use set -euo pipefail.

Quote variables and command substitutions consistently and prefer ${var} when concatenating.

Do not use Bash-only features in /bin/sh scripts.

Handle failures explicitly, use trap for cleanup where temporary files are created, and use mktemp safely.

Never use curl | sh or wget | sh; verify downloads by checksum or signature and avoid -k / --insecure.

Do not rely on inherited PATH in privileged contexts; set it explicitly where needed.

Avoid eval and unsafe command construction from untrusted data.

Files:

src/linux/scripts/bunkerweb-scheduler.sh
src/scheduler/entrypoint.sh
src/common/helpers/db-tls-validate.sh

🪛 Shellcheck (0.11.0)

src/linux/scripts/bunkerweb-scheduler.sh

[warning] 154-154: port is referenced but not assigned.

(SC2154)

src/scheduler/entrypoint.sh

[warning] 70-70: port is referenced but not assigned.

(SC2154)

src/common/helpers/db-tls-validate.sh

[style] 119-119: Use { ..; } instead of (..) to avoid subshell overhead.

(SC2235)

[style] 126-126: Use { ..; } instead of (..) to avoid subshell overhead.

(SC2235)

[info] 281-281: Double quote to prevent globbing and word splitting.

(SC2086)

[info] 318-318: Double quote to prevent globbing and word splitting.

(SC2086)

🔇 Additional comments (1)

src/scheduler/main.py (1)

688-709: Nice simplification of scheduler DB initialisation.

Reusing _db_uri for both the target log and Database(...) removes the duplicated fallback expression and keeps the startup path consistent.

coderabbitai · 2026-04-10T08:16:47Z

+            try:
+                from sqlalchemy.engine.url import make_url
+
+                u = make_url(uri)
+                if u.password:
+                    u = u.set(password="***")
+                return str(u)
+            except Exception:
+                # Fallback: mask :password@ in authority section
+                return re_sub(r"(//[^:/@]+:)[^@]*@", r"\1***@", uri)


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n src/common/db/Database.py | head -230 | tail -30

Repository: bunkerity/bunkerweb

Length of output: 1582

🏁 Script executed:

cat -n src/common/db/Database.py | sed -n '210,300p'

Repository: bunkerity/bunkerweb

Length of output: 4407

🏁 Script executed:

# Check what exceptions make_url can raise rg "def make_url" --type py -A 5

Repository: bunkerity/bunkerweb

Length of output: 45

🏁 Script executed:

# Check the full function context for _mask_db_uri ast-grep --pattern 'def _mask_db_uri($$$) { $$$ }'

Repository: bunkerity/bunkerweb

Length of output: 45

Catch specific exceptions from SQLAlchemy in helper functions, not Exception.

Lines 221 and 295 catch broad Exception in nested helper functions (_mask_db_uri() and _ssl_expectations()), which hides unexpected parser faults and masks programming errors. These are not process boundaries; they should catch only the exceptions make_url() actually raises.

SQLAlchemy's make_url() raises ArgumentError for invalid URIs. Catch that explicitly, and let other exceptions surface:

except ArgumentError: # Fallback: mask :password@ in authority section return re_sub(r"(//[^:/@]+:)[^@]*@", r"\1***@", uri)

This aligns with the coding guideline: "Catch specific exceptions; never use bare except:. Catching Exception is acceptable only at explicit process boundaries (scheduler loops, job runners, worker entrypoints, graceful-shutdown boundaries)."

Applies to:

Line 221 in _mask_db_uri()

Line 295 in _ssl_expectations()

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/common/db/Database.py` around lines 214 - 223, The helper functions _mask_db_uri and _ssl_expectations currently catch broad Exception around sqlalchemy.engine.url.make_url(uri); change those handlers to catch SQLAlchemy's specific ArgumentError instead (import ArgumentError from sqlalchemy.exc) so only invalid-URI parsing falls back to the existing masking logic and other unexpected errors propagate; keep the existing fallback masking behavior unchanged.

coderabbitai · 2026-04-10T08:16:47Z

+                        self.logger.error(
+                            "Database connection failed with SSL/TLS enabled but no root CA configured. "
+                            "Reason: %s",
+                            e,
+                        )
+                        self.logger.error(
+                            "Set a CA file in DATABASE_URI (ssl_ca=... for MySQL/MariaDB, sslrootcert=... for PostgreSQL), "
+                            "or disable SSL for testing only."
+                        )


🧹 Nitpick | 🔵 Trivial

Use warning/debug for pre-timeout TLS hints.

These lines run inside the retry loop, so a transient startup failure can still emit error-level events even when the next retry succeeds. That is likely to create false-positive startup alerts.

Suggested change

- self.logger.error( + self.logger.warning( "Database connection failed with SSL/TLS enabled but no root CA configured. " "Reason: %s", e, ) - self.logger.error( + self.logger.warning( "Set a CA file in DATABASE_URI (ssl_ca=... for MySQL/MariaDB, sslrootcert=... for PostgreSQL), " "or disable SSL for testing only." )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

self.logger.error(

"Database connection failed with SSL/TLS enabled but no root CA configured. "

"Reason: %s",

e,

)

self.logger.error(

"Set a CA file in DATABASE_URI (ssl_ca=... for MySQL/MariaDB, sslrootcert=... for PostgreSQL), "

"or disable SSL for testing only."

)

self.logger.warning(

"Database connection failed with SSL/TLS enabled but no root CA configured. "

"Reason: %s",

e,

)

self.logger.warning(

"Set a CA file in DATABASE_URI (ssl_ca=... for MySQL/MariaDB, sslrootcert=... for PostgreSQL), "

"or disable SSL for testing only."

)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/common/db/Database.py` around lines 333 - 341, The TLS hint logs inside the retry loop currently use self.logger.error and should be lowered to avoid false-positive alerts; change the two self.logger.error calls in the retrying connection path (the block that logs "Database connection failed with SSL/TLS enabled..." and the subsequent instruction about setting ssl_ca/sslrootcert) to self.logger.warning or self.logger.debug (prefer warning for actionable hints, debug if very verbose) and ensure only the final failure after exhausting retries still logs as error; locate these calls in the Database class connection/retry method (the method that performs the retry loop and currently calls self.logger.error with the exception variable e) and update them accordingly.

coderabbitai · 2026-04-10T08:16:47Z

+                    self.logger.error(
+                        "Can't connect to database after %s seconds: %s (DATABASE_URI=%s)",
+                        DATABASE_RETRY_TIMEOUT,
+                        e,
+                        _mask_db_uri(self.database_uri or self.database_uri_readonly),
+                    )


⚠️ Potential issue | 🟠 Major

Stop logging DATABASE_URI in these paths.

Line 356, Line 373 and Line 380 still emit the connection URI, which the repo rules require you to scrub from logs. They also pick self.database_uri after the read-only fallback because self.database_uri or self.database_uri_readonly always prefers the primary target, so the message can identify the wrong backend during a dual-endpoint failure. Log a non-secret target label plus the backend name instead.

Suggested change

+ current_uri = self.database_uri_readonly if fallback or self.readonly else (self.database_uri or self.database_uri_readonly) + current_target = "readonly" if fallback or self.readonly else "primary" + current_backend = current_uri.split(":", 1)[0] if current_uri else "unknown" self.logger.error( - "Can't connect to database after %s seconds: %s (DATABASE_URI=%s)", + "Can't connect to %s database target (%s) after %s seconds: %s", + current_target, + current_backend, DATABASE_RETRY_TIMEOUT, e, - _mask_db_uri(self.database_uri or self.database_uri_readonly), ) @@ self.logger.warning( - "Can't connect to database (%s). Retrying in 5 seconds ... (DATABASE_URI=%s)", + "Can't connect to %s database target (%s): %s. Retrying in 5 seconds ...", + current_target, + current_backend, e, - _mask_db_uri(self.database_uri or self.database_uri_readonly), ) @@ self.logger.error( - "Error when trying to connect to the database: %s (DATABASE_URI=%s)", + "Error when trying to connect to the %s database target (%s): %s", + current_target, + current_backend, e, - _mask_db_uri(self.database_uri or self.database_uri_readonly), )

As per coding guidelines, "Scrub secrets, tokens, cookies, database URIs, and Authorization headers from logs. Use the logging framework rather than print()."

Also applies to: 370-381

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/common/db/Database.py` around lines 352 - 357, The error logs in Database (self.logger.error calls that currently include DATABASE_URI) must stop emitting the full connection URI and must not prefer self.database_uri over the readonly URI when reporting which target failed; update each logger.error (the ones after retry failures using DATABASE_RETRY_TIMEOUT and in the dual-endpoint failure paths) to omit the URI and instead log a non-secret target label plus the backend name (use a clear label like "primary" or "readonly" derived from whether the attempt was against self.database_uri or self.database_uri_readonly, and include self.database_backend or similar backend identifier). Ensure the message uses DATABASE_RETRY_TIMEOUT and the caught exception e, but replace any _mask_db_uri(...) or direct URI interpolation with the non-secret target label and backend name.

coderabbitai · 2026-04-10T08:16:47Z

+print(f'type="{db_type or "(unknown)"}"')
+print(f'database="{db_name or "(none)"}"')
+print(f'host="{host or "(local)"}"')
+print(f'port="{port or "(default)"}"')
+print(f'user="{user or "(none)"}"')
+print(f'ssl_enabled="{ssl_enabled}"')
+print(f'pem_path="{pem_path or "(none)"}"')


⚠️ Potential issue | 🔴 Critical

Shell-escape these assignments before callers eval them.

These values come from DATABASE_URI, but the helper prints them inside double quotes and both callers immediately eval the result. A crafted username, database name, or ssl_ca value containing $(...) will execute during startup.

Suggested fix

-from urllib.parse import urlparse, parse_qs, unquote +from urllib.parse import urlparse, parse_qs, unquote +from shlex import quote @@ -print(f'type="{db_type or "(unknown)"}"') -print(f'database="{db_name or "(none)"}"') -print(f'host="{host or "(local)"}"') -print(f'port="{port or "(default)"}"') -print(f'user="{user or "(none)"}"') -print(f'ssl_enabled="{ssl_enabled}"') -print(f'pem_path="{pem_path or "(none)"}"') +def _emit(name: str, value: str) -> None: + print(f"{name}={quote(value)}") + +_emit("type", db_type or "(unknown)") +_emit("database", db_name or "(none)") +_emit("host", host or "(local)") +_emit("port", port or "(default)") +_emit("user", user or "(none)") +_emit("ssl_enabled", ssl_enabled) +_emit("pem_path", pem_path or "(none)")

As per coding guidelines, "Avoid eval and unsafe command construction from untrusted data."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/common/helpers/db-tls-validate.sh` around lines 76 - 82, The printed assignments for variables (db_type, db_name, host, port, user, ssl_enabled, pem_path) are emitted raw but are later consumed via eval; update the print statements to emit shell-escaped values so untrusted content cannot inject commands—use a shell-quoting helper such as Python's shlex.quote (or an equivalent escape routine) to wrap each value before printing (e.g., replace the current print(...) calls with prints that output safely-quoted assignments using shlex.quote on db_type, db_name, host, port, user, ssl_enabled, and pem_path).

coderabbitai · 2026-04-10T08:16:47Z

+    # PEM/DER format validation
+    if ! timeout 2s openssl x509 -in "$pem_path" -noout >/dev/null 2>&1; then
+        # Try DER format
+        if ! timeout 2s openssl x509 -inform DER -in "$pem_path" -noout >/dev/null 2>&1; then
+            log "$LOG_CONTEXT" "❌" "PEM certificate is not valid X.509 format: $pem_path"
+            return 1
+        fi
+    fi
+
+    # Certificate expiry validation - skip for system CA bundles (they may contain expired certs)
+    if [ "$is_system_ca" != "yes" ]; then
+        local cert_check
+        cert_check=$(timeout 2s env PEM_FILE="$pem_path" python3 - << 'PY'
+from os import getenv
+from pathlib import Path
+from datetime import datetime, timezone
+from cryptography import x509
+
+try:
+    pem_file = getenv("PEM_FILE", "")
+    data = Path(pem_file).read_bytes()
+    certs = x509.load_pem_x509_certificates(data)


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

head -n 240 src/common/helpers/db-tls-validate.sh | tail -n +195

Repository: bunkerity/bunkerweb

Length of output: 1501

🏁 Script executed:

sed -n '213,245p' src/common/helpers/db-tls-validate.sh

Repository: bunkerity/bunkerweb

Length of output: 936

🌐 Web query:

cryptography x509 load_der_x509_certificate load_pem_x509_certificates Python

💡 Result:

In Python’s cryptography library, these X.509 loaders differ by encoding (PEM vs DER) and how many certs they can parse:

DER (binary ASN.1) → single cert

from cryptography import x509 der_bytes = open("cert.der", "rb").read() cert = x509.load_der_x509_certificate(der_bytes) # -> x509.Certificate

(cryptography.io)

PEM (base64 text with BEGIN/END markers) → single cert

from cryptography import x509 pem_bytes = open("cert.pem", "rb").read() cert = x509.load_pem_x509_certificate(pem_bytes) # -> x509.Certificate

(cryptography.io)

PEM chain/bundle (multiple adjacent PEM blocks) → multiple certs

from cryptography import x509 pem_chain_bytes = open("chain.pem", "rb").read() certs = x509.load_pem_x509_certificates(pem_chain_bytes) # -> list[x509.Certificate]

This is specifically intended for “multiple certificates (as adjacent PEMs) at once.” (cryptography.io)

Citations:

1: https://cryptography.io/en/3.4.8/hazmat/backends/interfaces.html?utm_source=openai

2: https://cryptography.io/en/latest/x509/reference/?utm_source=openai

3: https://cryptography.io/en/latest/x509/reference/?utm_source=openai

DER certificates pass shell validation but fail Python expiry check unconditionally.

Whilst lines 204–206 accept DER format via openssl x509 -inform DER, line 224 calls only x509.load_pem_x509_certificates(data), which raises ValueError when given DER input. The exception is caught at line 238 and prints ERROR, causing the validation to fail.

Fix required

Attempt PEM first, then fall back to DER on failure:

- certs = x509.load_pem_x509_certificates(data) + try: + certs = list(x509.load_pem_x509_certificates(data)) + except ValueError: + certs = [x509.load_der_x509_certificate(data)]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# PEM/DER format validation

if ! timeout 2s openssl x509 -in "$pem_path" -noout >/dev/null 2>&1; then

# Try DER format

if ! timeout 2s openssl x509 -inform DER -in "$pem_path" -noout >/dev/null 2>&1; then

log "$LOG_CONTEXT" "❌" "PEM certificate is not valid X.509 format: $pem_path"

return 1

fi

fi

# Certificate expiry validation - skip for system CA bundles (they may contain expired certs)

if [ "$is_system_ca" != "yes" ]; then

local cert_check

cert_check=$(timeout 2s env PEM_FILE="$pem_path" python3 - << 'PY'

from os import getenv

from pathlib import Path

from datetime import datetime, timezone

from cryptography import x509

try:

pem_file = getenv("PEM_FILE", "")

data = Path(pem_file).read_bytes()

certs = x509.load_pem_x509_certificates(data)

# PEM/DER format validation

if ! timeout 2s openssl x509 -in "$pem_path" -noout >/dev/null 2>&1; then

# Try DER format

if ! timeout 2s openssl x509 -inform DER -in "$pem_path" -noout >/dev/null 2>&1; then

log "$LOG_CONTEXT" "❌" "PEM certificate is not valid X.509 format: $pem_path"

return 1

fi

fi

# Certificate expiry validation - skip for system CA bundles (they may contain expired certs)

if [ "$is_system_ca" != "yes" ]; then

local cert_check

cert_check=$(timeout 2s env PEM_FILE="$pem_path" python3 - << 'PY'

from os import getenv

from pathlib import Path

from datetime import datetime, timezone

from cryptography import x509

try:

pem_file = getenv("PEM_FILE", "")

data = Path(pem_file).read_bytes()

try:

certs = list(x509.load_pem_x509_certificates(data))

except ValueError:

certs = [x509.load_der_x509_certificate(data)]

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/common/helpers/db-tls-validate.sh` around lines 203 - 224, The Python expiry-check block (started in cert_check=$(... python3 - << 'PY') only calls x509.load_pem_x509_certificates(data) which fails for DER input; update the code inside that heredoc to first try loading PEM via x509.load_pem_x509_certificates(Path(pem_file).read_bytes()) and on ValueError fall back to loading DER (use x509.load_der_x509_certificate for a single cert or decode multiple DER certs if needed), then proceed to compute expiries as before; ensure exceptions are still caught and that the PEM_FILE/ pem_path variable names (PEM_FILE, pem_path) and the cert_check variable remain unchanged so the surrounding shell logic continues to work.

coderabbitai · 2026-04-10T08:16:48Z

+        local direct_cmd="openssl s_client -connect $host:$port"
+        [ -f "$pem_path" ] && direct_cmd="$direct_cmd -CAfile $pem_path"
+        case "$db_type" in
+            postgres) direct_cmd="$direct_cmd -starttls postgres" ;;
+            mysql|mariadb) direct_cmd="$direct_cmd -starttls mysql" ;;
+        esac
+
+        log "$LOG_CONTEXT" "ℹ️" "No DNS resolution for ${host}, testing direct connection"
+        if echo "Q" | timeout 2s $direct_cmd >/dev/null 2>&1; then


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n src/common/helpers/db-tls-validate.sh | sed -n '265,320p'

Repository: bunkerity/bunkerweb

Length of output: 2473

🏁 Script executed:

# Check openssl s_client documentation for IPv6 handling man openssl-s_client 2>/dev/null | grep -A 5 "\-connect" | head -20

Repository: bunkerity/bunkerweb

Length of output: 45

🏁 Script executed:

# Search for calls to this validation function to see if IPv6 might be passed rg "db_tls_validate|db-tls-validate" --type sh -B 2 -A 2

Repository: bunkerity/bunkerweb

Length of output: 857

🏁 Script executed:

# Check for IPv6 address patterns or usage in the codebase rg "[0-9a-fA-F:]{2,}" --type sh | head -10

Repository: bunkerity/bunkerweb

Length of output: 824

🏁 Script executed:

# Find the function definition of db_tls_validate rg "db_tls_validate()" src/common/helpers/db-tls-validate.sh -A 10

Repository: bunkerity/bunkerweb

Length of output: 45

🏁 Script executed:

# Check openssl s_client help for IPv6 syntax requirements openssl s_client -help 2>&1 | grep -i connect || echo "Help not available, checking online docs"

Repository: bunkerity/bunkerweb

Length of output: 103

🏁 Script executed:

# Find the function definition grep -n "^db_tls_validate" src/common/helpers/db-tls-validate.sh

Repository: bunkerity/bunkerweb

Length of output: 45

🏁 Script executed:

# Check how the host parameter is derived sed -n '1,272p' src/common/helpers/db-tls-validate.sh | tail -50

Repository: bunkerity/bunkerweb

Length of output: 1584

🌐 Web query:

openssl s_client -connect IPv6 address syntax documentation

💡 Result:

For openssl s_client, when the host is an IPv6 literal, you must wrap the address in square brackets so OpenSSL can still parse the host:port separator:

openssl s_client -connect "[2001:db8::1]:443" # or (positional target form) openssl s_client "[2001:db8::1]:443"

This same [...] rule also applies to options like -bind and -proxy when their host is an IPv6 address. (docs.openssl.org)

Citations:

1: https://docs.openssl.org/3.6/man1/openssl-s_client/

🏁 Script executed:

# Find where the function is defined and its parameters head -270 src/common/helpers/db-tls-validate.sh | grep -n "^db_tls_validate\|^function db_tls_validate" -A 5

Repository: bunkerity/bunkerweb

Length of output: 45

🏁 Script executed:

# Search for the function definition more broadly grep -n "db_tls_validate" src/common/helpers/db-tls-validate.sh | head -5

Repository: bunkerity/bunkerweb

Length of output: 45

🏁 Script executed:

# Check the complete beginning of the file for function signature head -50 src/common/helpers/db-tls-validate.sh

Repository: bunkerity/bunkerweb

Length of output: 1715

🏁 Script executed:

# Find validate_db_tls_certificate function definition grep -n "validate_db_tls_certificate" src/common/helpers/db-tls-validate.sh | head -3

Repository: bunkerity/bunkerweb

Length of output: 296

🏁 Script executed:

# Get the function signature sed -n '180,270p' src/common/helpers/db-tls-validate.sh | head -100

Repository: bunkerity/bunkerweb

Length of output: 3022

🏁 Script executed:

# Get the full function definition line sed -n '149,160p' src/common/helpers/db-tls-validate.sh

Repository: bunkerity/bunkerweb

Length of output: 751

🏁 Script executed:

# Check how the host parameter is obtained and if IPv6 is possible sed -n '40,150p' src/common/helpers/db-tls-validate.sh

Repository: bunkerity/bunkerweb

Length of output: 4354

Fix IPv6 literal handling and unsafe command construction in openssl invocations.

Lines 273–281 and 300–318 construct -connect targets as unquoted ${host}:${port} and ${ip}:${port} strings. This breaks IPv6 literals (Python's urlparse().hostname returns bare addresses like 2001:db8::1 without brackets; openssl requires [2001:db8::1]:5432), and unquoted command strings split paths containing spaces. Refactor to use arrays with a helper function to format targets correctly:

Suggested fix

+format_connect_target() { + case "$1" in + *:*) printf '[%s]:%s' "$1" "$2" ;; + *) printf '%s:%s' "$1" "$2" ;; + esac +} @@ - local direct_cmd="openssl s_client -connect $host:$port" - [ -f "$pem_path" ] && direct_cmd="$direct_cmd -CAfile $pem_path" + local connect_target + connect_target=$(format_connect_target "$host" "$port") + local -a direct_cmd=(openssl s_client -connect "$connect_target") + [ -f "$pem_path" ] && direct_cmd+=(-CAfile "$pem_path") @@ - if echo "Q" | timeout 2s $direct_cmd >/dev/null 2>&1; then + if echo "Q" | timeout 2s "${direct_cmd[@]}" >/dev/null 2>&1; then @@ - local ip_cmd="openssl s_client -connect $ip:$port" + local connect_target + connect_target=$(format_connect_target "$ip" "$port") + local -a ip_cmd=(openssl s_client -connect "$connect_target") @@ - ip_cmd="$ip_cmd -CAfile $pem_path" + ip_cmd+=(-CAfile "$pem_path") @@ - ip_cmd="$ip_cmd -starttls postgres" + ip_cmd+=(-starttls postgres) @@ - ip_cmd="$ip_cmd -starttls mysql" + ip_cmd+=(-starttls mysql) @@ - if echo "Q" | timeout 2s $ip_cmd >/dev/null 2>&1; then + if echo "Q" | timeout 2s "${ip_cmd[@]}" >/dev/null 2>&1; then

This complies with the coding guidelines: proper quoting of variables, avoiding unsafe command construction, and defending against word-splitting from spaces in paths or IPv6 colons.

🧰 Tools

🪛 Shellcheck (0.11.0)

[info] 281-281: Double quote to prevent globbing and word splitting.

(SC2086)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/common/helpers/db-tls-validate.sh` around lines 273 - 281, The openssl invocations build unsafe unquoted command strings and do not handle IPv6 literals; update the code that constructs direct_cmd and the later ip-based command to (1) add a helper (e.g., format_connect_target) that accepts host/ip and port and returns a properly bracketed target for IPv6 like "[2001:db8::1]:5432" and plain "host:port" for IPv4/hostnames, (2) stop concatenating into a single command string and instead build an array for the command (e.g., cmd=(openssl s_client -connect "$target") and append -CAfile "$pem_path" and -starttls options as separate array elements), and (3) use the array form with timeout (or exec) so variables are quoted and paths with spaces are safe; apply this change for both the direct_cmd construction around direct_cmd/direc_cmd usage and the ip-based openssl invocation in the later block. Ensure references: direct_cmd, $host/$ip/$port, and the -starttls branches (postgres, mysql|mariadb) are updated to use the helper and array-based command execution.

coderabbitai · 2026-04-10T08:16:48Z

+    if parse_output=$(parse_database_uri); then
+        eval "$parse_output"
+    else
+        log "SYSTEMCTL" "⚠️" "Failed to parse DATABASE_URI (error), using defaults"
+        type="(unknown)"
+        database="(none)"
+        host="(local)"
+        user="(none)"
+        ssl_enabled="no"
+        pem_path="(none)"
+    fi
+    db_ssl_enabled="$ssl_enabled"
+    db_pem_path="$pem_path"
+    db_port="$port"


⚠️ Potential issue | 🟡 Minor

Re-parse after Line 222 and initialise port in the fallback.

Line 222 replaces DATABASE_URI, but Line 227 still logs type/database/host from the first parse. Also, the error path on Lines 145-150 never sets port, so db_port and the later log can end up blank.

Suggested fix

else log "SYSTEMCTL" "⚠️" "Failed to parse DATABASE_URI (error), using defaults" type="(unknown)" database="(none)" host="(local)" + port="(default)" user="(none)" ssl_enabled="no" pem_path="(none)" fi @@ DATABASE_URI=$(cat /var/tmp/bunkerweb/database_uri) export DATABASE_URI rm -f /var/tmp/bunkerweb/database_uri + + if parse_output=$(parse_database_uri); then + eval "$parse_output" + else + type="(unknown)" + database="(none)" + host="(local)" + port="(default)" + user="(none)" + ssl_enabled="no" + pem_path="(none)" + fi # Log parsed database connection info log "SYSTEMCTL" "ℹ️" "Scheduler DB target: type=${type} database=${database} host=${host} port=${port} user=${user} ssl=${ssl_enabled} pem=${pem_path}"

Also applies to: 222-227

🧰 Tools

🪛 Shellcheck (0.11.0)

[warning] 154-154: port is referenced but not assigned.

(SC2154)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/linux/scripts/bunkerweb-scheduler.sh` around lines 141 - 154, The parse_database_uri fallback branch currently leaves port unset and does not re-run parsing after DATABASE_URI is replaced later; update the error path in parse_database_uri handling so it initializes port (e.g., set port="(unknown)" or a default) and ensure that after any code that replaces DATABASE_URI (the block that updates DATABASE_URI around the later lines) you call parse_database_uri again (or re-eval parse_output) so variables produced by parse_database_uri—type, database, host, user, ssl_enabled, pem_path, port—are refreshed before assigning db_ssl_enabled/db_pem_path/db_port and before logging; locate the parse_database_uri invocation and the db_* assignments (db_ssl_enabled/db_pem_path/db_port) to apply these fixes.

coderabbitai · 2026-04-10T08:16:48Z

+if parse_output=$(parse_database_uri); then
+	eval "$parse_output"
+else
+	log "ENTRYPOINT" "⚠️" "Failed to parse DATABASE_URI (error), using defaults"
+	type="(unknown)"
+	database="(none)"
+	host="(local)"
+	user="(none)"
+	ssl_enabled="no"
+	pem_path="(none)"
+fi
+db_ssl_enabled="$ssl_enabled"
+db_pem_path="$pem_path"
+db_port="$port"


⚠️ Potential issue | 🟡 Minor

Re-parse after Line 124 and initialise port in the fallback.

Line 124 replaces DATABASE_URI, but Line 129 still logs type/database/host from the first parse. Also, the error path on Lines 61-66 never sets port, so db_port and the later log can end up blank.

Suggested fix

else log "ENTRYPOINT" "⚠️" "Failed to parse DATABASE_URI (error), using defaults" type="(unknown)" database="(none)" host="(local)" + port="(default)" user="(none)" ssl_enabled="no" pem_path="(none)" fi @@ DATABASE_URI=$(cat /var/tmp/bunkerweb/database_uri) export DATABASE_URI rm -f /var/tmp/bunkerweb/database_uri + +if parse_output=$(parse_database_uri); then + eval "$parse_output" +else + type="(unknown)" + database="(none)" + host="(local)" + port="(default)" + user="(none)" + ssl_enabled="no" + pem_path="(none)" +fi # Log parsed database connection info log "ENTRYPOINT" "ℹ️" "Scheduler DB target: type=${type} database=${database} host=${host} port=${port} user=${user} ssl=${ssl_enabled} pem=${pem_path}"

Also applies to: 124-129

🧰 Tools

🪛 Shellcheck (0.11.0)

[warning] 70-70: port is referenced but not assigned.

(SC2154)

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/scheduler/entrypoint.sh` around lines 57 - 70, The fallback branch for parse_database_uri doesn't initialise port (so db_port can be empty) and the script never re-parses after DATABASE_URI is replaced; update the error path in parse_database_uri handling to set port alongside type/database/host/user/ssl_enabled/pem_path (e.g. port="(unknown)" or default port), and after you modify/replace DATABASE_URI re-run parse_database_uri (call parse_database_uri and eval its output) before assigning db_ssl_enabled/db_pem_path/db_port so that db_* variables reflect the latest parse results; reference parse_database_uri, port, ssl_enabled, pem_path, type, database, host, user, db_port, db_ssl_enabled and db_pem_path when making the changes.

Add Database Target and SSL Debug Logs for Scheduler Startup bunkerit…

bf358b3

…y#3361 Add Database Target and SSL Debug Logs for Scheduler Startup bunkerity#3361

mkf-sysangels requested a review from TheophileDiot as a code owner March 19, 2026 09:22

coderabbitai Bot reviewed Apr 10, 2026

View reviewed changes

Conversation

mkf-sysangels commented Mar 19, 2026

Uh oh!

mkf-sysangels commented Mar 19, 2026

Uh oh!

mkf-sysangels commented Mar 19, 2026

Uh oh!

TheophileDiot commented Apr 10, 2026

Uh oh!

coderabbitai Bot commented Apr 10, 2026

Uh oh!

coderabbitai Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Component

src/common

src/linux

src/scheduler

Impact Assessment

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented Apr 10, 2026 •

edited

Loading