Conversation

@xBlaz3kx xBlaz3kx commented Nov 15, 2025

  • Added a built-in healthcheck for the server in the Dockerfile
  • Added a worker healthcheck in compose

Addressing #269

Summary by CodeRabbit

  • New Features

    • Added two HTTP endpoints: a liveliness probe (simple 200 OK) and a healthcheck that verifies database and cache and returns an aggregated JSON status (200 OK or 503).
  • Chores

    • Added container-level HEALTHCHECK probing the app on the designated port.
    • Added healthcheck configurations for web and worker services in compose to detect and surface service failures.
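A quick usage sketch once the app is running (hypothetical invocations; the review only describes the response as aggregated JSON, so the exact body shape is an assumption):

curl -i http://localhost:3000/liveliness    # expect: HTTP/1.1 200 OK, empty body
curl -i http://localhost:3000/healthcheck   # expect: JSON with per-check results, 200 if healthy, 503 otherwise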

coderabbitai bot commented Nov 15, 2025

Walkthrough

Adds application health endpoints (healthcheck and liveliness), exposes them via routes, and implements container-level and compose-level healthchecks; also includes a minor Dockerfile formatting tweak.

Changes

Health controller & routes (app/controllers/health_controller.rb, config/routes.rb)
Add HealthController with healthcheck (performs DB and Redis checks and returns JSON; 200 if all pass, otherwise 503) and liveliness (returns 200, no body). Add routes: GET /healthcheck → health#healthcheck, GET /liveliness → health#liveliness.

Dockerfile changes (Dockerfile)
Add a HEALTHCHECK to the final image that curls http://localhost:3000/healthcheck; remove a trailing space before a line continuation in apt-get install.

Compose example healthchecks (compose.example.yml)
Add healthcheck blocks for web (an HTTP test against /healthcheck) and worker (a healthcheck with configured test/interval/timeout/retries/start_period).
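
For reference, a sketch of what the web service's healthcheck block could look like (keys and values are assumptions inferred from this summary, not the PR's actual compose.example.yml):

web:
  healthcheck:
    test: ["CMD", "curl", "--fail", "http://localhost:3000/healthcheck"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 10s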

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Docker as Docker / Compose
    participant Container as App Container
    participant Rails as Rails App
    participant DB as Database
    participant Redis as Redis

    Note over Docker,Container: Container health probe (Dockerfile / compose)
    Docker->>Container: HTTP GET /healthcheck
    alt 200 OK
        Container->>Docker: 200 OK
    else non-200 / no response
        Container->>Docker: failure
    end

    Note over Container,Rails: Rails healthcheck logic
    Container->>Rails: handle /healthcheck
    Rails->>DB: SELECT 1
    DB-->>Rails: OK / error
    Rails->>Redis: PING
    Redis-->>Rails: PONG / error
    Rails-->>Container: JSON {checks...}, status 200 or 503

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Focus review on: app/controllers/health_controller.rb (DB/Redis error handling, logging silencing), Dockerfile HEALTHCHECK syntax and timing, and compose.example.yml healthcheck commands/assumptions (tooling available in images).

Suggested labels

infra/tooling

Suggested reviewers

  • jjmata

Poem

🐰 I hop and probe each container's heart,
Curl and ping and pgrep play their part.
DB answers quick, Redis gives a cheer,
Liveliness whispers: “I am here.”
🥕 Healthchecks set — the rabbit's work is clear.

Pre-merge checks

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 0.00%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: the title accurately describes the main change, adding Docker healthchecks for both the worker and server components.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3c8ba64 and 49fbcaf.

📒 Files selected for processing (2)
  • Dockerfile (2 hunks)
  • compose.example.yml (2 hunks)
🔇 Additional comments (2)
Dockerfile (1)

61-62: HEALTHCHECK is well configured.

The healthcheck correctly probes the application on localhost:3000 using curl with the --fail flag, and the timing parameters (30s interval, 5s timeout, 10s start-period, 3 retries) are sensible defaults. The curl binary is available in the final stage since it's installed in the base image (line 12).
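
For reference, the instruction under review (quoted from the diff later in this thread):

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
  CMD curl --fail http://localhost:3000/healthcheck || exit 1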

compose.example.yml (1)

83-88: The worker service healthcheck is well configured.

The process-based health check using pgrep -f sidekiq is a reasonable approach to monitor the Sidekiq worker, and the timing parameters (30s interval, 5s timeout, 3 retries, 10s start-period) are appropriate for this type of check.
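
A sketch of what that worker block could look like (the timings come from the comment above; the exact test string and key order are assumptions):

worker:
  healthcheck:
    test: ["CMD-SHELL", "pgrep -f sidekiq || exit 1"]
    interval: 30s
    timeout: 5s
    retries: 3
    start_period: 10s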

@jjmata jjmata self-requested a review November 15, 2025 09:02

jjmata commented Nov 15, 2025

Thanks for sending this in, @xBlaz3kx! Love all the small "quality of life" improvements like this ...

My main concern with this is simple, but one I think you'll appreciate given your line of work: logs become a mess when you run a HEALTHCHECK over HTTP, because a snippet like this pollutes them every so often. From my Grafana:

[Screenshot from Nov 15, 2025: Grafana log panel showing repeated healthcheck requests]

So you can see that I at least moved kube-probe to hit /up instead of / (which does a redirect and, from what I remember, a lot more work, unnecessarily rendering /sessions/new) ... but most recently I've asked that we move the HEALTHCHECK in this k8s deployment to a TCP check instead.

Want to rewrite it like that? Not sure what a good "compose friendly" way of doing it would be. 🤷‍♂️
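
One compose-friendly possibility is a bare TCP probe via bash's /dev/tcp (a sketch only; it assumes bash is present in the image, which this PR does not verify):

healthcheck:
  test: ["CMD-SHELL", "bash -c '</dev/tcp/localhost/3000' || exit 1"]
  interval: 30s
  timeout: 5s
  retries: 3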

@jjmata jjmata left a comment


Let's decide whether we go all the way to an HTTP GET health check or do something more nimble.

# Entrypoint prepares the database.
ENTRYPOINT ["/rails/bin/docker-entrypoint"]

HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
Collaborator


Rather than curl with a full HTTP request, can we do a TCP check to make sure the container is up? It's always hard to decide how far "healthy" should go ... round trip to the DB included?

Author


Usually, you have a liveliness check and a healthcheck endpoint. The liveliness check determines whether the application is up and running; the healthcheck endpoint checks whether it's running as expected (connected to the DB, message queues, etc.).
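
(As an illustration of that split in Kubernetes terms, the two endpoints would typically map to two probes; the paths and port below are assumptions:)

livenessProbe:
  httpGet:
    path: /liveliness
    port: 3000
readinessProbe:
  httpGet:
    path: /healthcheck
    port: 3000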

I'm not familiar with Ruby, so I couldn't dig up the appropriate endpoint to use. Perhaps an OpenAPI spec would be a nice addition to the docs.

Author


For compose, we can just use the liveliness endpoint, since the app will most likely not boot if migrations aren't applied correctly. For Kubernetes, migrations should probably be run before the main application starts (in a separate init container).


xBlaz3kx commented Nov 15, 2025

I agree, polluting logs with healthchecks is a nightmare, especially when hitting the / endpoint. I'm not exactly familiar with all the available endpoints - is /up something like a /healthz endpoint?

There are other ways around this; for example, if you are using Promtail, you can always filter out log lines that hit the healthcheck endpoint.

I'll try to find a better solution for the healthcheck; maybe I can add dedicated health endpoints, and we can skip logging on those.

@xBlaz3kx xBlaz3kx force-pushed the chore/docker-healthcheck branch from 49fbcaf to 226dda2 on November 17, 2025 17:46

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
Dockerfile (1)

61-62: Consider using /liveliness endpoint to reduce log pollution.

Based on the PR discussion, HTTP healthcheck log pollution is a concern. The /liveliness endpoint would be more appropriate for the Docker HEALTHCHECK because:

  • It silences logs (via silence_logger in the controller), addressing the log pollution concern raised by the maintainer
  • It's a lightweight check that only verifies the app is running (appropriate for container-level liveness)
  • The /healthcheck endpoint performs deeper checks (database, Redis) that may be unnecessarily heavy for frequent container health probes

Apply this diff to use the liveliness endpoint:

-HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
-  CMD curl --fail http://localhost:3000/healthcheck || exit 1
+HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
+  CMD curl --fail http://localhost:3000/liveliness || exit 1
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 49fbcaf and 226dda2.

📒 Files selected for processing (4)
  • Dockerfile (2 hunks)
  • app/controllers/health_controller.rb (1 hunks)
  • compose.example.yml (2 hunks)
  • config/routes.rb (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • compose.example.yml
🔇 Additional comments (6)
app/controllers/health_controller.rb (5)

1-9: LGTM! Well-designed health endpoint setup.

The controller appropriately:

  • Skips authentication and CSRF for health endpoints
  • Applies silence_logger only to /liveliness to reduce log noise from frequent container probes
  • Keeps logging enabled for /healthcheck so that actual health failures are visible

This design aligns with the PR discussion about minimizing log pollution while maintaining observability for real issues.


11-21: LGTM! Proper healthcheck implementation.

The healthcheck action correctly:

  • Aggregates multiple dependency checks (database and Redis)
  • Returns appropriate HTTP status codes (200 OK vs 503 Service Unavailable)
  • Provides detailed JSON response for debugging

23-25: LGTM! Lightweight liveliness probe.

The liveliness endpoint appropriately returns a simple 200 OK with no body, suitable for frequent container health checks.


29-35: LGTM! Database check is correct.

The database check performs a lightweight SELECT 1 query and properly handles errors.


48-54: LGTM! Proper logger silencing implementation.

The silence_logger helper correctly preserves the original log level and restores it in an ensure block.
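
Taken together, a minimal sketch of a controller consistent with these comments (the skip targets, JSON shape, and exact code are assumptions, not the PR's actual file; the names silence_logger, check_database, and check_redis come from the review):

class HealthController < ApplicationController
  # Health endpoints must stay reachable without a session or CSRF token.
  skip_before_action :authenticate_user!, raise: false
  skip_before_action :verify_authenticity_token, raise: false

  # Silence logs only for the frequently polled liveliness probe.
  around_action :silence_logger, only: :liveliness

  def healthcheck
    checks = { database: check_database, redis: check_redis }
    status = checks.values.all? ? :ok : :service_unavailable
    render json: { status: status, checks: checks }, status: status
  end

  def liveliness
    head :ok
  end

  private

  def check_database
    ActiveRecord::Base.connection.execute("SELECT 1")
    true
  rescue StandardError => e
    Rails.logger.error("Database health check failed: #{e.message}")
    false
  end

  def check_redis
    Sidekiq.redis(&:ping)
    true
  rescue StandardError => e
    Rails.logger.error("Redis health check failed: #{e.message}")
    false
  end

  def silence_logger
    old_level = Rails.logger.level
    Rails.logger.level = Logger::ERROR
    yield
  ensure
    Rails.logger.level = old_level
  end
end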

config/routes.rb (1)

336-338: LGTM! Health routes properly configured.

The new routes are correctly placed alongside the existing Rails health check route and properly map to the HealthController actions.
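
For reference, routes matching the walkthrough's description would look roughly like this (placement within config/routes.rb assumed):

get "/healthcheck", to: "health#healthcheck"
get "/liveliness", to: "health#liveliness"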

@xBlaz3kx xBlaz3kx force-pushed the chore/docker-healthcheck branch from 226dda2 to ae41499 on November 18, 2025 22:54

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
app/controllers/health_controller.rb (3)

23-25: Simple and appropriate liveness probe.

The minimal implementation is ideal for a liveness endpoint that will be polled frequently by monitoring tools.

Optional: Consider naming convention. Kubernetes uses "liveness" rather than "liveliness" for probe endpoints. While "liveliness" is valid, "liveness" is more conventional in the container orchestration ecosystem.


37-43: Past review concern addressed correctly.

The implementation now uses Sidekiq.redis(&:ping) which correctly leverages the application's configured Redis connection pool, addressing the previous review feedback.

Consider broadening exception handling for consistency. The check_database method catches StandardError, but this method only catches Redis::CannotConnectError. Other Redis errors (timeouts, connection errors, etc.) might not be caught. For consistency and defensive coding, consider catching StandardError here as well.

Apply this diff for consistency with check_database:

   def check_redis
     Sidekiq.redis(&:ping)
     true
-  rescue Redis::CannotConnectError => e
+  rescue StandardError => e
     Rails.logger.error("Redis health check failed: #{e.message}")
     false
   end

Based on learnings


45-51: Correct implementation with proper cleanup.

The manual logger level manipulation with ensure block correctly silences logs and guarantees restoration of the original level.

Optional: Rails has built-in log silencing. While your implementation works well, Rails provides the LoggerSilence concern with a silence method. However, your explicit implementation is clear and correct.
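
For illustration, the built-in alternative could be wired up roughly like this (a sketch; the quiet_logs callback name is made up):

# Rails.logger includes ActiveSupport::LoggerSilence, so anything below
# Logger::ERROR can be suppressed for the duration of a block.
around_action :quiet_logs, only: :liveliness

def quiet_logs(&block)
  Rails.logger.silence(&block)
end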

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 226dda2 and ae41499.

📒 Files selected for processing (4)
  • Dockerfile (2 hunks)
  • app/controllers/health_controller.rb (1 hunks)
  • compose.example.yml (2 hunks)
  • config/routes.rb (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • compose.example.yml
  • Dockerfile
🧰 Additional context used
🪛 GitHub Check: ci / lint
app/controllers/health_controller.rb

[failure] 53-53:
Layout/TrailingEmptyLines: 1 trailing blank lines detected.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: ci / test
🔇 Additional comments (5)
app/controllers/health_controller.rb (4)

1-6: LGTM! Appropriate callback skips for health endpoints.

Skipping authentication and setup callbacks is correct for health check endpoints, ensuring they remain fast and accessible for monitoring tools.


8-9: Good solution to the log pollution concern.

Silencing logs only for the liveliness endpoint addresses the issue raised in the PR discussion about healthcheck-induced log clutter while keeping detailed logging for the healthcheck endpoint.


11-21: LGTM! Standard healthcheck pattern with proper status codes.

The implementation correctly aggregates individual health checks and returns appropriate HTTP status codes (200 for healthy, 503 for unhealthy) along with detailed JSON for debugging.


29-35: LGTM! Efficient database health check.

Using SELECT 1 is an efficient and standard approach for database health checks. The broad exception handling with error logging is appropriate.

config/routes.rb (1)

336-338: LGTM! Well-placed and correctly configured routes.

The new health check routes are properly defined and logically placed near the existing Rails health check endpoint, making them easy to find and maintain.

Rails.logger.level = old_level
end
end


⚠️ Potential issue | 🟡 Minor

Remove trailing blank line to fix linter error.

The linter detected a trailing blank line that should be removed to comply with the project's formatting standards.

Apply this diff:

 end
-

