-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add Docker Images #8668
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Docker Images #8668
Conversation
The config loader was not properly validating templates before use. Add validation logic and tests for template coverage.
Use go:embed to include the default config template directly in the binary, eliminating the need for external template files at runtime.
The embedded template in common/config is now the single source of truth for the default configuration template.
Change the default behavior of 'temporal server start' to use the embedded config template with environment variables. This simplifies setup for users who don't need custom configuration files. Add --config-file flag for explicit file-based configuration.
Add comprehensive test to validate all fields in the embedded template can be properly rendered. Refactor main.go to use new config loading pattern.
The embedded template must match the template used in the Helm chart to ensure consistent behavior across deployment methods.
Replace the multiple LoadConfig variants with a single Load function that accepts functional options. This provides a cleaner API and makes it easier to add new configuration sources.
Rename TEMPORAL_CONFIG_FILE to TEMPORAL_SERVER_CONFIG_FILE_PATH to make the variable's purpose more explicit. Update Make targets to use the new --config-file flag instead of the deprecated --env flag.
Reorganize the config loader code for better maintainability and simplify the test suite by removing redundant test cases.
Update config tests to properly handle the new loading mechanism. Fix sqlite file config to use correct path.
Ensure that existing configuration files and environment variables continue to work with the new config loader implementation.
Expose config file path as an option in the FX dependency injection provider to support programmatic configuration.
## What changed? Fixing data race by extracting out close transfer task information prior to release lock. Added more context to comment. ## Why? Bug ## How did you test it? - [x] built - [x] run locally and tested manually - [x] covered by existing tests - [ ] added new unit test(s) - [ ] added new functional test(s)
## What changed? - Revert the removal of slice count check in multi-cursor slice count action. - The check was incorrectly removed in [#8416 L117](https://github.com/temporalio/temporal/pull/8416/files#diff-deecce1e374d8c4db074d9c923c2b80d1b8e28cef778202aa01397e8d56e1bafL117) ## Why? - Without the check, the code will panic later in `pickCompactCandidates` when currentSliceCount < targetSliceCount. ## How did you test it? - [x] built - [ ] run locally and tested manually - [ ] covered by existing tests - [ ] added new unit test(s) - [ ] added new functional test(s) - [x] will follow up with a test PR
## What changed? - Wire up chasm workflow library ## Why? - #8485 makes workflow start to use CHASM as well and we need to register workflow as a chasm library. ## How did you test it? - [x] built - [ ] run locally and tested manually - [ ] covered by existing tests - [ ] added new unit test(s) - [ ] added new functional test(s) ## Potential risks - Logic is only used when chasm feature flag is turned on. The change is mainly for functional tests.
## What changed? Add locking in newRateLimitManager and Stop. ## Why? Unlikely but potential data race if we get a dynamic config subscription callback after Subscribe returns before assigning to the field in the constructor. ## How did you test it? - [x] built and ran existing tests
## What changed? - Fix last change tracking for pri/fairness tasks. - Don't allow migration for sticky queues. - Improve fairness migration tests: - Set config only on test task queue - Use only ApproximateBacklogCount ## Why? This fixes some situations where we wouldn't update ApproximateBacklogCount on partition unload. It should also make the test less flaky. ## How did you test it? - [x] covered by existing tests
## What changed?
No-oping close transfer tasks for SyncWorkflowState tasks
## Why?
We need non state based replication to be eligible for this
optimization.
## How did you test it?
- [x] built
- [x] run locally and tested manually
- [x] covered by existing tests
- [ ] added new unit test(s)
- [x] added new functional test(s)
`go test -v -tags test_dep ./tests/xdc -run
TestStreamBasedReplicationTestSuite/DisableTransitionHistory/TestCloseTransferTaskAckedReplication
-timeout 10m -count=1`
shows
`2025-10-20T08:15:08.290-0700 info Skipping close transfer task
generation - already acked on active cluster {"cluster-name":
"standby_aadnd", "host": "127.0.0.1:57179", "shard-id": 1, "address":
"127.0.0.1:57179", "wf-namespace-id":
"e550305e-0b43-4bcd-a490-8e3223f51ce1", "wf-id":
"test-replication-e2c094d3-c34f-42d9-a166-a967d4e7f602", "wf-run-id":
"019a0230-299a-74ba-a31f-ac247c05f2b9", "logging-call-at":
"/Users/michaely520/projects/temporal/service/history/workflow/task_generator.go:206"}
stream_based_replication_test.go:975: Verified IsCloseTransferTaskAcked
and IsForceReplication flags in SyncWorkflowStateTask`
Change help text to specify path is relative to current working directory rather than root directory, improving user clarity.
Update test assertions to match new config loading behavior.
|
cursor review |
| COPY --chmod=755 ./scripts/start-server.sh /etc/temporal/start-server.sh | ||
|
|
||
| USER temporal | ||
| ENTRYPOINT ["/etc/temporal/entrypoint.sh"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Missing runtime dependencies in server Dockerfile
The server.Dockerfile doesn't install any runtime dependencies, but entrypoint-server.sh uses getent which isn't available in Alpine's base image. Alpine uses musl libc instead of glibc, so getent needs to be installed or replaced with an alternative. The legacy-server.Dockerfile installs packages like bash and curl, but this Dockerfile is missing those installations entirely.
| # Setup Temporal Server in development mode if "develop" is passed as an argument. | ||
| for arg in "$@"; do | ||
| if [ "${arg}" = "develop" ]; then | ||
| /etc/temporal/setup-develop.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Missing setup scripts referenced by entrypoint
The entrypoint script references /etc/temporal/auto-setup.sh and /etc/temporal/setup-develop.sh when autosetup or develop arguments are passed, but these scripts are never copied into the Docker image in server.Dockerfile. This causes the container to fail when these arguments are used.
| tags = compact([ | ||
| "${IMAGE_REPO}/server:${IMAGE_SHA_TAG}", | ||
| "${IMAGE_REPO}/server:${SAFE_IMAGE_BRANCH_TAG}", | ||
| TAG_LATEST ? "${IMAGE_REPO}/server:latest" : "", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Docker image tag collision between targets
The admin-tools and legacy-admin-tools targets use identical tags, and server and legacy-server targets also use identical tags. When building the default group that includes all four targets, the later builds overwrite the earlier ones, making it impossible to have both variants available simultaneously. Each target needs distinct tags to avoid conflicts.
| mkdir -p "${BUILD_DIR}" | ||
|
|
||
| # Copy schema directory from repo root | ||
| cp -r "${REPO_ROOT}/schema" "${BUILD_DIR}/temporal" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Incorrect schema directory copy creates wrong path
The cp -r command copies schema to ${BUILD_DIR}/temporal, which creates different results depending on whether the destination exists. On first run when temporal doesn't exist, it creates ${BUILD_DIR}/temporal containing schema files directly. The Dockerfiles expect ./build/temporal/schema to exist, causing the build to fail. The command should use cp -r "${REPO_ROOT}/schema" "${BUILD_DIR}/temporal/" with a trailing slash or create the directory first.
| # Clone the temporal repo at the specific version | ||
| if [ ! -d "temporal" ]; then | ||
| git clone --depth 1 --branch "v${SERVER_VERSION}" https://github.com/temporalio/temporal.git | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Stale temporal repo causes wrong tdbg version
The script only clones the temporal repository if the directory doesn't exist. When running with different SERVER_VERSION values, a cached directory from a previous version remains, causing tdbg to be built from the wrong source version. The directory check prevents fetching the correct version-specific code. The script should either remove the directory before cloning or fetch and checkout the correct version if it exists.
What changed?
Describe what has changed in this PR.
Why?
Tell your future self why have you made these changes.
How did you test it?
Potential risks
Any change is risky. Identify all risks you are aware of. If none, remove this section.
Note
Cursor Bugbot is generating a summary for commit d8f1834. Configure here.