
Encryption Key Rotation Support#2350

Open
alstanchev wants to merge 5 commits into eclipse-ditto:master from boschglobal:feature/rolling-keys

Conversation

@alstanchev
Contributor

@alstanchev alstanchev commented Feb 23, 2026

This PR implements encryption key rotation for Ditto's connectivity service using a dual-key approach. The feature enables secure rotation
of AES-256-GCM encryption keys for sensitive connection data (credentials, URIs) without service interruption.

Key Features:

  • Dual-key configuration: Support for current encryption key (symmetrical-key) and fallback key (old-symmetrical-key) with automatic fallback during
    decryption
  • Migration started via piggyback commands: DevOps-triggered migration that re-encrypts existing MongoDB persistence data from old key to new key in configurable
    batches
  • Progress tracking: Migration progress monitoring with resumption support for failed migrations
  • Encryption disable workflow: Safe migration path to decrypt all data and disable encryption entirely
  • Validation and safety: Configuration validation prevents misconfiguration; dry-run mode allows testing before actual migration
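
The dual-key fallback idea can be sketched as follows. This is a minimal illustration, not Ditto's actual implementation: it assumes AES-256-GCM with a 12-byte IV prepended to the ciphertext, and all class and method names are made up for the example. Decryption first tries the current symmetrical-key; if GCM authentication fails, it retries with the old-symmetrical-key.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

public final class DualKeyDecryption {

    private static final int GCM_IV_LENGTH = 12;
    private static final int GCM_TAG_BITS = 128;

    // Encrypt with a random IV; the IV is prepended to the ciphertext.
    public static byte[] encrypt(final SecretKey key, final byte[] plain) throws GeneralSecurityException {
        final byte[] iv = new byte[GCM_IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(GCM_TAG_BITS, iv));
        final byte[] ct = cipher.doFinal(plain);
        final byte[] out = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, out, 0, iv.length);
        System.arraycopy(ct, 0, out, iv.length, ct.length);
        return out;
    }

    public static byte[] decryptWith(final SecretKey key, final byte[] data) throws GeneralSecurityException {
        final Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(GCM_TAG_BITS, Arrays.copyOfRange(data, 0, GCM_IV_LENGTH)));
        return cipher.doFinal(Arrays.copyOfRange(data, GCM_IV_LENGTH, data.length));
    }

    // Try the current key first; on a GCM authentication failure, fall back to the old key.
    public static byte[] decryptWithFallback(final SecretKey current, final SecretKey old, final byte[] data)
            throws GeneralSecurityException {
        try {
            return decryptWith(current, data);
        } catch (final GeneralSecurityException e) {
            return decryptWith(old, data); // data was written with the previous key
        }
    }

    public static void main(final String[] args) throws Exception {
        final KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(256);
        final SecretKey oldKey = gen.generateKey();
        final SecretKey newKey = gen.generateKey();

        // Simulate data persisted under the old key, read after the keys were rotated.
        final byte[] stored = encrypt(oldKey, "mongodb://user:secret@host".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decryptWithFallback(newKey, oldKey, stored), StandardCharsets.UTF_8));
    }
}
```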

Implements #2340

Implement dual-key configuration and fallback decryption to enable
safe encryption key rotation without downtime or data loss.

Signed-off-by: Aleksandar Stanchev <[email protected]>
@alstanchev alstanchev requested a review from thjaeckle February 23, 2026 14:09
@thjaeckle thjaeckle added this to the 3.9.0 milestone Feb 24, 2026
@alstanchev alstanchev moved this to Waiting for Approval in Ditto Planning Feb 24, 2026
Member

@thjaeckle thjaeckle left a comment


PR Review: Encryption Key Rotation Support

Note: This review was supported with the help of an LLM (Claude Code).

+5522 / -48 | 33 files changed

Nice feature — the architecture is solid: ClusterSingleton migration actor, stream-based processing with throttling, progress persistence for resume, and abort support. The separation into DocumentProcessor, MigrationStreamFactory, MigrationProgressTracker, and MigrationContext is clean.

Below are my findings, ordered by severity.


Critical Issues

1. Actor state mutation from non-actor thread

In EncryptionMigrationActor.handleMigration(), the resume path mutates actor state directly inside a thenCompose callback, which runs on a CompletionStage thread, not the actor thread:

migrationResult = progressTracker.loadProgress().thenCompose(optProgress -> {
    if (optProgress.isEmpty() || PHASE_COMPLETED.equals(optProgress.get().phase)) {
        // ...
        migrationInProgress = false;   // ⚠️ UNSAFE: not on actor thread
        currentProgress = completed;    // ⚠️ UNSAFE: not on actor thread
        sender.tell(...);
        return CompletableFuture.completedFuture(completed);
    }
    // ...
});

This violates the core Pekko actor concurrency rule. These state changes must be piped back to the actor via self.tell(), similar to how MigrationCompleted is already used for the normal completion path.
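
The safe pattern can be sketched without Pekko at all: a single-threaded executor stands in for the actor's mailbox, and the CompletionStage result is handed back to it instead of mutating state in the callback. All names here are illustrative, not the PR's code; in real Pekko code the `actorThread.execute(...)` call corresponds to `self.tell(new MigrationCompleted(...), self)` or `Patterns.pipe`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public final class PipeToSelfSketch {

    // Stand-in for the actor's mailbox: all state access goes through this single thread.
    final ExecutorService actorThread = Executors.newSingleThreadExecutor();
    boolean migrationInProgress = true;  // actor state: only ever touched on actorThread

    // Instead of mutating state inside thenCompose/thenAccept (which runs on an
    // arbitrary CompletionStage thread), hand the result back to the mailbox.
    CompletableFuture<Void> onLoadProgressFinished(final CompletableFuture<String> loadProgress) {
        return loadProgress.thenAccept(phase ->
                actorThread.execute(() -> {          // analog of self.tell(new MigrationCompleted(...))
                    migrationInProgress = false;     // safe: runs on the single "actor" thread
                    System.out.println("resume finished in phase " + phase);
                }));
    }

    public static void main(final String[] args) throws InterruptedException {
        final PipeToSelfSketch actor = new PipeToSelfSketch();
        actor.onLoadProgressFinished(CompletableFuture.completedFuture("COMPLETED")).join();
        actor.actorThread.shutdown();
        actor.actorThread.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println(actor.migrationInProgress);
    }
}
```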


Design Concerns

2. Missing Helm chart updates

New HOCON configuration requires corresponding Helm chart updates. The following new config values have no Helm equivalents:

  • old-symmetrical-key / CONNECTIVITY_CONNECTION_OLD_ENCRYPTION_KEY
  • migration.batch-size / CONNECTIVITY_ENCRYPTION_MIGRATION_BATCH_SIZE
  • migration.max-documents-per-minute / CONNECTIVITY_ENCRYPTION_MIGRATION_MAX_DOCS_PER_MINUTE
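
A possible shape for the missing Helm wiring, sketched below. Only the `.Values.connectivity.config.connections.encryption.enabled` key is confirmed by the existing chart; all other value names and the sample numbers are assumptions for illustration.

```yaml
# Hypothetical values.yaml additions (names are illustrative):
connectivity:
  config:
    connections:
      encryption:
        enabled: true
        oldSymmetricalKey: ""        # -> CONNECTIVITY_CONNECTION_OLD_ENCRYPTION_KEY
        migration:
          batchSize: 100             # -> CONNECTIVITY_ENCRYPTION_MIGRATION_BATCH_SIZE
          maxDocumentsPerMinute: 600 # -> CONNECTIVITY_ENCRYPTION_MIGRATION_MAX_DOCS_PER_MINUTE
```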

3. Own MongoClient instance

EncryptionMigrationActor creates its own MongoClientWrapper rather than reusing the existing one from the service. This means an additional MongoDB connection pool per cluster node. Consider accepting a MongoClientWrapper as a constructor parameter instead.

4. Batch size default mismatch

FieldsEncryptionConfig.ConfigValue.MIGRATION_BATCH_SIZE defaults to 100 in code, but connectivity.conf overrides it to 1. The effective default is 1 document per batch, which is extremely conservative. The grouped(batchSize) in the stream means each MongoDB bulk write contains only 1 document — is this intentional?
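
To make the effect concrete, here is a plain-Java stand-in for the Pekko Streams `grouped(batchSize)` stage (not the PR's code): with the connectivity.conf override of 1, every document becomes its own bulk write, while the code default of 100 would batch them together.

```java
import java.util.ArrayList;
import java.util.List;

public final class GroupedSketch {

    // Partition a document list into batches of at most batchSize,
    // mirroring what grouped(batchSize) does in the migration stream.
    public static <T> List<List<T>> grouped(final List<T> docs, final int batchSize) {
        final List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i += batchSize) {
            batches.add(docs.subList(i, Math.min(i + batchSize, docs.size())));
        }
        return batches;
    }

    public static void main(final String[] args) {
        final List<Integer> docs = List.of(1, 2, 3, 4, 5);
        // batch-size 1 (the connectivity.conf override): one bulk write per document
        System.out.println(grouped(docs, 1).size());
        // batch-size 100 (the code default): a single bulk write for these 5 documents
        System.out.println(grouped(docs, 100).size());
    }
}
```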

5. MigrationProgress public fields

MigrationProgress exposes all fields as public final, which is atypical for the Ditto codebase. Using private fields with getters would be more consistent, or making it a record would be more idiomatic.
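
As a record it could look roughly like this; the field names are guesses based on what the review quotes (a phase plus counters), not the PR's actual fields. Records keep value semantics (equals/hashCode) for free, and the `withPhase` copy method preserves the existing call sites such as `.withPhase(PHASE_COMPLETED)`.

```java
// Sketch only: illustrative field names, not the PR's actual MigrationProgress.
public record MigrationProgress(String phase, long processed, long total) {

    // No-arg constructor matching the `new MigrationProgress()` call sites.
    public MigrationProgress() {
        this("RUNNING", 0L, 0L);
    }

    // Copy method preserving the withX style used in the actor.
    public MigrationProgress withPhase(final String newPhase) {
        return new MigrationProgress(newPhase, processed, total);
    }
}
```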


Behavioral Changes to Call Out

6. /credentials/cert added to encryption pointers

The diff adds "/credentials/cert" to the default json-pointers list in connectivity.conf. This is an independent behavioral change not mentioned in the PR description — existing connections with credentials/cert will now have that field encrypted on next snapshot write. Should this be in a separate commit or at least mentioned in the description?

Signed-off-by: Aleksandar Stanchev <[email protected]>
Member

@thjaeckle thjaeckle left a comment


@alstanchev I left some comments - Regarding the actor field changes outside of the actor thread we must IMO be really careful. That is something which will eventually fail - with very strange behavior ;)

  } else {
-     migrationResult = deleteProgress().thenCompose(v ->
+     migrationResult = progressTracker.deleteProgress().thenCompose(v ->
              runMigration(new MigrationProgress(), oldKey, newKey, pointers, dryRun));
Member


oldKey and newKey might be null and must therefore be specified as @Nullable in the runMigration method

          ? "no previous migration found" : "previous migration already completed";
- LOG.info("Resume requested but {}, nothing to do", reason);
+ log.info("Resume requested but {}, nothing to do", reason);
  migrationInProgress = false;
Member


Still this modifies the actor state on a thread other than the actor thread - which might lead to unexpected behavior.

If you are sure this is no problem in this case, it could be acceptable - but it should be documented then.

migrationInProgress = false;
final MigrationProgress completed = optProgress.orElseGet(MigrationProgress::new)
.withPhase(PHASE_COMPLETED);
currentProgress = completed;
Member


Still this modifies the actor state on a thread other than the actor thread - which might lead to unexpected behavior.

If you are sure this is no problem in this case, it could be acceptable - but it should be documented then.

Member


Writing migrationInProgress and currentProgress here races with handleStatus(), handleAbort(), and handleMigration() on the actor thread.

Member


Suggestion: Pipe the "nothing to resume" result back to the actor via self.tell() with a dedicated message (similar to MigrationCompleted), and handle the state transition on the actor thread.


private boolean migrationInProgress = false;
private boolean currentDryRun = false;
private volatile boolean abortRequested = false;
Member


What should the volatile do in the context of an actor?

Likely the same "smell" of accessing this field not always via the actor thread, but across several threads.

This pattern screams a little for concurrency issues coming up and is concerning.


- startChildActor(EncryptionMigrationActor.ACTOR_NAME,
-         EncryptionMigrationActor.props(connectivityConfig));
+ startEncryptionMigrationSingleton(actorSystem, connectivityConfig);
Member


startEncryptionMigrationSingleton() is called unconditionally, which means every deployment — including those that never use encryption — pays the cost of:

  • A ClusterSingletonManager + ClusterSingletonProxy actor pair
  • A dedicated MongoClientWrapper with its own connection pool


private void startEncryptionMigrationSingleton(final ActorSystem actorSystem,
final ConnectivityConfig connectivityConfig) {
final MongoClientWrapper mongoClientWrapper =
Member


This will also create its own connection pool, based on the min/max config of the service.

This is a little much - couldn't we reuse the existing MongoClientWrapper, via MongoClientExtension?

value: "{{ .Values.connectivity.config.connections.kafka.producer.parallelism }}"
- name: PEKKO_HTTP_HOSTPOOL_MAX_CONNECTION_LIFETIME
value: "{{ .Values.connectivity.config.connections.httpPush.maxConnectionLifetime }}"
{{- if .Values.connectivity.config.connections.encryption.enabled }}
Member


disable-encryption workflow won't get the old key

The encryption env vars are only set inside {{- if .Values.connectivity.config.connections.encryption.enabled }}.

But the "disable encryption" workflow requires encryption.enabled = false with an old key set. A Helm user following the documented disable workflow would need to manually add the old key env var outside the Helm chart, which is error-prone.
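
One possible fix, sketched as a Helm template fragment: guard the old-key env var on the key value being set rather than on encryption.enabled. The `oldSymmetricalKey` value name is an assumption for illustration; only the `encryption.enabled` key is confirmed by the existing chart.

```yaml
# Illustrative template change (value name is hypothetical):
{{- if .Values.connectivity.config.connections.encryption.oldSymmetricalKey }}
- name: CONNECTIVITY_CONNECTION_OLD_ENCRYPTION_KEY
  value: "{{ .Values.connectivity.config.connections.encryption.oldSymmetricalKey }}"
{{- end }}
```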


Projects

Status: Waiting for Approval


2 participants