| 1 | +--- |
| 2 | +id: key-state-behavior |
| 3 | +title: "Key state behavior" |
| 4 | +description: "Understand how Camunda 8 SaaS behaves when an external AWS KMS key used for BYOK is disabled, scheduled for deletion, deleted, or misconfigured." |
| 5 | +keywords: |
| 6 | + [BYOK, encryption, KMS, key disabled, key deletion, SaaS, troubleshooting] |
| 7 | +--- |
| 8 | + |
| 9 | +Learn how Camunda 8 SaaS responds when your external AWS KMS encryption key (BYOK) becomes unavailable during cluster startup or runtime. |
| 10 | + |
| 11 | +This page applies only to **external customer-managed keys** (AWS KMS). For encryption fundamentals, see the [encryption overview](/components/saas/byok/index.md). |
| 12 | + |
| 13 | +:::warning |
| 14 | +If your external encryption key is disabled, deleted, or its permissions are revoked, your cluster becomes inaccessible and may enter a frozen state. Camunda cannot recover encrypted data if the key is permanently deleted. |
| 15 | +::: |
| 16 | + |
| 17 | +## Key state summary |
| 18 | + |
| 19 | +The table below provides a high-level overview of how different KMS key states affect cluster startup, operation, and data availability. |
| 20 | + |
| 21 | +| Key state | Cluster startup | Cluster runtime | Data accessible? | Recovery possible? | |
| 22 | +| -------------------------- | ----------------- | --------------------------------------------- | ---------------- | ----------------------------- | |
| 23 | +| **Enabled** | ✔ Starts normally | ✔ Operates normally | Yes | Not needed | |
| 24 | +| **Disabled** | ❌ Cannot start | ❌ Freezes: no reads/writes, operations hang | No | ✔ Re-enable key | |
| 25 | +| **Scheduled for deletion** | ❌ Cannot start | ❌ Same as disabled | No | ✔ Cancel deletion + re-enable | |
| 26 | +| **Permanently deleted** | ❌ Cannot start | ❌ Cluster remains non-functional permanently | No | ❌ No — encrypted data lost | |
| 27 | +| **Incorrect key policy** | ❌ Cannot start | ❌ Behaves like disabled key | No | ✔ Fix policy | |
| 28 | + |
| 29 | +## What happens when a key becomes unavailable? |
| 30 | + |
| 31 | +When a cluster loses access to its KMS key: |
| 32 | + |
| 33 | +- All encryption/decryption requests fail immediately. |
| 34 | +- Zeebe, Elasticsearch, and backup operations **freeze**. |
| 35 | +- Console may still show the cluster as **Healthy**, even though no work can proceed. |
| 36 | +- Within ~15 minutes, the **Encryption at rest** panel displays: |
| 37 | + > **External encryption key is not ready** |
| 38 | + |
| 39 | +### Timeline of effects |
| 40 | + |
| 41 | +1. **Immediate (0–1s):** Key becomes inaccessible. |
| 42 | +2. **Seconds:** Storage reads/writes hang. |
| 43 | +3. **Backup jobs:** Become stuck “In progress” indefinitely. |
| 44 | +4. **Suspend/Resume:** Requests appear accepted but never execute. |
| 45 | +5. **Console status:** May incorrectly continue to show “Healthy.” |
| 46 | +6. **After re-enabling:** Automatic recovery occurs, but timing depends on reconciliation and exponential backoff. |
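Step 6 mentions exponential backoff. The sketch below illustrates why recovery can lag behind re-enabling the key: when retries are spaced exponentially, the next attempt after a long outage may still be some time away. The base delay, growth factor, and cap are hypothetical values chosen for illustration, not Camunda's actual reconciliation settings.

```python
def backoff_delays(base_s=1.0, factor=2.0, cap_s=300.0, attempts=10):
    """Return the wait before each retry: base_s * factor**n, capped at cap_s."""
    return [min(base_s * factor ** n, cap_s) for n in range(attempts)]


def wait_after_reenable(outage_s, base_s=1.0, factor=2.0, cap_s=300.0):
    """Seconds between re-enabling the key and the next scheduled retry.

    Assumes retries started when the key became unavailable (t = 0) and
    the key was re-enabled outage_s seconds later.
    """
    t = 0.0
    delay = base_s
    while t < outage_s:
        t += delay                      # time of the next retry
        delay = min(delay * factor, cap_s)
    return t - outage_s
```

With these illustrative parameters, retries fall at 1 s, 3 s, 7 s, and 15 s, so a key re-enabled after 10 s would wait another 5 s for the next attempt. Longer outages push the gap toward the cap.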
| 47 | + |
| 48 | +## Component-level impact |
| 49 | + |
| 50 | +| Component/feature | Requires key for | Behavior when key unavailable | |
| 51 | +| ---------------------- | ------------------------------- | ----------------------------------------- | |
| 52 | +| **Zeebe brokers** | State storage (encrypted disks) | Execution freezes; no read/write activity | |
| 53 | +| **Elasticsearch** | Persistent disk encryption | Indexing and queries freeze | |
| 54 | +| **Backups** | Encrypting/decrypting snapshots | Backup requests hang indefinitely |
| 55 | +| **Restore operations** | Decrypting snapshots | Restore cannot proceed | |
| 56 | +| **Document storage** | Encrypting stored files | Document reads/writes freeze | |
| 57 | +| **Suspend / resume** | Changing cluster state | Request is logged but not executed | |
| 58 | + |
| 59 | +## Behavior by key lifecycle state |
| 60 | + |
| 61 | +### Disabled key |
| 62 | + |
| 63 | +- Cluster cannot start. |
| 64 | +- If cluster was running: |
| 65 | + - All operations freeze. |
| 66 | + - Suspend/resume does not complete. |
| 67 | + - Backup operations get stuck. |
| 68 | +- Console eventually shows: **External encryption key is not ready** |
| 69 | + |
| 70 | +**Recovery:** |
| 71 | +Re-enable the KMS key. The cluster resumes automatically, but recovery takes longer the longer the key was disabled. |
| 72 | + |
| 73 | +### Key scheduled for deletion |
| 74 | + |
| 75 | +Scheduling deletion moves the key to the AWS KMS **Pending deletion** state, in which it cannot be used for cryptographic operations. |
| 76 | + |
| 77 | +Behavior is identical to a disabled key. |
| 78 | + |
| 79 | +**Recovery:** |
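| 80 | +Cancel the scheduled deletion, then re-enable the key. AWS KMS leaves the key in the **Disabled** state after a cancellation, so both steps are required. |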
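The recovery steps for disabled and pending-deletion keys can be sketched in Python as follows. This is a minimal sketch, assuming a boto3 KMS client (`boto3.client("kms")`) whose caller is allowed to call `DescribeKey`, `CancelKeyDeletion`, and `EnableKey`. The function name `restore_key` is hypothetical.

```python
def restore_key(kms, key_id):
    """Bring a Disabled or PendingDeletion KMS key back to Enabled.

    `kms` is expected to behave like boto3.client("kms").
    Returns the key state reported after the attempt.
    """
    state = kms.describe_key(KeyId=key_id)["KeyMetadata"]["KeyState"]

    if state == "PendingDeletion":
        # Cancelling deletion leaves the key Disabled, so it still needs enabling.
        kms.cancel_key_deletion(KeyId=key_id)
        state = "Disabled"

    if state == "Disabled":
        kms.enable_key(KeyId=key_id)

    # Re-read the state rather than assuming the calls succeeded.
    return kms.describe_key(KeyId=key_id)["KeyMetadata"]["KeyState"]
```

For example, `restore_key(boto3.client("kms"), key_id)` with your key ID or ARN. Once the key reports `Enabled`, the cluster is expected to recover on its own, as described for the disabled key.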
| 81 | + |
| 82 | +### Permanently deleted key |
| 83 | + |
| 84 | +Once the AWS deletion waiting period (7 to 30 days, 30 by default) passes: |
| 85 | + |
| 86 | +- The key is irrecoverable. |
| 87 | +- The cluster becomes permanently unusable. |
| 88 | +- No encrypted data can be recovered. |
| 89 | + |
| 90 | +**Recovery:** |
| 91 | +Not possible. Create a new cluster. |
| 92 | + |
| 93 | +### Incorrect or missing key policy |
| 94 | + |
| 95 | +If the KMS policy does not grant Camunda's AWS Role the required permissions: |
| 96 | + |
| 97 | +- Cluster cannot start or becomes frozen. |
| 98 | +- Behavior mirrors a disabled key. |
| 99 | + |
| 100 | +**Recovery:** |
| 101 | +Update the KMS key policy using the Tenant Role ARN displayed in Console. |
| 102 | + |
| 103 | +## Error handling and user-visible messages |
| 104 | + |
| 105 | +Currently, Camunda displays a single unified message: |
| 106 | + |
| 107 | +```text |
| 108 | +External encryption key is not ready |
| 109 | +``` |
| 110 | + |
| 111 | +More granular messaging is planned for future iterations. |
| 112 | + |
| 113 | +## Operator responsibilities and best practices |
| 114 | + |
| 115 | +### Customer responsibilities (BYOK model) |
| 116 | + |
| 117 | +- Maintain and monitor the key lifecycle in AWS. |
| 118 | +- Monitor for disable/delete events (CloudWatch & EventBridge). |
| 119 | +- Ensure policies remain correct. |
| 120 | +- Understand that deleting or disabling the key freezes the cluster. |
| 121 | + |
| 122 | +### Recommended monitoring |
| 123 | + |
| 124 | +Activate: |
| 125 | + |
| 126 | +- **CloudWatch alarms** for key disabled, scheduled deletion, access denied |
| 127 | +- **EventBridge** for policy changes |
| 128 | +- **CloudTrail** for Encrypt/Decrypt failures and audit logs |
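As one possible starting point for the monitoring above, the sketch below builds an EventBridge event pattern that matches KMS management calls recorded by CloudTrail. It assumes CloudTrail management events are delivered to EventBridge in your account. The function name and the choice of API calls to watch are illustrative, not a Camunda recommendation, and the pattern matches all keys in the account rather than only the BYOK key.

```python
import json


def kms_key_event_pattern(event_names=("DisableKey", "ScheduleKeyDeletion", "PutKeyPolicy")):
    """Return an EventBridge event pattern (JSON string) for KMS API calls.

    The default event names cover disabling a key, scheduling its deletion,
    and changing its key policy.
    """
    pattern = {
        "source": ["aws.kms"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["kms.amazonaws.com"],
            "eventName": list(event_names),
        },
    }
    return json.dumps(pattern, indent=2)


if __name__ == "__main__":
    # Print the pattern so it can be used as the event pattern of an EventBridge rule.
    print(kms_key_event_pattern())
```

A rule using this pattern can target an SNS topic or another notification channel so that key changes are noticed before the cluster freezes.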
| 129 | + |
| 130 | +## Related documentation |
| 131 | + |
| 132 | +- [Encryption overview](/components/saas/byok/index.md) |
| 133 | +- [External encryption setup guide](/components/saas/byok/aws-kms-setup.md) |
| 134 | +- [FAQ & troubleshooting](/components/saas/byok/faq-and-troubleshooting.md) |
| 135 | +- [Key rotation and audit logging](/components/saas/byok/key-rotation-audit-logging.md) |