60 changes: 60 additions & 0 deletions samples/K8s_Ephemeral_Containers/README.md
@@ -0,0 +1,60 @@
# Running dotnet-monitor as an Ephemeral Container in Kubernetes

Running `dotnet-monitor` as an ephemeral container lets you attach diagnostics tooling to a live .NET workload only when you need it—without permanent resource, security, or operational overhead. Instead of baking tools into each application image or running a sidecar continuously, you temporarily inject a container to collect dumps, traces, logs, metrics, or other artifacts (even from hung or crash-looping processes) and then let it disappear.

## Why use an ephemeral container?
* On-demand: No steady-state CPU/memory cost; start only for investigations.
* Lightweight images: Keep app container images free of extra tooling.
* Smaller attack surface: Elevated permissions and tooling exist for minutes, not the lifetime of the pod.
* Post-mortem access: Attach after failures or while the target process is unresponsive.
* Version independence: Use the latest `dotnet-monitor` image regardless of app version.
* Consistent workflow: Same injection procedure across all pods; no pre-provisioned sidecars.
* Cost-aware: Fewer always-on containers reduce baseline resource usage.

## Prerequisites
1. Kubernetes v1.25 or newer (ephemeral containers stable).
2. Target pod created with required env vars, volume, and volume mounts. See example [template](./_dotnetmonitor.tpl).

## Inject `dotnet-monitor` into a Pod
Prepare a [config file](config.yaml) whose values match the target's deployment, just as the sample config matches our [example template](./_dotnetmonitor.tpl). This step is performed once per pod lifetime; the ephemeral container persists until the pod restarts.

Review comment (Member): Let's remove the tpl file references here. You can point to our other docs regarding the app's environment/mounting configuration.

```bash
Namespace="<target pod namespace>"
Pod="<target pod>"
AppContainer="<target container app>"
ConfigFile="./config.yaml"
MonitorPort=52323

kubectl debug -n "$Namespace" "pod/$Pod" \
  --image "mcr.microsoft.com/dotnet/monitor:8.0" \
  --container "debugger" \
  --target "$AppContainer" \
  --profile "general" \
  --custom "$ConfigFile"
```

Review comment (Member), on the `--image` line: Consider pointing to 10 since it just released.

Review comment (Member), on the `--target` line: Note that target is not strictly necessary since you are connecting over volume mounts anyway.
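
After injection, it can help to confirm that the ephemeral container actually started before moving on. The following is a minimal sketch, assuming the container name `debugger` from the injection command; the function name `monitor_container_state` is hypothetical, while `status.ephemeralContainerStatuses` is the pod status field that Kubernetes populates for ephemeral containers.

```bash
# Hypothetical helper: print the state of an ephemeral container in a pod.
# Arguments: namespace, pod name, ephemeral container name.
monitor_container_state() {
  local ns="$1" pod="$2" name="$3"
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath="{.status.ephemeralContainerStatuses[?(@.name==\"${name}\")].state}"
}

# Usage (assumes the variables from the injection step are still set):
#   monitor_container_state "$Namespace" "$Pod" "debugger"
```

A `running` entry in the output suggests the monitor is up; a `terminated` or `waiting` entry usually points at a problem with the image or the custom config.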

## Access the HTTP API
Port-forwarding is optional if you already have [collection rules](../../documentation/api/collectionrules.md) and [egress](../../documentation/egress.md) configured. Without them, forward the port so you can make ad-hoc requests against the HTTP API.

```bash
# Reuses Namespace, Pod, and MonitorPort from the injection step.
kubectl port-forward -n "$Namespace" "pod/$Pod" "${MonitorPort}:${MonitorPort}"
```
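
With the tunnel open, a quick request can confirm that `dotnet-monitor` is reachable and show which processes it can see. The sketch below assumes `--no-auth` is in effect (as in the sample config) and uses the `/processes` endpoint; the function name `list_monitor_processes` is made up for illustration.

```bash
# Hypothetical helper: list the processes dotnet-monitor has discovered.
# Argument: local port that is forwarded to the monitor (defaults to 52323).
list_monitor_processes() {
  local port="${1:-52323}"
  # --fail makes curl return non-zero on HTTP errors.
  curl -sS --fail "http://127.0.0.1:${port}/processes"
}

# Usage, in a second terminal while the port-forward is running:
#   list_monitor_processes "$MonitorPort"
```

The process IDs reported here are the values to pass as the `pid` query parameter on other endpoints.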

## Example: Collect a GC Dump
After port-forwarding, call the [HTTP API](../../documentation/api/README.md):

```bash
ProcessId=1
ts=$(date +'%Y%m%d_%H%M%S')
file="./diagnostics/gcdump_${ProcessId}_${ts}.gcdump"
uri="http://127.0.0.1:${MonitorPort}/gcdump?pid=${ProcessId}"
echo "[INFO] Collecting GC dump for PID ${ProcessId}" >&2
mkdir -p "$(dirname "$file")"
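# --fail makes curl exit non-zero on HTTP errors instead of saving the error body as a dump.
curl -sS --fail -H 'Accept: application/octet-stream' "$uri" -o "$file" \
  && echo "[INFO] Saved GC dump to $file" >&2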
```

## Next Steps
* Use other endpoints for traces (`/trace`), process dumps (`/dump`), or metrics.
* Configure secure [authentication](../../documentation/authentication.md).
* Automate common investigations with [collection rules](../../documentation/collectionrules/collectionrules.md) and [egress](../../documentation/egress.md) before incidents occur.
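
The other artifact endpoints follow the same request pattern as the GC dump example. Here is a rough sketch, assuming the `pid`, `durationSeconds`, and `type` query parameters behave as described in the HTTP API docs; `monitor_uri` is a hypothetical helper and not part of `dotnet-monitor`, and the output file names are placeholders.

```bash
# Hypothetical helper: build a dotnet-monitor request URI.
# Arguments: endpoint name, process ID, optional extra query string.
# Uses MonitorPort if set, otherwise falls back to 52323.
monitor_uri() {
  local endpoint="$1" pid="$2" extra="${3:-}"
  local uri="http://127.0.0.1:${MonitorPort:-52323}/${endpoint}?pid=${pid}"
  if [ -n "$extra" ]; then
    uri="${uri}&${extra}"
  fi
  printf '%s\n' "$uri"
}

# Usage (requires an active port-forward):
#   curl -sS --fail "$(monitor_uri trace 1 durationSeconds=30)" -o ./diagnostics/app.nettrace
#   curl -sS --fail "$(monitor_uri dump 1 type=Mini)" -o ./diagnostics/app.dmp
```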
14 changes: 14 additions & 0 deletions samples/K8s_Ephemeral_Containers/_dotnetmonitor.tpl
@@ -0,0 +1,14 @@
{{- define "acr_library.dotnet_monitor.env" -}}
- name: DOTNET_DiagnosticPorts
  value: /diag/dotnet-monitor.sock,nosuspend
{{- end -}}

{{- define "acr_library.dotnet_monitor.volume" -}}
- name: diagvol
  emptyDir: {}
{{- end -}}

{{- define "acr_library.dotnet_monitor.volumeMount" -}}
- name: diagvol
  mountPath: /diag
{{- end -}}
12 changes: 12 additions & 0 deletions samples/K8s_Ephemeral_Containers/config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
{
  "volumeMounts": [
    { "name": "diagvol", "mountPath": "/diag" }
  ],
  "env": [
    { "name": "DotnetMonitor_Urls", "value": "http://+:52323" },
    { "name": "DotnetMonitor_DiagnosticPort__ConnectionMode", "value": "Listen" },
    { "name": "DotnetMonitor_DiagnosticPort__EndpointName", "value": "/diag/dotnet-monitor.sock" },
    { "name": "DOTNETMONITOR_Storage__DefaultSharedPath", "value": "/diag" }
  ],
  "args": [ "collect", "--no-auth" ]
}