Commit 754dfb2

.NET: Add TTLs to durable agent sessions (#2679)
* .NET: Add TTLs to durable agent sessions
* Remove unnecessary async
* PR feedback: clarify UTC
* PR feedback: limit minimum signal delay to <= 5 minutes
* PR feedback: Fix TTL disablement
* Linter: use auto-property
* Fix build break from OpenAI SDK change
* Updated CHANGELOG.md
* PR feedback
* Reduce default TTL to 14 days to work around DTS bug
1 parent b15466f commit 754dfb2

9 files changed

Lines changed: 583 additions & 12 deletions

Lines changed: 147 additions & 0 deletions
@@ -0,0 +1,147 @@
1+
# Time-To-Live (TTL) for durable agent sessions
2+
3+
## Overview
4+
5+
Durable agents automatically maintain conversation history and state for each session. Without cleanup, this state accumulates indefinitely, consuming storage and increasing costs. The Time-To-Live (TTL) feature deletes idle agent sessions automatically after a configurable period of inactivity.
6+
7+
## What is TTL?
8+
9+
Time-To-Live (TTL) is a configurable duration that determines how long an agent session state will be retained after its last interaction. When an agent session is idle (no messages sent to it) for longer than the TTL period, the session state is automatically deleted. Each new interaction with an agent resets the TTL timer, extending the session's lifetime.
10+
11+
## Benefits
12+
13+
- **Automatic cleanup**: No manual intervention required to clean up idle agent sessions
14+
- **Cost optimization**: Reduces storage costs by automatically removing unused session state
15+
- **Resource management**: Prevents unbounded growth of agent session state in storage
16+
- **Configurable**: Set TTL globally or per-agent type to match your application's needs
17+
18+
## Configuration
19+
20+
TTL can be configured at two levels:
21+
22+
1. **Global default TTL**: Applies to all agent sessions unless overridden
23+
2. **Per-agent type TTL**: Overrides the global default for specific agent types
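
The precedence between the two levels can be pictured with a minimal sketch. The dictionary, the `ResolveTtl` helper, and the agent names below are hypothetical stand-ins for illustration only, not the library's internal implementation; the sketch assumes a per-agent entry (including an explicit `null`) always wins over the global default.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the real options object: a global default plus
// per-agent overrides. A key present with a null value means "TTL disabled".
TimeSpan? defaultTtl = TimeSpan.FromDays(14);
var perAgentTtl = new Dictionary<string, TimeSpan?>
{
    ["shortLivedAgent"] = TimeSpan.FromDays(1),
    ["permanentAgent"] = null, // explicit null disables TTL for this agent
};

// Resolve the effective TTL: a per-agent entry (even null) overrides the global default.
TimeSpan? ResolveTtl(string agentName) =>
    perAgentTtl.TryGetValue(agentName, out TimeSpan? overrideTtl) ? overrideTtl : defaultTtl;

Console.WriteLine(ResolveTtl("shortLivedAgent"));        // 1.00:00:00
Console.WriteLine(ResolveTtl("defaultAgent"));           // 14.00:00:00
Console.WriteLine(ResolveTtl("permanentAgent") is null); // True
```

Keeping "no override" (key absent) distinct from "override with null" (key present) is what lets a single agent opt out of TTL while the global default stays in force for the rest.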
24+
25+
Additionally, you can configure a **minimum deletion delay**: the shortest interval the runtime waits before running a scheduled deletion check. The default value is 5 minutes, which is also the maximum allowed value.
26+
27+
> [!NOTE]
28+
> Reducing the minimum deletion delay below 5 minutes can be useful for testing or for ensuring rapid cleanup of short-lived agent sessions. However, this can also increase the load on the system and should be used with caution.
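
As a rough illustration of how the minimum delay interacts with the expiration time, the sketch below clamps the scheduled check so it never runs sooner than the minimum delay from now. `ComputeScheduledTime` is a hypothetical helper written for this example; it follows the comparison used by `ScheduleDeletionCheck` in this commit but is not the library's API.

```csharp
using System;
using System.Globalization;

// Never schedule a deletion check sooner than `minimumDelay` from now,
// even if the session's expiration time is earlier than that.
DateTime ComputeScheduledTime(DateTime nowUtc, DateTime expirationUtc, TimeSpan minimumDelay)
{
    DateTime earliest = nowUtc.Add(minimumDelay);
    return expirationUtc > earliest ? expirationUtc : earliest;
}

DateTime now = new(2025, 1, 1, 12, 0, 0, DateTimeKind.Utc);
TimeSpan minDelay = TimeSpan.FromMinutes(5);

// TTL of 1 minute: the check is pushed out to now + 5 minutes.
Console.WriteLine(ComputeScheduledTime(now, now.AddMinutes(1), minDelay)
    .ToString("HH:mm", CultureInfo.InvariantCulture)); // 12:05

// TTL of 1 hour: the check runs at the expiration time itself.
Console.WriteLine(ComputeScheduledTime(now, now.AddHours(1), minDelay)
    .ToString("HH:mm", CultureInfo.InvariantCulture)); // 13:00
```

One consequence under this sketch: a TTL shorter than the minimum delay does not delete state any sooner than the minimum delay allows.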
29+
30+
### Default values
31+
32+
- **Default TTL**: 14 days
33+
- **Minimum TTL deletion delay**: 5 minutes (maximum allowed value, subject to change in future releases)
34+
35+
### Configuration examples
36+
37+
#### .NET
38+
39+
```csharp
40+
// Configure global default TTL
41+
services.ConfigureDurableAgents(
42+
options =>
43+
{
44+
// Set global default TTL to 7 days
45+
options.DefaultTimeToLive = TimeSpan.FromDays(7);
46+
47+
// Add agents (will use global default TTL)
48+
options.AddAIAgent(myAgent);
49+
});
50+
51+
// Configure per-agent TTL
52+
services.ConfigureDurableAgents(
53+
options =>
54+
{
55+
options.DefaultTimeToLive = TimeSpan.FromDays(14); // Global default
56+
57+
// Agent with custom TTL of 1 day
58+
options.AddAIAgent(shortLivedAgent, timeToLive: TimeSpan.FromDays(1));
59+
60+
// Agent with custom TTL of 90 days
61+
options.AddAIAgent(longLivedAgent, timeToLive: TimeSpan.FromDays(90));
62+
63+
// Agent using global default (14 days)
64+
options.AddAIAgent(defaultAgent);
65+
});
66+
67+
// Disable TTL for specific agents by setting TTL to null
68+
services.ConfigureDurableAgents(
69+
options =>
70+
{
71+
options.DefaultTimeToLive = TimeSpan.FromDays(14);
72+
73+
// Agent with no TTL (never expires)
74+
options.AddAIAgent(permanentAgent, timeToLive: null);
75+
});
76+
```
77+
78+
## How TTL works
79+
80+
The following sections describe how TTL works in detail.
81+
82+
### Expiration tracking
83+
84+
Each agent session maintains an expiration timestamp in its internally managed state that is updated whenever the session processes a message:
85+
86+
1. When a message is sent to an agent session, the expiration time is set to `current time + TTL`
87+
2. On the first interaction, the runtime schedules a delete operation for the expiration time (subject to minimum delay constraints). Later interactions only update the expiration time.
88+
3. When the delete operation runs, if the current time is past the expiration time, the session state is deleted. Otherwise, the delete operation is rescheduled for the next expiration time.
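
The steps above can be sketched as a small in-memory simulation. The names `OnMessage` and `OnDeletionCheck` are hypothetical and exist only for this example; the real entity stores its expiration timestamp in durable state and reschedules itself through a delayed signal rather than returning a value.

```csharp
using System;

TimeSpan ttl = TimeSpan.FromDays(30);
DateTime start = new(2025, 1, 1, 0, 0, 0, DateTimeKind.Utc);
DateTime? expirationUtc = null; // null means "no session state"

// Step 1: each processed message pushes the expiration out to now + TTL.
void OnMessage(DateTime nowUtc) => expirationUtc = nowUtc.Add(ttl);

// Step 3: the scheduled check deletes expired state; otherwise it reports
// when the next check should run. Returns null if the state was deleted.
DateTime? OnDeletionCheck(DateTime nowUtc)
{
    if (expirationUtc is null) return null;
    if (nowUtc >= expirationUtc.Value)
    {
        expirationUtc = null; // session state deleted
        return null;
    }
    return expirationUtc; // not expired yet: check again at the new expiration time
}

OnMessage(start);                                     // expiration = Day 30
OnMessage(start.AddDays(15));                         // expiration = Day 45
DateTime? next = OnDeletionCheck(start.AddDays(30));  // not expired, reschedule
Console.WriteLine((next!.Value - start).TotalDays);   // 45
Console.WriteLine(OnDeletionCheck(start.AddDays(45)) is null); // True
```

Because the check always compares against the latest stored expiration time, a single scheduled check per session is enough no matter how many messages arrive in between.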
89+
90+
### State deletion
91+
92+
When an agent session expires, its entire state is deleted, including:
93+
94+
- Conversation history
95+
- Any custom state data
96+
- Expiration timestamps
97+
98+
After deletion, if a message is sent to the same agent session, a new session is created with a fresh conversation history.
99+
100+
## Behavior examples
101+
102+
The following examples illustrate how TTL works in different scenarios.
103+
104+
### Example 1: Agent session expires after TTL
105+
106+
1. Agent configured with 30-day TTL
107+
2. User sends message at Day 0 → agent session created, expiration set to Day 30
108+
3. No further messages sent
109+
4. At Day 30 → Agent session is deleted
110+
5. User sends message at Day 31 → New agent session created with fresh conversation history
111+
112+
### Example 2: TTL reset on interaction
113+
114+
1. Agent configured with 30-day TTL
115+
2. User sends message at Day 0 → agent session created, expiration set to Day 30
116+
3. User sends message at Day 15 → Expiration reset to Day 45
117+
4. User sends message at Day 40 → Expiration reset to Day 70
118+
5. Agent session remains active as long as there are regular interactions
119+
120+
## Logging
121+
122+
The TTL feature includes comprehensive logging to track state changes:
123+
124+
- **Expiration time updated**: Logged when TTL expiration time is set or updated
125+
- **Deletion scheduled**: Logged when a deletion check signal is scheduled
126+
- **Deletion check**: Logged when a deletion check operation runs
127+
- **Session expired**: Logged when an agent session is deleted due to expiration
128+
- **TTL rescheduled**: Logged when a deletion signal is rescheduled
129+
130+
These logs help monitor TTL behavior and troubleshoot any issues.
131+
132+
## Best practices
133+
134+
1. **Choose appropriate TTL values**: Balance storage costs against user experience. TTLs that are too short may delete sessions that users still expect to resume, while TTLs that are too long let unneeded state accumulate.
135+
136+
2. **Use per-agent TTLs**: Different agents may have different usage patterns. Configure TTLs per-agent based on expected session lifetimes.
137+
138+
3. **Monitor expiration logs**: Review logs to understand TTL behavior and adjust configuration as needed.
139+
140+
4. **Test with short TTLs**: During development, use short TTLs (e.g., minutes) to verify TTL behavior without waiting for long periods.
141+
142+
## Limitations
143+
144+
- TTL is measured in wall-clock (UTC) time elapsed since the last message, not in accumulated activity time.
145+
- Deletion checks are durably scheduled operations and may have slight delays depending on system load.
146+
- Once an agent session is deleted, its conversation history cannot be recovered.
147+
- TTL deletion requires at least one worker to be available to process the deletion operation message.

dotnet/samples/AzureFunctions/01_SingleAgent/Program.cs

Lines changed: 1 addition & 1 deletion
@@ -32,6 +32,6 @@
3232
using IHost app = FunctionsApplication
3333
.CreateBuilder(args)
3434
.ConfigureFunctionsWebApplication()
35-
.ConfigureDurableAgents(options => options.AddAIAgent(agent))
35+
.ConfigureDurableAgents(options => options.AddAIAgent(agent, timeToLive: TimeSpan.FromHours(1)))
3636
.Build();
3737
app.Run();

dotnet/src/Microsoft.Agents.AI.DurableTask/AgentEntity.cs

Lines changed: 108 additions & 9 deletions
@@ -16,29 +16,24 @@ internal class AgentEntity(IServiceProvider services, CancellationToken cancella
1616
private readonly DurableTaskClient _client = services.GetRequiredService<DurableTaskClient>();
1717
private readonly ILoggerFactory _loggerFactory = services.GetRequiredService<ILoggerFactory>();
1818
private readonly IAgentResponseHandler? _messageHandler = services.GetService<IAgentResponseHandler>();
19+
private readonly DurableAgentsOptions _options = services.GetRequiredService<DurableAgentsOptions>();
1920
private readonly CancellationToken _cancellationToken = cancellationToken != default
2021
? cancellationToken
2122
: services.GetService<IHostApplicationLifetime>()?.ApplicationStopping ?? CancellationToken.None;
2223

2324
public async Task<AgentRunResponse> RunAgentAsync(RunRequest request)
2425
{
2526
AgentSessionId sessionId = this.Context.Id;
26-
IReadOnlyDictionary<string, Func<IServiceProvider, AIAgent>> agents =
27-
this._services.GetRequiredService<IReadOnlyDictionary<string, Func<IServiceProvider, AIAgent>>>();
28-
if (!agents.TryGetValue(sessionId.Name, out Func<IServiceProvider, AIAgent>? agentFactory))
29-
{
30-
throw new InvalidOperationException($"Agent '{sessionId.Name}' not found");
31-
}
32-
33-
AIAgent agent = agentFactory(this._services);
27+
AIAgent agent = this.GetAgent(sessionId);
3428
EntityAgentWrapper agentWrapper = new(agent, this.Context, request, this._services);
3529

3630
// Logger category is Microsoft.DurableTask.Agents.{agentName}.{sessionId}
37-
ILogger logger = this._loggerFactory.CreateLogger($"Microsoft.DurableTask.Agents.{agent.Name}.{sessionId.Key}");
31+
ILogger logger = this.GetLogger(agent.Name!, sessionId.Key);
3832

3933
if (request.Messages.Count == 0)
4034
{
4135
logger.LogInformation("Ignoring empty request");
36+
return new AgentRunResponse();
4237
}
4338

4439
this.State.Data.ConversationHistory.Add(DurableAgentStateRequest.FromRunRequest(request));
@@ -113,6 +108,36 @@ async IAsyncEnumerable<AgentRunResponseUpdate> StreamResultsAsync()
113108
response.Usage?.TotalTokenCount);
114109
}
115110

111+
// Update TTL expiration time. Only schedule deletion check on first interaction.
112+
// Subsequent interactions just update the expiration time; CheckAndDeleteIfExpired
113+
// will reschedule the deletion check when it runs.
114+
TimeSpan? timeToLive = this._options.GetTimeToLive(sessionId.Name);
115+
if (timeToLive.HasValue)
116+
{
117+
DateTime newExpirationTime = DateTime.UtcNow.Add(timeToLive.Value);
118+
bool isFirstInteraction = this.State.Data.ExpirationTimeUtc is null;
119+
120+
this.State.Data.ExpirationTimeUtc = newExpirationTime;
121+
logger.LogTTLExpirationTimeUpdated(sessionId, newExpirationTime);
122+
123+
// Only schedule deletion check on the first interaction when entity is created.
124+
// On subsequent interactions, we just update the expiration time. The scheduled
125+
// CheckAndDeleteIfExpired will reschedule itself if the entity hasn't expired.
126+
if (isFirstInteraction)
127+
{
128+
this.ScheduleDeletionCheck(sessionId, logger, timeToLive.Value);
129+
}
130+
}
131+
else
132+
{
133+
// TTL is disabled. Clear the expiration time if it was previously set.
134+
if (this.State.Data.ExpirationTimeUtc.HasValue)
135+
{
136+
logger.LogTTLExpirationTimeCleared(sessionId);
137+
this.State.Data.ExpirationTimeUtc = null;
138+
}
139+
}
140+
116141
return response;
117142
}
118143
finally
@@ -121,4 +146,78 @@ async IAsyncEnumerable<AgentRunResponseUpdate> StreamResultsAsync()
121146
DurableAgentContext.ClearCurrent();
122147
}
123148
}
149+
150+
/// <summary>
151+
/// Checks if the entity has expired and deletes it if so, otherwise reschedules the deletion check.
152+
/// </summary>
153+
/// <remarks>
154+
/// This method is called by the durable task runtime when a <c>CheckAndDeleteIfExpired</c> signal is received.
155+
/// </remarks>
156+
public void CheckAndDeleteIfExpired()
157+
{
158+
AgentSessionId sessionId = this.Context.Id;
159+
AIAgent agent = this.GetAgent(sessionId);
160+
ILogger logger = this.GetLogger(agent.Name!, sessionId.Key);
161+
162+
DateTime currentTime = DateTime.UtcNow;
163+
DateTime? expirationTime = this.State.Data.ExpirationTimeUtc;
164+
165+
logger.LogTTLDeletionCheck(sessionId, expirationTime, currentTime);
166+
167+
if (expirationTime.HasValue)
168+
{
169+
if (currentTime >= expirationTime.Value)
170+
{
171+
// Entity has expired, delete it
172+
logger.LogTTLEntityExpired(sessionId, expirationTime.Value);
173+
this.State = null!;
174+
}
175+
else
176+
{
177+
// Entity hasn't expired yet, reschedule the deletion check
178+
TimeSpan? timeToLive = this._options.GetTimeToLive(sessionId.Name);
179+
if (timeToLive.HasValue)
180+
{
181+
this.ScheduleDeletionCheck(sessionId, logger, timeToLive.Value);
182+
}
183+
}
184+
}
185+
}
186+
187+
private void ScheduleDeletionCheck(AgentSessionId sessionId, ILogger logger, TimeSpan timeToLive)
188+
{
189+
DateTime currentTime = DateTime.UtcNow;
190+
DateTime expirationTime = this.State.Data.ExpirationTimeUtc ?? currentTime.Add(timeToLive);
191+
TimeSpan minimumDelay = this._options.MinimumTimeToLiveSignalDelay;
192+
193+
// To avoid excessive scheduling, we schedule the deletion check for no less than the minimum delay.
194+
DateTime scheduledTime = expirationTime > currentTime.Add(minimumDelay)
195+
? expirationTime
196+
: currentTime.Add(minimumDelay);
197+
198+
logger.LogTTLDeletionScheduled(sessionId, scheduledTime);
199+
200+
// Schedule a signal to self to check for expiration
201+
this.Context.SignalEntity(
202+
this.Context.Id,
203+
nameof(CheckAndDeleteIfExpired), // self-signal
204+
options: new SignalEntityOptions { SignalTime = scheduledTime });
205+
}
206+
207+
private AIAgent GetAgent(AgentSessionId sessionId)
208+
{
209+
IReadOnlyDictionary<string, Func<IServiceProvider, AIAgent>> agents =
210+
this._services.GetRequiredService<IReadOnlyDictionary<string, Func<IServiceProvider, AIAgent>>>();
211+
if (!agents.TryGetValue(sessionId.Name, out Func<IServiceProvider, AIAgent>? agentFactory))
212+
{
213+
throw new InvalidOperationException($"Agent '{sessionId.Name}' not found");
214+
}
215+
216+
return agentFactory(this._services);
217+
}
218+
219+
private ILogger GetLogger(string agentName, string sessionKey)
220+
{
221+
return this._loggerFactory.CreateLogger($"Microsoft.DurableTask.Agents.{agentName}.{sessionKey}");
222+
}
124223
}

dotnet/src/Microsoft.Agents.AI.DurableTask/CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
11
# Release History
22

3+
## [Unreleased]
4+
5+
- Added TTL configuration for durable agent entities ([#2679](https://github.com/microsoft/agent-framework/pull/2679))
6+
37
## v1.0.0-preview.251204.1
48

59
- Added orchestration ID to durable agent entity state ([#2137](https://github.com/microsoft/agent-framework/pull/2137))
