Replies: 1 comment
Practical patterns from building agents that run across months on top of LlamaIndex. Temporal validity is mostly application-layer in LlamaIndex today, but the index gives you enough primitives to build it cleanly without forking.

1. Treat each fact as a node with explicit validity metadata:

```python
from llama_index.core.schema import TextNode

node = TextNode(
    text="user budget is GBP 10k",
    metadata={
        "fact_key": "user.budget",   # canonical fact identity
        "valid_from": "2026-03-15",
        "valid_to": None,            # open-ended until superseded
        "superseded_by": None,
        "source_msg_id": "msg_4242",
    },
)
```

The `fact_key` gives every fact a stable identity to supersede against.

2. On write, run an explicit invalidation pass before insert:

```python
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters
from llama_index.core.vector_stores.types import VectorStoreQuery

def upsert_fact(index, new_node):
    key = new_node.metadata["fact_key"]
    today = new_node.metadata["valid_from"]
    # find any open-ended prior fact under the same key
    # (metadata-only queries and filtering on None are store-dependent;
    # a sentinel value like "open" is safer across backends)
    prior = index.vector_store.query(VectorStoreQuery(
        filters=MetadataFilters(filters=[
            ExactMatchFilter(key="fact_key", value=key),
            ExactMatchFilter(key="valid_to", value=None),
        ]),
    ))
    for p in prior.nodes:
        p.metadata["valid_to"] = today
        p.metadata["superseded_by"] = new_node.node_id
    # re-inserting under the same node_id upserts in most vector stores
    index.insert_nodes(list(prior.nodes) + [new_node])
```

Embedding-based "newer wins" alone is unreliable: embeddings of "GBP 5k" and "GBP 10k" sit in roughly the same neighbourhood, so retrieval ties on similarity and recency bias has to break the tie. The explicit supersedes chain makes invalidation deterministic instead.

3. Wrap retrieval in a node postprocessor that filters by validity window:

```python
from llama_index.core.postprocessor.types import BaseNodePostprocessor

class TemporalFilter(BaseNodePostprocessor):
    as_of: str

    def _postprocess_nodes(self, nodes, query_bundle=None):
        return [
            n for n in nodes
            if n.node.metadata.get("valid_from", "0") <= self.as_of
            and (n.node.metadata.get("valid_to") is None
                 or n.node.metadata["valid_to"] > self.as_of)
        ]

query_engine = index.as_query_engine(
    similarity_top_k=20,
    node_postprocessors=[TemporalFilter(as_of=today)],
)
```

Filtering after retrieval (not before) keeps it cheap at small top-k and avoids fighting the vector-store filter API for range queries, which most stores expose unevenly.

4. For session-scoped facts ("preparing for a conference next week"), attach a TTL and use a separate index you blow away on session end. Mixing ephemeral context into the same index as durable facts pollutes retrieval long after the session is over.

5. Recency bias is a tie-breaker, not a ranking strategy. If you weight heavily for recency, a newly stated but marginally relevant fact can outrank an older fact that is still valid and a better match.

The graph approach (your third bullet) works well for relational facts ("user.preferences.colour=blue"), less well for statement facts ("user said X on date Y"). PropertyGraphIndex is good if your facts naturally form a graph; KeywordTableIndex plus the supersedes chain above is simpler if they don't.

Recipe: validity metadata on every fact node, an explicit invalidation pass on write, an as-of postprocessor on read, and a separate throwaway index for session-scoped context.
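To make the read side of the supersedes chain concrete, here is a minimal sketch of resolving the fact valid at a given date, operating on plain dicts rather than retrieved nodes. `current_fact` is an illustrative helper, not LlamaIndex API:

```python
def current_fact(nodes, fact_key, as_of):
    """Return the fact valid at `as_of` for `fact_key`, or None.

    Dates are ISO-8601 strings, which compare chronologically as strings.
    """
    live = [
        n for n in nodes
        if n["fact_key"] == fact_key
        and n["valid_from"] <= as_of
        and (n["valid_to"] is None or n["valid_to"] > as_of)
    ]
    # the invalidation pass on write guarantees at most one open interval per key
    return live[0] if live else None
```

The half-open interval convention (`valid_from` inclusive, `valid_to` exclusive) matches the `TemporalFilter` above, so a fact superseded today stops surfacing today.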
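A sketch of point 4, the session-scoped store. `SessionFacts` and its methods are illustrative names, not LlamaIndex API; in practice the `facts` list would be a separate per-session `VectorStoreIndex` that you drop wholesale on session end:

```python
from datetime import datetime, timedelta

class SessionFacts:
    """Holds ephemeral facts apart from the durable fact index."""

    def __init__(self, ttl_days=7):
        self.ttl = timedelta(days=ttl_days)
        self.facts = []  # stand-in for a per-session VectorStoreIndex

    def add(self, text, now):
        self.facts.append({"text": text, "expires": now + self.ttl})

    def active(self, now):
        # TTL check at read time guards against sessions that never end cleanly
        return [f["text"] for f in self.facts if f["expires"] > now]

    def end_session(self):
        self.facts.clear()  # with a real index: drop the whole index
```

The TTL is a backstop, not the primary mechanism: the clean path is `end_session()`, and expiry only covers sessions that were abandoned mid-flight.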
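A sketch of point 5, recency as a tie-breaker rather than a ranking weight. `pick` and the `eps` threshold are illustrative, working on plain dicts of similarity scores and validity dates:

```python
def pick(candidates, eps=0.02):
    """Best-scoring fact wins; recency only breaks near-ties within eps."""
    best = max(c["score"] for c in candidates)
    tied = [c for c in candidates if best - c["score"] <= eps]
    # ISO-8601 dates compare chronologically as strings
    return max(tied, key=lambda c: c["valid_from"])
```

A clearly better match still wins regardless of age; only scores within `eps` of the best fall back to `valid_from`, which is exactly the "tie-breaker, not a ranking strategy" behaviour.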
A practical question about retrieval design for long-running agents.
Most RAG setups treat stored knowledge as static — you embed it once, retrieve by similarity, done. But agents that run across weeks or months accumulate facts that expire or contradict later information. A few scenarios:
- User says "my budget is £5k" in January, then "budget increased to £10k" in March. Both are in the index. Which surfaces on retrieval?
- A stored fact references a product version that no longer exists. No explicit signal it is outdated; the embedding still matches.
- Seasonal or time-sensitive context ("I am preparing for a conference next week") that is meaningless three weeks later.
Curious how people are handling this in practice with LlamaIndex. A few approaches I have seen:
Is there a recommended pattern here, or is this mostly left to the application layer? Would be useful to understand what primitives LlamaIndex exposes that are relevant.