
Write Pipeline

When you call memory.write(), the SDK reads a natural language message and automatically extracts who and what was mentioned, figures out if it's new or updated information, and stores it as structured, versioned facts - all in one call.

You don't need to understand the internals to use it. Just call write() and check result.facts_added. This page explains what happens under the hood, for when you want to tune behavior or debug results.

flowchart LR
    A["Message"] --> B["Alias Lookup +\nPre-retrieval"]
    B --> C["Informed\nExtraction"]
    C --> D["Resolve Entities"]
    D --> E["Upsert"]
    E --> F["WriteResult"]

Overview

Every memory.write(agent_id, message, speaker_name) call runs these steps (see also the optional occurred_at parameter in Configuration):

  1. Guard - Empty messages return immediately. No event, no LLM call, no tokens consumed.
  2. Log the event - The raw message is saved as an immutable audit trail (never modified or deleted).
  3. Detect emotion - Classifies the message's emotion, intensity, and energy level.
  4. Alias lookup - Scans the message for known entity names using word-boundary matching against the alias cache (no LLM call).
  5. Pre-retrieval / Profile load - For each recognized entity, fetches existing knowledge: entity profiles (if available) or top-K existing facts via pgvector embedding similarity (no LLM call).
  6. Informed extraction - A single LLM call receives the message, speaker context, and existing knowledge context. Returns entities, facts (with action NEW/UPDATE and importance category), and relations in one JSON response.
  7. Resolve entities - Deduplicates mentions ("Ana", "my wife Ana", "Aninha") into a single canonical entity.
  8. Upsert - Saves facts, entity links, relationships, and updated entity profiles to the database.

Each stage is independently fail-safe: if informed extraction fails, the pipeline falls back to the legacy blind extraction + reconciliation flow. If upsert fails for one fact, the others proceed normally.

Legacy fallback

If informed extraction fails (LLM timeout, invalid JSON, etc.), the pipeline automatically falls back to the old flow: entity scan (1 LLM call) + fact extraction + relation extraction (2 concurrent LLM calls) + reconciliation (1 LLM call per ambiguous fact). This ensures no data is lost even when the new path encounters an error.


Stage 1: Memory-Aware Extraction

In plain English: Before calling the LLM, the SDK checks what it already knows about the entities in the message. It loads existing entity profiles or recent facts, then sends everything to the LLM in a single call: "Here's the message, here's the speaker, here's when it was sent, and here's what we already know. Extract all the facts." The LLM returns structured data -- entities, facts, relations, and updated profiles -- and the system handles deduplication downstream.

The extraction stage uses a memory-aware approach (informed extraction) by default. Instead of extracting blindly, the pipeline first gathers existing knowledge and passes it as context, so the LLM can extract with full awareness of what is already stored. The LLM is instructed to extract ALL factual information from the message, including specific details like proper nouns, book titles, country names, dates, and numbers. Deduplication is handled downstream by the reconciliation step (ADD/UPDATE/NOOP/DELETE decisions), not by asking the LLM to filter. This approach maximizes recall -- the LLM captures everything, and the system decides what is new.

How It Works (Informed Extraction -- Default)

Informed extraction runs 1 LLM call, preceded by two zero-cost lookup steps:

  1. Alias lookup (no LLM) -- Scans the message for known entity names using word-boundary matching against the alias cache (MemoryEntityAlias). This identifies which entities the message is about before any LLM call.
  2. Pre-retrieval / Profile load (no LLM) -- For each recognized entity, loads context:
    • If the entity has a profile_text (see Entity Profiles below), the profile is used as context.
    • Otherwise, fetches the top-K existing facts for that entity via pgvector embedding similarity.
    • The total context is capped at informed_extraction_context_budget_tokens to avoid prompt bloat.
  3. Informed extraction (1 LLM call) -- The LLM receives the message, speaker context, temporal context (the occurred_at timestamp or current time, used to resolve relative references like "yesterday" or "last week"), and existing knowledge context. It returns a single JSON with:
    • Entities -- with aliases, as before.
    • Facts -- each annotated with an action (NEW or UPDATE) and an importance_category.
    • Relations -- between entities.
    • Updated profiles -- concise entity summaries reflecting the new information (see Entity Profiles).

The LLM is instructed to extract ALL factual information from the message, preserving specific details (proper nouns, titles, place names, dates, numbers). Facts marked UPDATE indicate a change to existing knowledge; facts marked NEW are genuinely novel information. The reconciliation step (ADD/UPDATE/NOOP/DELETE) handles deduplication downstream, ensuring nothing is lost even when the LLM re-extracts something already known.
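The zero-cost alias lookup that precedes all of this can be sketched as a word-boundary regex scan. This is a minimal illustration only — the cache structure and function name here are hypothetical, not the SDK's actual internals:

```python
import re

# Hypothetical alias cache: surface form -> canonical entity key
ALIAS_CACHE = {
    "Clara Rezende": "person:clara_rezende",
    "Vertix": "organization:vertix",
}

def alias_lookup(message: str) -> dict:
    """Return {alias: entity_key} for every cached alias found as a whole word."""
    hits = {}
    for alias, key in ALIAS_CACHE.items():
        # \b word boundaries prevent partial matches (e.g. "Ana" inside "banana")
        if re.search(rf"\b{re.escape(alias)}\b", message, re.IGNORECASE):
            hits[alias] = key
    return hits

hits = alias_lookup("Clara Rezende saiu da Vertix e foi pra Orion Tech.")
```

Only the entities found here get their profiles or top-K facts loaded in the pre-retrieval step, which is what keeps both lookups LLM-free.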

Importance categories: Each fact receives a semantic importance_category from the LLM, which is mapped to a numeric importance value via the IMPORTANCE_CATEGORY_MAP:

| Category | Importance | Example |
| --- | --- | --- |
| biographical_milestone | High | "Graduated from MIT", "Got married" |
| relationship_change | High | "Started dating Ana", "Left Acme Corp" |
| stable_preference | Medium | "Prefers Python over Java" |
| specific_event | Medium | "Went to a concert last Friday" |
| routine_activity | Low | "Goes to the gym on Mondays" |
| conversational | Low | "Said they're tired today" |

This replaces the flat 0.5 default importance: facts are born with a semantically grounded importance score instead of being uniformly scored and waiting for background jobs to adjust them.
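A sketch of how such a mapping behaves — the numeric values below are illustrative assumptions that only encode the High/Medium/Low ordering from the table, not the SDK's actual IMPORTANCE_CATEGORY_MAP values:

```python
# Illustrative category -> importance mapping (values are assumptions,
# chosen only to preserve the High > Medium > Low ordering).
IMPORTANCE_CATEGORY_MAP = {
    "biographical_milestone": 0.9,
    "relationship_change": 0.9,
    "stable_preference": 0.6,
    "specific_event": 0.6,
    "routine_activity": 0.3,
    "conversational": 0.3,
}

def importance_for(category: str) -> float:
    # Unknown or missing categories fall back to the old flat default
    return IMPORTANCE_CATEGORY_MAP.get(category, 0.5)
```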

Fallback: Blind Extraction (Legacy)

If informed extraction fails (LLM timeout, invalid JSON, rate limit), the pipeline falls back to the legacy flow:

  1. Entity scan (1 LLM call) -- Identify all entities mentioned in the message
  2. Fact extraction + Relation extraction (2 concurrent LLM calls) -- via asyncio.gather()
  3. Reconciliation (see Stage 3) -- Compares each fact against existing knowledge

This fallback ensures no data is lost. The event is still logged, and the legacy path produces the same end result -- just with more LLM calls and without the duplicate-elimination benefits of informed extraction.

Relation extraction in the fallback includes an automatic retry: if the LLM returns 0 relations but 2+ entities were found, the SDK retries the relation call once before accepting an empty result. When verbose=True, the trace includes relation_retry_triggered.

Subject-centric extraction: Facts are extracted from the perspective of the primary subject only. "Carlos lives in Curitiba" is a fact about Carlos - the system does NOT also create "Curitiba is where Carlos lives" as a separate fact. The relationship Carlos → lives_in → Curitiba + entity links handle cross-entity retrieval.

Semantic dedup: After extraction (both informed and fallback), facts are compared pairwise by embedding cosine similarity. Near-duplicates (> 0.85 similarity) are removed, keeping the first occurrence. This eliminates cross-entity reformulations that the LLM sometimes produces despite prompt instructions.
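The dedup pass can be sketched in a few lines. This toy version uses hand-made 2-D vectors in place of real pgvector embeddings; the keep-first-occurrence logic is the point:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def dedup(facts, embeddings, threshold=0.85):
    """Drop any fact whose embedding is a near-duplicate (> threshold)
    of an already-kept fact; the first occurrence wins."""
    kept, kept_vecs = [], []
    for fact, vec in zip(facts, embeddings):
        if all(cosine(vec, kv) <= threshold for kv in kept_vecs):
            kept.append(fact)
            kept_vecs.append(vec)
    return kept

facts = ["Ana lives in Rio", "Rio is where Ana lives", "Ana likes jazz"]
vecs = [(1.0, 0.0), (0.98, 0.2), (0.0, 1.0)]  # toy 2-D "embeddings"
```

Here the second fact is a cross-entity reformulation of the first (cosine ≈ 0.98 with these toy vectors) and is removed, while the unrelated third fact survives.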

Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| extraction_timeout_sec | 30.0 | Timeout per LLM call |
| enable_informed_extraction | True | Enable memory-aware informed extraction. When False, always uses the legacy blind extraction + reconciliation flow |
| informed_extraction_topk | 10 | Number of existing facts to retrieve per entity during pre-retrieval (when no entity profile is available) |
| informed_extraction_context_budget_tokens | 800 | Maximum token budget for the existing knowledge context passed to the informed extraction LLM call |

What Gets Extracted

For each message, the extraction stage produces:

  • Entities - Named things: people, organizations, places, concepts, etc.
  • Facts - Self-contained statements about entities in natural language (e.g., "Fernanda Lima is a software engineer", "Marcos Tavares lives in Porto Alegre"). Each fact text always includes the entity name - never just "is a software engineer" without a subject. Every relationship also generates a corresponding fact - so "Sarah is my wife" produces both a relation (user → spouse_of → sarah) and a fact ("Sarah is user's wife"). Duplicate facts (same subject + same text, ignoring punctuation) are automatically removed post-extraction.
  • Relations - Connections between entities (e.g., "Rafael" → works_at → "Acme Corp"). Relations serve as graph edges for traversal; the paired fact makes the information searchable via text/embedding.
  • Updated profiles (informed extraction only) - Concise entity summaries reflecting the new information. See Entity Profiles.

Each fact includes a confidence level:

| Level | Score | Example |
| --- | --- | --- |
| Explicit statement | 0.95 | "I live in São Paulo" |
| Strong inference | 0.80 | "We went to the São Paulo office" (implies location) |
| Weak inference | 0.60 | Contextual implication |
| Speculation | 0.40 | Uncertain information |

How confidence works in practice

Confidence is assigned by the LLM during extraction based on how the information was stated. Direct statements ("I live in SP") get high confidence; hedged statements ("I think maybe...") get lower confidence. You cannot set confidence directly - it's inferred. You can filter low-confidence facts at retrieval time using min_confidence in MemoryConfig (default 0.55).
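The retrieval-time filter amounts to a simple threshold. A minimal sketch (the fact records here are hypothetical dictionaries, not the SDK's actual model objects):

```python
# Hypothetical fact records; the real SDK applies min_confidence
# (MemoryConfig default 0.55) at retrieval time.
facts = [
    {"text": "I live in São Paulo", "confidence": 0.95},  # explicit statement
    {"text": "might move to Rio",   "confidence": 0.40},  # speculation
]

def filter_by_confidence(facts, min_confidence=0.55):
    """Keep only facts at or above the confidence floor."""
    return [f for f in facts if f["confidence"] >= min_confidence]
```

With the default floor of 0.55, explicit statements and strong inferences pass through, while speculation (0.40) is filtered out.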

Walkthrough: how informed extraction works

Message: "Clara Rezende saiu da Vertix e foi pra Orion Tech como head de engenharia. O Thiago Nogueira a contratou pessoalmente."

Step 1 - Alias lookup (no LLM):

Scans message against alias cache.
Matches: "Clara Rezende" → person:clara_rezende, "Vertix" → organization:vertix
No match: "Orion Tech", "Thiago Nogueira" (new entities)

Step 2 - Pre-retrieval / Profile load (no LLM):

person:clara_rezende has profile_text → loaded as context
organization:vertix has no profile → top-10 facts fetched via pgvector
Existing context:
  [Clara Rezende profile] "Software engineer, previously at Vertix since 2023."
  [Vertix fact] "Vertix is a SaaS startup in Curitiba" (0.95)

Step 3 - Informed extraction (1 LLM call):

LLM receives: message + speaker context + existing knowledge
Returns:
  Entities: [Clara Rezende (person), Vertix (organization), Orion Tech (organization), Thiago Nogueira (person)]
  Facts:
    [Clara Rezende] "Clara Rezende left Vertix" (0.95, action=NEW, category=relationship_change)
    [Clara Rezende] "Clara Rezende joined Orion Tech as head of engineering" (0.95, action=NEW, category=biographical_milestone)
    [Thiago Nogueira] "Thiago Nogueira personally hired Clara Rezende" (0.95, action=NEW, category=specific_event)
  Relations:
    Clara Rezende → former_employee_of → Vertix
    Clara Rezende → works_at → Orion Tech
    Thiago Nogueira → hired → Clara Rezende
  Updated profiles:
    Clara Rezende → "Software engineer. Left Vertix, joined Orion Tech as head of engineering."
Note: "Clara Rezende is a software engineer" was NOT re-extracted because the LLM saw it in the existing profile.

Step 4 - Semantic dedup: No near-duplicates found (all facts are distinct). 3 facts pass through.

Result: 4 entities, 3 facts, 3 relations, 1 updated profile. Total: 1 LLM call (vs 3 in legacy mode).

Walkthrough: how fallback (blind) extraction works

If informed extraction fails (e.g., LLM returns invalid JSON), the pipeline falls back to the legacy flow:

Step 1 - Entity scan (1 LLM call):

Entities: [Clara Rezende (person), Vertix (organization), Orion Tech (organization), Thiago Nogueira (person)]

Step 2a - Fact extraction (1 LLM call, all entities):

Facts:
  [Clara Rezende] "Clara Rezende left Vertix" (0.95)
  [Clara Rezende] "Clara Rezende joined Orion Tech as head of engineering" (0.95)
  [Thiago Nogueira] "Thiago Nogueira personally hired Clara Rezende" (0.95)

Step 2b - Relation extraction (1 LLM call, concurrent with 2a):

Relations:
  Clara Rezende → former_employee_of → Vertix
  Clara Rezende → works_at → Orion Tech
  Thiago Nogueira → hired → Clara Rezende

Step 3 - Semantic dedup: No near-duplicates found (all facts are distinct). 3 facts pass through.

Result: 4 entities, 3 facts, 3 relations. Total: 3 LLM calls. Then proceeds to reconciliation.

Alias Grouping & Subject Normalization

When the same entity is mentioned by multiple names in a single message (e.g., "my friend Guili (Guilherme Maturana)"), the extraction groups them into a single entity with aliases instead of creating duplicates.

The LLM is instructed to pick one canonical name (usually the most complete) and list the others as aliases:

{
  "entities": [
    {"name": "Guilherme Maturana", "type": "person", "aliases": ["Guili"]}
  ]
}

After extraction, a subject normalization pass rewrites any fact or relation that references an alias to use the canonical name instead. Identity relations (e.g., same_as between an alias and its canonical name) are removed automatically since they become self-referencing after normalization.

This eliminates intra-message entity duplication at the source - before entity resolution even runs.
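The normalization pass can be sketched as follows — a simplified assumption-laden version that only rewrites fact subjects and relation endpoints (the function name and record shapes are hypothetical):

```python
def normalize_subjects(facts, relations, canonical, aliases):
    """Rewrite alias references to the canonical name, then drop
    identity relations that have become self-referencing."""
    canon = lambda name: canonical if name in aliases else name
    facts = [{**f, "subject": canon(f["subject"])} for f in facts]
    kept_relations = []
    for src, rel_type, tgt in relations:
        src, tgt = canon(src), canon(tgt)
        if src == tgt:  # e.g. same_as(alias, canonical) after rewriting: drop
            continue
        kept_relations.append((src, rel_type, tgt))
    return facts, kept_relations

facts = [{"subject": "Guili", "text": "Guili works at Acme"}]
relations = [("Guili", "same_as", "Guilherme Maturana"),
             ("Guili", "works_at", "Acme")]
facts, relations = normalize_subjects(
    facts, relations, "Guilherme Maturana", {"Guili"})
```

The same_as relation disappears because both endpoints resolve to "Guilherme Maturana", while the works_at relation survives with its source rewritten.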

Entity Types

Entity types are free-form strings - the LLM chooses the most appropriate type for each entity. Common types include person, organization, place, product, event, concept, pet, but any descriptive type is accepted. Types are normalized to lowercase during entity resolution (e.g., "Person" → "person", "PRODUCT" → "product").

The extraction prompt instructs the LLM to classify types carefully - for example, cities are place, companies are organization, software products are product.

Language of Generated Content

All LLM-generated content -- entity profiles, summaries, cluster summaries, meta-observations, and procedural directives -- is always produced in English, regardless of the language of the input message. This ensures consistency in internal representations across multilingual conversations.

Facts remain in the original conversation language. A message in Portuguese produces fact texts in Portuguese (e.g., "Pedro mora em Porto Alegre"), but the entity profile for Pedro will be written in English (e.g., "Software engineer based in Porto Alegre, married to Ana."). This is by design: facts are verbatim extractions from the conversation, while generated content is internal metadata that benefits from a single canonical language.

Fail-safe Behavior

If an LLM call fails (timeout, invalid JSON, rate limit), the extraction returns an empty result rather than raising an exception. The event is still logged - no data is lost. The next message may capture the same information.

Detecting timeouts

When extraction times out, the result is indistinguishable from "message had no extractable content" - 0 entities, 0 facts, no exception. To detect timeouts, compare the extraction duration_ms in the trace against your configured extraction_timeout_sec, or check for 0 entities despite a content-rich message.

Neuroscience parallel

Informed extraction mirrors encoding relative to prior knowledge in human memory. We don't encode new experiences in a vacuum -- the brain's orienting response compares incoming stimuli against existing schemas before committing anything to long-term storage. When you hear something you already know, your hippocampus suppresses re-encoding (repetition suppression). When you hear something genuinely new or contradictory, encoding is enhanced (the novelty/mismatch signal). Informed extraction replicates this: the LLM sees what's already known and only extracts what's new or changed.


Entity Profiles

Entity profiles are concise summaries (~100-300 tokens) that capture what the system knows about an entity. Each entity's profile_text is both input and output of the informed extraction stage, creating a feedback loop that keeps profiles current.

How Profiles Work

  • Input to extraction: When the informed extraction runs, entity profiles are loaded during the pre-retrieval step and injected into the LLM context. When a profile is available, it replaces the individual fact retrieval for that entity -- a single concise summary instead of N separate facts, saving prompt tokens and providing better context.
  • Output from extraction: The LLM returns updated_profiles as part of its response. These reflect the entity's state after incorporating the new information from the message.
  • Persistence: Updated profiles are saved to the memory_entities table (profile_text column) in the same database transaction as the facts and relations. The profile_refreshed_at timestamp is updated to track freshness.
  • Cold start / Seeding: On the first message about an entity, there is no pre-existing profile. The informed extraction creates an initial (seed) profile from the facts it extracts. The write pipeline only seeds profiles for entities that do not already have one -- it never overwrites an existing profile. The authoritative source for comprehensive, up-to-date profiles is the consolidate_entity_profiles() background job, which reads ALL facts per entity and generates a thorough profile covering every major aspect.

Profiles vs Summaries

Entity profiles (profile_text) and entity summaries (summary_text) coexist but serve different purposes:

| | Profile (profile_text) | Summary (summary_text) |
| --- | --- | --- |
| When updated | During the write pipeline (synchronous) | By background jobs (asynchronous) |
| Scope | Concise, ~100-300 tokens | More comprehensive |
| Used by | Informed extraction (write pipeline only) | Background jobs, importance scoring |
| Freshness | Always reflects the latest write | May lag behind recent writes |

Profiles and Retrieval

Entity profiles are internal to the write pipeline only. They are used as context for informed extraction (so the LLM knows what the system already knows about an entity), but they are NOT injected into retrieval output. The read pipeline formats facts, meta-observations, and events directly -- profiles do not appear in the context string returned by retrieve().


Stage 2: Entity Resolution

In plain English: When someone says "Ana", "my wife Ana", and "Aninha" in different messages, they're all talking about the same person. This stage figures that out and links everything to one canonical entity - so you don't end up with three separate "Ana" records in the database.

Three-Phase Resolution

flowchart LR
    A["Entity name"] --> B{"Exact match?"}
    B -->|Yes| F["Resolved"]
    B -->|No| C{"Fuzzy match?"}
    C -->|"≥ 0.85"| F
    C -->|"0.50–0.85"| D{"LLM decides"}
    C -->|"< 0.50"| E["Create new entity"]
    D -->|Match| F
    D -->|No match| E
    E --> F

Phase 1: Exact match

Checks the alias cache, entity slugs, and display names. Instant, no LLM call.

Includes prefix/diminutive matching for person entities: "Carol" matches "Carolina" (minimum 3 characters). Note: "Jo" will NOT match "João" (< 3 chars). "Bob" will match "Roberto" only if registered as an alias, not via prefix matching.
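The prefix rule above can be sketched as (a simplified illustration; the real matcher's name and exact normalization are internal to the SDK):

```python
def prefix_match(name: str, candidate: str, min_len: int = 3) -> bool:
    """Diminutive matching for person entities: the shorter name must be
    at least min_len characters and a prefix of the longer one."""
    shorter, longer = sorted((name.lower(), candidate.lower()), key=len)
    return len(shorter) >= min_len and longer.startswith(shorter)
```

"Carol"/"Carolina" matches; "Jo"/"João" fails the length floor; "Bob"/"Roberto" fails because "Bob" is not a prefix, so it only resolves via a registered alias.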

Phase 2: Fuzzy match

Uses embedding cosine similarity (in-memory) to find candidates:

  • ≥ fuzzy_threshold (default 0.85) - High-confidence match, resolves directly
  • 0.50 to fuzzy_threshold - Ambiguous, forwards top-3 candidates to Phase 3 (LLM)
  • < 0.50 - No match, creates a new entity

Lowering fuzzy_threshold expands the fuzzy-resolve range and reduces LLM calls. For example, setting fuzzy_threshold=0.50 eliminates the ambiguous range entirely - everything above 0.50 resolves directly.

Falls back to difflib.SequenceMatcher when embeddings are unavailable.
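The Phase 2 routing reduces to two thresholds. A sketch (function and label names are illustrative, not the SDK's API):

```python
def route_candidate(similarity: float, fuzzy_threshold: float = 0.85) -> str:
    """Decide where a candidate with this cosine similarity goes next."""
    if similarity >= fuzzy_threshold:
        return "resolve"      # high confidence: resolves directly, no LLM
    if similarity >= 0.50:
        return "llm"          # ambiguous band: forwarded to Phase 3
    return "new_entity"       # no match: create a new entity
```

Note how lowering fuzzy_threshold to 0.50 collapses the ambiguous band: every candidate at 0.50 or above resolves directly, and Phase 3 is never reached.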

Phase 3: LLM fallback

Sends ambiguous candidates to the injected LLMProvider for disambiguation. The LLM sees the entity name, the candidates, and decides which (if any) is a match.

Walkthrough: how entity resolution works

Message: "Talked to Guili about the project. Guilherme said it's on track."

  1. Extract: Two names found: "Guili" and "Guilherme"
  2. Phase 1 (exact): "Guilherme" matches existing entity person:guilherme_maturana
  3. Phase 2 (fuzzy): "Guili" has 0.87 cosine similarity with "Guilherme" → auto-resolves
  4. Result: Both names resolve to the same entity. Alias "Guili" registered.

Next time "Guili" appears, Phase 1 catches it instantly via the alias cache - no fuzzy or LLM call needed.

Special Cases

  • Speaker pronouns - "I", "me", "eu", "myself" automatically resolve to the speaker entity (person:{speaker_slug}). For example, if speaker_name="Rafael", these pronouns resolve to person:rafael.
  • Relationship terms - "girlfriend", "brother", "amigo" resolve to the speaker entity when the bare word is the entity name (not "my girlfriend Ana" - there "Ana" is the entity). The match triggers when the entity name itself is a relationship term, not the full phrase.
  • Relational hints - "Carol (Rafael's girlfriend)" strips the hint and forces type="person"

Alias Registration

When a new alias is discovered (e.g., "Aninha" resolves to person:ana), it's registered in MemoryEntityAlias with first-write-wins semantics - concurrent writes won't create conflicting aliases. Aliases are scoped per agent_id: the same alias can map to different entities for different agents.

Extraction-provided aliases are also registered automatically: when entity resolution creates a new entity that has aliases from extraction (e.g., "Guili" for "Guilherme Maturana"), all aliases are registered in MemoryEntityAlias and added to the in-memory alias cache. This means subsequent entities in the same batch can immediately resolve via Phase 1 exact match - no fuzzy or LLM calls needed.

This creates a two-line defense against duplicates:

  1. Intra-message - Alias grouping in extraction prevents duplicates within a single message
  2. Cross-message - Registered aliases enable exact match in future messages (e.g., if message 1 creates "Guilherme Maturana" with alias "Guili", message 2 mentioning "Guili" resolves instantly via Phase 1)

Entity Persistence

After entity resolution completes, the pipeline ensures every resolved entity has a row in the memory_entities table - not just newly created ones. Entities resolved via exact match, fuzzy match, or LLM disambiguation are also upserted using ON CONFLICT DO UPDATE (idempotent).

This is critical because background jobs (importance scoring, summary refresh, spreading activation) read from memory_entities. Without a row, these jobs are blind to the entity and can't operate on it.

The entity upsert is fail-safe: if one entity fails to persist (e.g., constraint violation), the others proceed normally and the pipeline continues.

Configuration

| Parameter | Default | Description |
| --- | --- | --- |
| fuzzy_threshold | 0.85 | Cosine similarity threshold for direct fuzzy match |
| enable_llm_resolution | True | Whether to use LLM for ambiguous cases. When False, ambiguous candidates create a new entity instead of calling the LLM. |

Model selection

The LLM model used for entity resolution is determined by the LLMProvider you inject into MemoryClient. To use a different model for resolution vs. extraction, inject different providers.

Neuroscience parallel

Entity resolution mirrors associative memory - the brain's ability to link new stimuli to existing representations. Hearing "Carol" activates the neural pattern for "Carolina" through pattern completion, just as fuzzy matching activates candidate entities through embedding similarity.


Stage 3: Reconciliation

In plain English: If the user said "I live in Sao Paulo" last week and now says "I moved to Rio", the system needs to figure out that this is an update, not a second home. This stage compares each new fact against what's already stored and decides: is this new info? An update to something existing? Already known? Or a retraction?

Reconciliation always runs

Reconciliation runs unconditionally, regardless of whether extraction was informed or blind. Informed extraction provides richer context that improves fact quality, but the decision of what to do with each fact (ADD, UPDATE, NOOP, DELETE) is always made by reconcile_facts(). This ensures contradictory facts are detected and invalidated.

Decision Logic

For each extracted fact, the reconciler:

  1. Fetches existing facts for the same entity
  2. Computes similarity between the new fact and each existing fact (via embeddings)
  3. Decides the action:
| Action | When | Example |
| --- | --- | --- |
| ADD | New information, no similar existing fact (similarity < 0.50) | "speaks French" when no language fact exists |
| UPDATE | Supersedes an existing fact (similarity ≥ 0.50, LLM-decided) | "lives in Rio" supersedes "lives in São Paulo" |
| NOOP | Already known (high similarity) | "works at Acme" when this fact already exists |
| DELETE | Explicitly retracts a fact | "I no longer work at Acme" |

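The routing between the fast and slow path hinges on a single similarity threshold. A sketch (the function name is illustrative; in the real pipeline the slow path invokes the reconciliation LLM):

```python
def reconcile_route(max_similarity: float) -> str:
    """Fast path: below 0.50 the fact auto-ADDs with no LLM call.
    Slow path: at or above 0.50 an LLM decides ADD/UPDATE/NOOP/DELETE."""
    return "ADD" if max_similarity < 0.50 else "llm_decides"
```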
Walkthrough: how reconciliation decides

Scenario: User previously said "Ricardo lives in São Paulo". Now says "Ricardo moved to Austin, Texas."

Step 1 - Fetch existing facts for Ricardo:

Existing: "Ricardo Gomes lives in São Paulo" (confidence: 0.95, active)

Step 2 - Compute similarity:

New fact: "Ricardo Gomes moved to Austin, Texas"
vs existing: "Ricardo Gomes lives in São Paulo"
Cosine similarity: 0.72 (both about Ricardo's location)

Step 3 - Similarity ≥ 0.50 → slow path (LLM call): The LLM sees both facts and decides: this is an UPDATE. The user moved.

Result:

Old fact: "Ricardo Gomes lives in São Paulo" → valid_to = now, invalidated_at = now
New fact: "Ricardo Gomes moved to Austin, Texas" → supersedes_fact_id = old_fact.id
Relationship: ricardo → lives_in → sao_paulo → INVALIDATED (cascade)
New relationship: ricardo → lives_in → austin

If the similarity had been < 0.50 (e.g., "Ricardo likes jazz"), it would auto-ADD without an LLM call - fast path.

Reconciliation Performance

  • Fast path (similarity < 0.50): Auto-ADD without LLM call (~300ms). This is the common path for novel information.
  • Slow path (similarity ≥ 0.50): LLM evaluates whether to ADD, UPDATE, DELETE, or NOOP (~2-3s). This requires an LLM call with full context.

Plan accordingly: bulk imports of new data are fast; updates to existing knowledge require LLM decision-making.

UPDATE chains may branch

The reconciliation LLM may choose ADD over UPDATE when it interprets new information as distinct rather than a replacement. For example, "I moved to BH" might create separate facts for "lives in BH" and "used to live in RJ" instead of a simple update chain. This preserves more information but may break the supersedes_fact_id chain. This is expected behavior - the LLM prioritizes information preservation.

Fail-safe Behavior

If the reconciliation LLM call fails, the system defaults to ADD - it's better to have a near-duplicate than to lose information. The background consolidation jobs (clustering, deduplication) clean up duplicates later.

Fact Versioning

Facts are versioned using temporal validity windows (valid_from, valid_to):

  • Active facts have valid_to = NULL
  • Updated facts get both valid_to and invalidated_at set, and a new fact is created with supersedes_fact_id pointing to the old one
  • Deleted facts get both valid_to and invalidated_at set

This enables time-travel queries: you can ask what the system knew at any point in time.
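A time-travel query over these validity windows can be sketched as (the record shapes are hypothetical; the real facts live in the database and would be filtered in SQL):

```python
from datetime import datetime, timezone

def known_at(facts, when):
    """Facts whose validity window [valid_from, valid_to) contains `when`.
    Active facts have valid_to = None (open-ended)."""
    return [
        f for f in facts
        if f["valid_from"] <= when
        and (f["valid_to"] is None or when < f["valid_to"])
    ]

t1 = datetime(2024, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2024, 6, 1, tzinfo=timezone.utc)  # moment of the UPDATE
facts = [
    {"text": "lives in São Paulo", "valid_from": t1, "valid_to": t2},
    {"text": "lives in Rio",       "valid_from": t2, "valid_to": None},
]
```

Querying with a March timestamp returns the São Paulo fact; querying with a July timestamp returns the Rio fact - the superseded version is never deleted, just closed.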

Neuroscience parallel

Reconciliation mirrors reconsolidation - the process by which retrieved memories become labile and can be modified. When you recall a memory ("lives in São Paulo") and encounter new information ("just moved to Rio"), the original memory is updated. The brain doesn't simply overwrite - it creates a new trace linked to the original, just as UPDATE creates a new fact with supersedes_fact_id.


Stage 4: Upsert

In plain English: This is where the decisions from the previous stage are actually saved to the database. New facts are inserted, outdated facts are marked as superseded, and relationships between entities are created or strengthened. Everything runs inside a transaction - if one fact fails to save, the others still go through.

| Decision | Database action |
| --- | --- |
| ADD | Create new MemoryFact with embedding |
| UPDATE | Close old fact (valid_to = now), create new one with supersedes_fact_id |
| NOOP | Update last_confirmed_at on existing fact |
| DELETE | Close fact (valid_to = now, invalidated_at = now) |

After each fact is persisted (ADD or UPDATE), the pipeline creates entity links connecting the fact to every entity it mentions - not just its primary subject. This enables cross-entity retrieval without duplicating facts.

For example, "Clara Rezende left Vertix" is stored once with entity_key = person:clara_rezende (primary subject). But entity links are created for both person:clara_rezende (primary) and organization:vertix (secondary). When you query about Vertix, the system finds this fact via the link - no duplicate fact needed.

Links are created by matching entity display names against the fact text (case-insensitive substring match). Very short names (< 3 characters) are skipped to avoid false positives. Link creation is fail-safe: if it fails, the fact persists normally.
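The link-creation heuristic can be sketched as (a simplified assumption-based version; the real SDK persists links as database rows rather than tuples):

```python
def build_links(fact_text, primary_key, entity_map):
    """Link a fact to every entity whose display name appears in its text.
    Case-insensitive substring match; names under 3 chars are skipped
    to avoid false positives."""
    links = []
    lowered = fact_text.lower()
    for display_name, entity_key in entity_map.items():
        if len(display_name) < 3:
            continue
        if display_name.lower() in lowered:
            links.append((entity_key, entity_key == primary_key))
    return links

entity_map = {
    "Clara Rezende": "person:clara_rezende",
    "Orion Tech": "organization:orion_tech",
    "Li": "person:li",  # too short: never substring-matched
}
links = build_links("Clara Rezende joined Orion Tech as head of engineering",
                    "person:clara_rezende", entity_map)
```

The second element of each tuple marks whether the link is primary (the fact's subject) or secondary (a mentioned entity), mirroring the is_primary flag in the walkthrough below.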

Walkthrough: how upsert + entity links work

Fact to persist: "Clara Rezende joined Orion Tech as head of engineering" (decision: ADD)

Step 1 - Create MemoryFact:

id: fact_abc123
entity_key: person:clara_rezende (primary subject)
fact_text: "Clara Rezende joined Orion Tech as head of engineering"
confidence: 0.95
valid_from: now
valid_to: NULL

Step 2 - Create entity links:

Entity map has: {Clara Rezende → person:clara_rezende, Orion Tech → organization:orion_tech, ...}

Scan fact_text for entity display names:
  "Clara Rezende" found → link (fact_abc123, person:clara_rezende, is_primary=true)
  "Orion Tech" found → link (fact_abc123, organization:orion_tech, is_primary=false)

Step 3 - Persist relationship:

clara_rezende → works_at → orion_tech (strength: 0.8)
Evidence: fact_abc123 (matched because fact_text mentions both "Clara Rezende" and "Orion Tech")

Result: 1 fact, 2 entity links, 1 relationship. When someone asks "Who works at Orion Tech?", the system finds this fact via the organization:orion_tech link - without needing a separate fact about Orion Tech.

Relationship Tracking

During upsert, extracted relationships are also persisted:

  • Creates/updates MemoryEntityRelationship records
  • Resolves source and target entities via the entity map
  • Strength reinforcement: repeated relationships increase strength (initial: 0.8, reinforced up to 1.0 across multiple messages)
  • Self-referencing filter: Relations where source and target resolve to the same entity (e.g., "caroline child_of caroline" after entity resolution) are silently filtered out. These typically arise from extraction artifacts or alias normalization.
  • Uses ON CONFLICT DO UPDATE for idempotent upserts

Relationships are unidirectional

Writing "Ana works at Acme" creates ana → works_at → acme_corp, but not acme_corp → employs → ana. This means graph retrieval starting from "Acme Corp" won't find Ana through relationships (but may still find her via semantic similarity). To create both directions, mention them explicitly: "Ana works at Acme. Acme has Ana as a data scientist."

Evidence Linkage & Cascade Invalidation

The problem: Without linkage between facts and relationships, contradictory edges accumulate. If a user says "I live in Curitiba" and later "I moved to São Paulo", the old relationship user --[lives_in]--> curitiba would remain active alongside the new one - polluting retrieval with stale context.

The solution: Each relationship is linked to the fact that supports it via evidence_fact_id. When that fact is superseded (UPDATE) or retracted (DELETE), the relationship is automatically invalidated - no manual cleanup needed.

How evidence linkage works:

  1. After facts are persisted in the upsert stage, a heuristic match associates each relationship with a corresponding fact. For a relationship (source, target), the matcher looks for facts whose fact_text mentions both entity names.
  2. If multiple facts match, the one with the highest confidence is selected.
  3. The matched fact's ID is stored as evidence_fact_id on the relationship.
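
The matching heuristic in steps 1–2 might look roughly like this (a sketch over plain dicts; the SDK's actual field access and name normalization may differ):

```python
def match_evidence_fact(facts, source_name: str, target_name: str):
    """Pick the highest-confidence fact whose text mentions both entity names."""
    candidates = [
        f for f in facts
        if source_name.lower() in f["fact_text"].lower()
        and target_name.lower() in f["fact_text"].lower()
    ]
    # Step 2: if multiple facts match, the highest confidence wins.
    return max(candidates, key=lambda f: f["confidence"]) if candidates else None
```
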

When a fact is invalidated (via UPDATE or DELETE), cascade invalidation automatically sets invalidated_at and valid_to on all relationships that reference it. The graph retrieval BFS already filters out invalidated relationships, so stale edges are immediately excluded from context.

User: "I live in Curitiba"
  → fact: "User lives in Curitiba" (fact_1)
  → rel:  user --[lives_in]--> curitiba (evidence_fact_id = fact_1)

User: "I moved to São Paulo"
  → reconciliation: UPDATE fact_1 → fact_2 "User lives in São Paulo"
  → cascade: rel lives_in→curitiba is INVALIDATED (evidence_fact_id = fact_1)
  → new rel: user --[lives_in]--> sao_paulo (evidence_fact_id = fact_2)

Relationship types are dynamic

The rel_type field accepts any descriptive snake_case string - not just a fixed set. Common types include works_at, lives_in, family_of, but the LLM may also produce types like mentored_by or inspired_by. See Dynamic Relationship Types for details on normalization and aliases.

Mirror Facts

Sometimes the LLM infers a relationship from context without extracting a corresponding fact. For example, "I'm going to Curitiba to visit my mom" implies mom --[lives_in]--> curitiba, but the LLM may only extract a fact about the user's trip - not about where mom lives. Without a fact, the relationship can't participate in cascade invalidation and isn't findable via semantic search.

To solve this, a mirror fact is automatically created as a fallback when no heuristic match is found. The mirror fact is a simple natural-language sentence generated from the relationship: "{source_name} {rel_type} {target_name}" (e.g., "Mom lives in Curitiba").
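
Under that template, generating the sentence is a simple string transform. A sketch — the capitalization handling here is assumed, not the SDK's exact rule:

```python
def mirror_fact_text(source_name: str, rel_type: str, target_name: str) -> str:
    """Render '{source_name} {rel_type} {target_name}' as a natural-language fact."""
    verb = rel_type.replace("_", " ")          # lives_in -> "lives in"
    sentence = f"{source_name} {verb} {target_name}"
    return sentence[0].upper() + sentence[1:]  # assumed: capitalize first letter only
```
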

Mirror facts go through the canonical pipeline. Starting in v0.11.7, mirror facts are not persisted directly — they are built as synthetic ExtractedFact objects and routed through the same reconcile_facts → execute_upsert path as facts extracted by the LLM. This guarantees:

  • Semantic dedup: the reconciler catches near-duplicates like "Pedro lives in Brazil" vs "Pedro lives in Brasil" via embedding similarity + LLM reasoning — not just exact string match.
  • Speaker propagation: the speaker field is populated on mirror facts via the same mechanism as regular facts, so provenance is preserved.
  • Single entry point: the only way a MemoryFact is created in the write pipeline is through _add_fact (for ADD) or _update_fact (for versioned UPDATE). No parallel paths.

Mirror facts are marked with:

  • confidence = 0.60 (weak inference - lower priority in retrieval ranking)
  • source_context = "inferred_from_relation" (allows filtering or downranking if needed)

Mirror facts may persist after source invalidation

Mirror facts are not automatically invalidated when the source relationship is removed. They may persist as stale data. Applications should consider filtering by source_context when accuracy is critical.
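
For accuracy-critical paths, filtering by `source_context` is straightforward (a sketch over plain dicts; your fact objects may expose the field differently):

```python
def drop_inferred(facts: list[dict]) -> list[dict]:
    """Remove mirror facts, keeping only facts extracted directly from messages."""
    return [f for f in facts if f.get("source_context") != "inferred_from_relation"]
```
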

The mirror fact's ID is used as the relationship's evidence_fact_id, so cascade invalidation works for inferred relationships too. When the reconciler decides NOOP (an equivalent fact already exists), the existing fact's ID is used as the evidence link instead — strengthening the evidence chain.

Reducing mirror facts

The extraction prompts instruct the LLM to extract implicit facts alongside relationships (e.g., "my mom lives in Curitiba" as a fact, not just a relation). As LLM extraction improves, fewer mirror facts are needed - they're the safety net, not the primary mechanism.

Transaction Safety

The entire write pipeline runs inside a database transaction. Individual fact upserts use savepoints (session.begin_nested()) so that a failure in one fact doesn't abort the entire batch:

# If this fact fails, only this savepoint rolls back
async with session.begin_nested():
    session.add(new_fact)
    await session.flush()

The event record is created and flushed first, so it survives even if all subsequent stages fail.
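
The same savepoint pattern can be demonstrated outside SQLAlchemy with stdlib sqlite3 (illustrative only — the SDK uses session.begin_nested() against Postgres):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, text TEXT UNIQUE)")

def add_fact(text: str) -> bool:
    """Insert one fact inside a savepoint; a failure rolls back only this fact."""
    conn.execute("SAVEPOINT fact_sp")
    try:
        conn.execute("INSERT INTO facts (text) VALUES (?)", (text,))
        conn.execute("RELEASE SAVEPOINT fact_sp")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK TO SAVEPOINT fact_sp")
        conn.execute("RELEASE SAVEPOINT fact_sp")
        return False

# The duplicate insert fails, but the facts around it survive in the same transaction.
results = [add_fact(t) for t in ["fact a", "fact b", "fact a"]]
```
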


WriteResult

After the pipeline completes, you get a WriteResult with full observability:

result = await memory.write(
    agent_id="user_123",
    message="...",
    speaker_name="Rafael",
    recent_messages=["previous message for pronoun resolution"],  # optional
    occurred_at=datetime(2025, 6, 15, tzinfo=UTC),  # optional, for historical imports
)

# What happened
print(result.facts_added)       # List of facts created
print(result.facts_updated)     # List of facts superseded
print(result.facts_unchanged)   # List of confirmed facts (NOOP decisions)
print(result.facts_deleted)     # List of retracted facts (DELETE decisions)
print(result.entities_resolved) # List of resolved entities
print(result.duration_ms)       # Total pipeline time
print(result.event_id)          # Unique event ID for this write
print(result.tokens_used)       # TokenUsage(input_tokens=..., output_tokens=..., total_tokens=...)
print(result.pipeline)          # PipelineTrace (when verbose=True)
print(result.success)           # True if pipeline completed without errors (always check this)
print(result.error)             # Error message if pipeline failed (None on success)

Trace Enrichment (verbose=True)

When verbose=True, the extraction step in PipelineTrace includes additional metadata:

Field                     Type  Description
relation_retry_triggered  bool  Whether the automatic relation retry was used

result = await memory.write(agent_id, message, speaker_name="Rafael", verbose=True)
extraction_step = result.pipeline.steps[0]  # "extraction"
print(extraction_step.data["relation_retry_triggered"])  # True/False

Token Usage

tokens_used reports the total LLM tokens consumed across all calls in the pipeline (extraction, entity resolution, reconciliation). Useful for benchmarking and cost estimation.

result = await memory.write(agent_id, message, speaker_name="Rafael")
print(result.tokens_used.input_tokens)   # e.g. 1200
print(result.tokens_used.output_tokens)  # e.g. 350
print(result.tokens_used.total_tokens)   # e.g. 1550

Token tracking requires provider support

tokens_used is populated from LLMResult.usage returned by your LLMProvider. The built-in OpenAIProvider reports usage automatically. Custom providers that return LLMResult(text=..., usage=None) will show zero tokens.

Config Overrides

Override any MemoryConfig field for a single write() call without creating a new client:

result = await memory.write(
    agent_id="user_123",
    message="...",
    speaker_name="Rafael",
    config_overrides={"extraction_timeout_sec": 60.0},
)

Only the provided keys are changed; all others inherit from the client config. Invalid keys emit a warning and are ignored. Type mismatches raise ValueError.
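
The merge semantics can be modeled roughly like this (a sketch over a plain dict; MemoryConfig's real validation is richer):

```python
import warnings

def apply_overrides(config: dict, overrides: dict) -> dict:
    """Merge per-call overrides: unknown keys warn and are skipped, wrong types raise."""
    merged = dict(config)
    for key, value in overrides.items():
        if key not in config:
            warnings.warn(f"ignoring unknown config key: {key}")
            continue
        if not isinstance(value, type(config[key])):
            raise ValueError(f"type mismatch for {key!r}")
        merged[key] = value
    return merged
```

Note that the base config is copied, not mutated, so the per-call override never leaks into subsequent writes.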

Dry Run

Run extraction without persisting anything to the database:

result = await memory.write(
    agent_id="user_123",
    message="I live in São Paulo with my wife Ana",
    speaker_name="Rafael",
    dry_run=True,
)
# result.facts_added contains what WOULD be extracted
# result.tokens_used shows cost of this extraction
# No event, no facts, no entities persisted

Useful for benchmarking: run the same message with dry_run=True and compare tokens_used across different configurations.


Pipeline Diagram (Complete)

flowchart TD
    MSG["User message"] --> EVT["Create MemoryEvent\n(immutable log + embedding)"]
    EVT --> ALIAS["Alias Lookup\n(word-boundary match on alias cache)"]
    ALIAS --> PRE["Pre-retrieval / Profile Load\n(pgvector top-K or entity profiles)"]
    PRE --> INF{"Informed\nExtraction"}
    INF -->|Success| DEDUP["Semantic Dedup\n(remove near-duplicates)"]
    INF -->|Failure| FALLBACK["Fallback: Blind Extraction\n(Entity Scan → Facts + Relations)"]
    FALLBACK --> DEDUP
    DEDUP --> RES["Entity Resolution\n(exact → fuzzy → LLM)"]
    RES --> REC{"Informed\npath?"}
    REC -->|Yes| UPS["Upsert + Entity Links + Profiles\n(with savepoints)"]
    REC -->|No: fallback| RECONCILE["Reconciliation\n(ADD / UPDATE / NOOP / DELETE)"]
    RECONCILE --> UPS
    UPS --> REL["Relationship Tracking\n(strength reinforcement)"]
    REL --> WR["WriteResult"]