Getting Started

This guide walks you through setting up arandu from scratch: installing dependencies, configuring PostgreSQL with pgvector, writing your first facts, and retrieving them.

Prerequisites

  • Python 3.11+
  • PostgreSQL 15+ with the pgvector extension installed
  • An OpenAI API key (or any LLM/embedding provider - see Custom Providers)

Step 1: Install

pip install arandu[openai]

This installs the core SDK plus the bundled OpenAI provider. If you're using a different LLM provider, install just the core:

pip install arandu

Step 2: Set Up PostgreSQL + pgvector

arandu stores facts, entities, and embeddings in PostgreSQL, using the pgvector extension for vector similarity search.

Option A: Docker (Recommended)

docker run -d \
  --name memory-db \
  -e POSTGRES_USER=memory \
  -e POSTGRES_PASSWORD=memory \
  -e POSTGRES_DB=memory \
  -p 5432:5432 \
  pgvector/pgvector:pg16

The pgvector/pgvector image comes with the extension pre-installed. Your connection string will be: postgresql+psycopg://memory:memory@localhost:5432/memory

psycopg vs psycopg2

Arandu uses psycopg (async driver), not psycopg2 (sync). Your connection string must start with postgresql+psycopg://, not postgresql+psycopg2://. Many Django/Flask tutorials use psycopg2 - make sure you're using the right one.
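If you're migrating from a psycopg2-based setup, a small helper like this (hypothetical, not part of the SDK) can normalize the scheme before you pass the URL to the client:

```python
def normalize_dsn(dsn: str) -> str:
    """Rewrite common PostgreSQL URL schemes to the async psycopg form."""
    for prefix in ("postgresql+psycopg2://", "postgres://", "postgresql://"):
        if dsn.startswith(prefix):
            return "postgresql+psycopg://" + dsn[len(prefix):]
    return dsn  # already in the right form (or not a PostgreSQL URL)

print(normalize_dsn("postgresql+psycopg2://memory:memory@localhost:5432/memory"))
# postgresql+psycopg://memory:memory@localhost:5432/memory
```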

Option B: Existing PostgreSQL

If you already have PostgreSQL running, enable the pgvector extension:

CREATE EXTENSION IF NOT EXISTS vector;

pgvector installation

If you don't have pgvector installed on your server, follow the pgvector installation guide.

Understanding Agent, Session, and Speaker

Every write() call uses three identifiers. Here's what each one does:

agent_id identifies whose memory this is. Think of it as a brain. One agent = one memory space. All facts, entities, and relationships live inside that agent's memory. If you have two chatbots, each one gets its own agent_id and they don't share memories.

speaker_name identifies who is talking. When someone says "I live in São Paulo", the SDK needs to know who "I" is. If the speaker is Rafael, "I" becomes "Rafael lives in São Paulo". Without speaker_name, the SDK doesn't know who "I" refers to and will raise a ValueError.

session_id tags the conversation context. It's optional (defaults to "default"). Use it when you want to track which conversation a message came from. For example, a support ticket, a therapy session, or a meeting.

Memory is NOT separated by session

Changing session_id does not create a separate memory. All facts go into the same agent memory regardless of session. When you call retrieve(), it searches everything the agent knows, across all sessions. The session_id is metadata on the event, not a partition key.

await memory.write(
    agent_id="my-assistant",        # whose memory
    message="I live in São Paulo",
    speaker_name="Rafael",          # who is talking
    session_id="support-ticket-42", # optional context tag
)

For a simple chatbot with one user, you only need agent_id and speaker_name. Add session_id when you want to track where a conversation happened.

Why three separate fields?

A memory system needs to answer three questions: whose brain stores this? (agent), who said it? (speaker), and in what context? (session). Mixing them into a single identifier breaks down when two people talk to the same agent, or the same person has multiple conversations. Separating them keeps the model clean and flexible.
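As a toy model of this separation (the dataclass below is illustrative, not an SDK type), two people can talk to the same agent in the same session and remain distinguishable:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WriteContext:
    agent_id: str                 # whose memory stores this (the brain)
    speaker_name: str             # who said it (resolves "I"/"me")
    session_id: str = "default"   # where the conversation happened (metadata)

# Two people talking to the same agent: same brain, different speakers
a = WriteContext(agent_id="support-bot", speaker_name="Rafael", session_id="ticket-42")
b = WriteContext(agent_id="support-bot", speaker_name="Ana", session_id="ticket-42")
assert a.agent_id == b.agent_id and a.speaker_name != b.speaker_name
```

Collapsing any two of these into one identifier loses exactly the distinction the assertion above relies on.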

Step 3: Initialize the Client

import asyncio
from arandu import MemoryClient, MemoryConfig
from arandu.providers.openai import OpenAIProvider

async def main():
    # Create the LLM + embedding provider
    provider = OpenAIProvider(api_key="sk-...")

    # Create the memory client
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )

    # Create tables (safe to call multiple times)
    await memory.initialize()

    print("Memory initialized!")
    await memory.close()

asyncio.run(main())

Using Anthropic (Claude) instead of OpenAI

from arandu.providers.anthropic import AnthropicProvider
from arandu.providers.openai import OpenAIProvider

llm = AnthropicProvider(api_key="sk-ant-...")  # Claude for LLM
embeddings = OpenAIProvider(api_key="sk-...")   # OpenAI for embeddings only

memory = MemoryClient(
    database_url="postgresql+psycopg://...",
    llm=llm,
    embeddings=embeddings,  # Anthropic doesn't offer embeddings
)

Install with: pip install arandu[anthropic]

Using DeepSeek, Groq, or local models

Any OpenAI-compatible provider works with OpenAIProvider. Just set base_url:

llm = OpenAIProvider(api_key="sk-...", model="deepseek-chat", base_url="https://api.deepseek.com/v1")

See the Cookbook for more examples.

initialize() creates all required tables and indexes (including pgvector HNSW indexes). It's idempotent - safe to call on every startup.

About agent_id

The agent_id is your partitioning key. Each agent_id gets its own isolated memory space - facts written for one agent are never returned for another. Think of it as a brain: one agent, one memory. Use any string (database ID, UUID, slug). The same agent_id must be used in both write() and retrieve() calls for the same agent.

About session_id

The session_id tags the conversation context (default: "default"). Think of it like a WhatsApp chat thread - same agent, different conversations. It is metadata on each event, not a partition: retrieval still searches across all sessions unless you filter by session_id. When not provided, all writes go to the "default" session.

About speaker_name

The speaker_name identifies who is speaking the message. It is a required parameter in write(). Pronouns like "I", "me", "eu", "myself" automatically resolve to the speaker entity (person:{speaker_slug}). For example, if speaker_name="Rafael" and the message says "I live in São Paulo", the fact is attributed to Rafael - not to a generic user:self. Use the speaker's real name (e.g., "Rafael", "Ana").
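The person:{speaker_slug} form can be sketched like this (the slug rule below is a guess for illustration; the SDK's actual slugging may differ):

```python
import re

def speaker_entity_key(speaker_name: str) -> str:
    """Derive a person:{slug} key of the shape the docs describe for speakers.

    Illustrative slug rule: lowercase, collapse non-word runs to underscores.
    """
    slug = re.sub(r"\W+", "_", speaker_name.strip().lower()).strip("_")
    return f"person:{slug}"

# "I live in São Paulo" spoken by Rafael is attributed to his entity:
print(speaker_entity_key("Rafael"))     # person:rafael
print(speaker_entity_key("Ana Souza"))  # person:ana_souza
```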

Step 4: Write Your First Facts

The write() method takes a natural language message and automatically:

  1. Extracts entities, facts, and relationships using an LLM
  2. Resolves entities to canonical records (deduplication)
  3. Reconciles new facts against existing knowledge
  4. Upserts the results into the database

async def write_example(memory: MemoryClient):
    # First message
    result = await memory.write(
        agent_id="user_123",
        message="My name is Rafael and I live in São Paulo. I work at Acme Corp as a backend engineer.",
        speaker_name="Rafael",
    )
    print(f"Facts added: {len(result.facts_added)}")
    for fact in result.facts_added:
        print(f"  [{fact.entity_name}] {fact.fact_text} (confidence: {fact.confidence})")
    # Output:
    #   [Rafael] Lives in São Paulo (confidence: 0.95)
    #   [Rafael] Works at Acme Corp as a backend engineer (confidence: 0.95)
    #   [Acme Corp] Rafael works at Acme Corp (confidence: 0.95)
    print(f"Entities resolved: {len(result.entities_resolved)}")
    print(f"Duration: {result.duration_ms:.0f}ms")

    # Second message — the system recognizes "Rafael" and updates knowledge
    result = await memory.write(
        agent_id="user_123",
        message="I just moved to Rio de Janeiro. Still working at Acme though.",
        speaker_name="Rafael",
        session_id="onboarding",  # optional — defaults to "default"
    )
    print(f"Facts added: {len(result.facts_added)}")
    print(f"Facts updated: {len(result.facts_updated)}")  # "lives in São Paulo" → "lives in Rio"

Understanding WriteResult

The WriteResult object tells you exactly what happened:

| Field | Type | Description |
| --- | --- | --- |
| event_id | str | Unique ID for this write event |
| facts_added | list | New facts created (ADD decisions) |
| facts_updated | list | Existing facts superseded (UPDATE decisions) |
| facts_unchanged | list | Facts confirmed but not changed (NOOP decisions) |
| facts_deleted | list | Facts retracted (DELETE decisions) |
| entities_resolved | list | Entities identified and resolved |
| duration_ms | float | Total pipeline duration |
| success | bool | Whether the pipeline completed without errors |
| error | str \| None | Error message if the pipeline failed internally |

Step 5: Retrieve Context

The retrieve() method finds facts relevant to a query using multiple signals:

async def retrieve_example(memory: MemoryClient):
    result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live and what does he do?",
    )

    # Option 1: Pre-formatted string — paste directly into your LLM prompt
    print(result.context)

    # Option 2: Individual scored facts — for programmatic access
    for fact in result.facts:
        print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.fact_text}")

    # Retrieve within a specific session (optional — omit to search all sessions)
    session_result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live?",
        session_id="onboarding",  # optional — defaults to searching all sessions
    )

    # With config adjustments (e.g., disable reranker for faster results)
    fast_result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live?",
        config_overrides={"enable_reranker": False, "topk_facts": 5},
    )

    print(f"Total candidates evaluated: {result.total_candidates}")
    print(f"Duration: {result.duration_ms:.0f}ms")

.context vs .facts

Use result.context when you just need a string to inject into an LLM prompt - it's pre-formatted with tier labels (CORE MEMORY, EXTENDED CONTEXT, etc.). Use result.facts when you need programmatic access to individual facts, scores, and metadata.

Per-request Config Overrides

You can override any MemoryConfig field for a single request without changing the client's default config:

result = await memory.retrieve(
    agent_id="user_123",
    query="where does Rafael live?",
    config_overrides={
        "enable_reranker": False,
        "topk_facts": 5,
        "spreading_activation_hops": 0,
    },
)

# config_effective shows the actual config used for this request
print(result.config_effective)

Only the keys you provide are overridden; every other field inherits its value from the client's MemoryConfig.
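The merge semantics can be pictured as a plain dict update (a sketch only; the real MemoryConfig is presumably a typed object):

```python
# Client-level defaults (subset, for illustration)
defaults = {"enable_reranker": True, "topk_facts": 30, "min_similarity": 0.25}

# Per-request overrides: only these keys change
overrides = {"enable_reranker": False, "topk_facts": 5}

# Later keys win, so overrides shadow defaults; untouched keys pass through
effective = {**defaults, **overrides}
print(effective)
# {'enable_reranker': False, 'topk_facts': 5, 'min_similarity': 0.25}
```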

Understanding RetrieveResult

| Field | Type | Description |
| --- | --- | --- |
| facts | list[ScoredFact] | Ranked facts with scores |
| context | str | Pre-formatted context string for LLM prompts |
| total_candidates | int | Total facts evaluated before ranking |
| duration_ms | float | Total pipeline duration |
| config_effective | dict | Effective config values used for this request |

Each ScoredFact contains:

| Field | Type | Description |
| --- | --- | --- |
| fact_id | str | Unique fact identifier |
| entity_name | str | Human-readable entity name |
| attribute_key | str | Fact category/attribute |
| fact_text | str | The fact content |
| score | float | Combined relevance score (0-1) |
| scores | dict | Breakdown by signal (semantic, recency, etc.) |
| speaker | str \| None | Who spoke the message this fact was extracted from |
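To illustrate how score might relate to the scores breakdown, here is a weighted sum using the default weights from Step 7 (semantic=0.70, recency=0.20, importance=0.10). This is a sketch of a linear combination; the SDK's actual formula may differ:

```python
def combine(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Linear combination of per-signal scores; missing signals count as 0."""
    return sum(weights[k] * scores.get(k, 0.0) for k in weights)

weights = {"semantic": 0.70, "recency": 0.20, "importance": 0.10}
scores = {"semantic": 0.90, "recency": 0.50, "importance": 0.40}

# 0.70*0.90 + 0.20*0.50 + 0.10*0.40 = 0.63 + 0.10 + 0.04
print(round(combine(scores, weights), 3))  # 0.77
```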

Step 6: Managing Facts

Beyond write() and retrieve(), the SDK provides CRUD operations for managing individual facts: fetching by ID, listing all facts, and deleting.

Get a specific fact

detail = await memory.get(agent_id="user_123", fact_id="some-uuid-here")
if detail:
    print(f"[{detail.entity_name}] {detail.fact_text}")
    print(f"  confidence: {detail.confidence}, importance: {detail.importance}")
    print(f"  created: {detail.created_at}")
else:
    print("Fact not found")

get() returns a FactDetail or None. It fetches any fact by ID — including facts that were soft-deleted by the reconciliation pipeline (i.e. superseded by a newer version). Use it for direct lookups when you have the ID.

List all facts

# First page (newest first)
facts = await memory.get_all(agent_id="user_123", limit=50, offset=0)
for fact in facts:
    print(f"[{fact.fact_id}] {fact.entity_name}: {fact.fact_text}")

# Next page
page2 = await memory.get_all(agent_id="user_123", limit=50, offset=50)

get_all() returns only active facts (valid_to IS NULL), ordered by created_at descending. Use limit and offset for pagination.
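A generic pagination loop over get_all() could look like this (the helper name is ours; it just applies the usual limit/offset pattern, stopping at the first short page):

```python
async def iter_all_facts(memory, agent_id: str, page_size: int = 50):
    """Yield every active fact by paging get_all() until a short page appears."""
    offset = 0
    while True:
        page = await memory.get_all(agent_id=agent_id, limit=page_size, offset=offset)
        for fact in page:
            yield fact
        if len(page) < page_size:
            break  # last page reached
        offset += page_size

# Usage:
# async for fact in iter_all_facts(memory, "user_123"):
#     print(fact.fact_text)
```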

Filtering by entity

# Only facts about Ana
ana_facts = await memory.get_all(agent_id="user_123", entity_keys=["person:ana"])

# Facts about Pedro OR Ana
facts = await memory.get_all(agent_id="user_123", entity_keys=["person:pedro", "person:ana"])

The entity_keys filter also works with retrieve():

# Semantic search scoped to facts about Ana
result = await memory.retrieve(
    agent_id="user_123",
    query="what does she do for work?",
    entity_keys=["person:ana"],
)

When entity_keys is provided, only facts linked to at least one of the specified entities are returned (OR logic). Without entity_keys, all facts are searched as before.

Aliases are resolved automatically

entity_keys accepts both canonical keys (person:pedro_menezes) and aliases (person:pedro or just pedro). The SDK resolves aliases against the memory_entity_aliases table before filtering, so you don't need to know the canonical form. Any key that does not resolve is surfaced in result.warnings - retrieve() never silently returns zero results because of a bad key.

result = await memory.retrieve(
    agent_id="user_123",
    query="what does she do?",
    entity_keys=["person:pedro", "person:unknown"],
)
# result.facts → filtered by Pedro's canonical key (alias resolved)
# result.warnings → ["entity_key 'person:unknown' not found (not canonical, no matching alias)"]

Delete a fact

deleted = await memory.delete(agent_id="user_123", fact_id="some-uuid-here")
print(f"Deleted: {deleted}")  # True if found and removed, False otherwise

delete() performs a hard delete — the row is physically removed from the database. Associated entity links are removed automatically via cascade. This is the explicit user action ("I want this gone"); the pipeline's soft-delete via valid_to is a separate mechanism for reconciliation.

Delete all facts

count = await memory.delete_all(agent_id="user_123")
print(f"Deleted {count} facts")

delete_all() removes every fact belonging to the agent. Use with caution — this is irreversible. Intended for reset/debug scenarios.

List entities

entities = await memory.entities(agent_id="user_123", limit=50)
for entity in entities:
    print(f"[{entity.entity_type}] {entity.display_name} ({entity.fact_count} facts)")
    if entity.summary_text:
        print(f"  Summary: {entity.summary_text}")

entities() returns active entities ordered by last_seen_at descending. Use it to see what the agent knows about — people, places, organizations, etc.

Understanding FactDetail

| Field | Type | Description |
| --- | --- | --- |
| fact_id | str | Unique fact identifier |
| entity_name | str | Human-readable entity name |
| entity_key | str | Canonical entity key |
| entity_type | str | Entity type (e.g. "person", "organization") |
| attribute_key | str \| None | Fact category/attribute |
| fact_text | str | The fact content |
| category | str \| None | Fact category |
| confidence | float | Confidence score (0-1) |
| importance | float | Importance score (0-1) |
| valid_from | datetime \| None | When the fact became valid |
| created_at | datetime \| None | When the fact was created |
| source_context | str \| None | Original context snippet |
| speaker | str \| None | Who spoke the message this fact was extracted from |

Understanding EntityDetail

| Field | Type | Description |
| --- | --- | --- |
| entity_id | str | Unique entity identifier |
| canonical_key | str | Canonical entity key (e.g. "person:rafael") |
| display_name | str | Human-readable entity name |
| entity_type | str | Entity type (e.g. "person", "organization") |
| summary_text | str \| None | Auto-generated entity summary |
| fact_count | int | Number of facts linked to this entity |
| importance_score | float \| None | Computed importance score |
| first_seen_at | datetime \| None | When the entity was first mentioned |
| last_seen_at | datetime \| None | When the entity was last mentioned |
| profile_text | str \| None | Consolidated entity profile |

Step 7: Configure (Optional)

Every aspect of the pipeline is configurable via MemoryConfig:

from arandu import MemoryConfig
from arandu.providers.openai import OpenAIProvider

# Single provider for all LLM operations (extraction, reranker, etc.)
llm = OpenAIProvider(api_key="sk-...", model="gpt-4o")

config = MemoryConfig(
    # Tight timeout for real-time chat
    extraction_timeout_sec=15.0,

    # Tune retrieval
    topk_facts=30,
    min_similarity=0.25,
    enable_reranker=True,

    # Custom score weights (default: semantic=0.70, recency=0.20, importance=0.10)
    score_weights={
        "semantic": 0.60,
        "recency": 0.25,
        "importance": 0.15,
    },

    # Set timezone for recency calculations
    timezone="America/Sao_Paulo",
)

memory = MemoryClient(
    database_url="postgresql+psycopg://memory:memory@localhost/memory",
    llm=llm,
    embeddings=llm,
    config=config,
)

All parameters have sensible defaults - you only need to override what matters for your use case.

Step 8: Debugging with Verbose Mode

Pass verbose=True to write() or retrieve() to get a detailed trace of every pipeline step:

result = await memory.write(agent_id="user_123", message="...", speaker_name="Rafael", verbose=True)

# Access the pipeline trace
if result.pipeline:
    for step in result.pipeline.steps:
        print(f"  {step.name}: {step.duration_ms:.1f}ms")
        print(f"    data: {step.data}")

The trace includes steps like extraction, entity_resolution, reconciliation, and upsert, each with timing and intermediate data. If the pipeline fails internally, an error step is added with the exception details - useful for diagnosing silent failures.

You can serialize the full trace with result.pipeline.to_dict().

Step 9: Cleanup

Always close the client when done to release database connections:

await memory.close()

Or wrap usage in try/finally so the client is closed even if your code raises:

memory = MemoryClient(...)
await memory.initialize()
try:
    ...  # use memory
finally:
    await memory.close()
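If you prefer async with, you can build the same lifecycle yourself with contextlib (a convenience sketch; the SDK may or may not ship its own context manager):

```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def open_memory(client):
    """Initialize on entry, close on exit - even if the body raises."""
    await client.initialize()
    try:
        yield client
    finally:
        await client.close()

# Usage:
# async with open_memory(MemoryClient(...)) as memory:
#     await memory.write(...)
```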

Complete Example

Here's a full working example putting it all together:

import asyncio
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider


async def main():
    provider = OpenAIProvider(api_key="sk-...")
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Write some facts
        await memory.write(
            agent_id="user_123",
            message="I'm a software engineer living in Berlin. I love cycling and craft coffee.",
            speaker_name="Rafael",
        )
        await memory.write(
            agent_id="user_123",
            message="My girlfriend Ana is a designer. We adopted a cat named Pixel last month.",
            speaker_name="Rafael",
        )

        # Retrieve context
        result = await memory.retrieve(agent_id="user_123", query="tell me about this person")
        print(result.context)

        # Targeted retrieval
        result = await memory.retrieve(agent_id="user_123", query="who is Ana?")
        for fact in result.facts:
            print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.fact_text}")
    finally:
        await memory.close()


asyncio.run(main())

Custom Providers

arandu uses Python protocols for dependency injection. You can bring any LLM or embedding provider by implementing two simple interfaces:

from arandu.protocols import LLMProvider, LLMResult, EmbeddingProvider

class MyLLMProvider:
    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> LLMResult:
        # Call your LLM here
        text = ...  # get response text from your LLM
        return LLMResult(text=text, usage=None)

class MyEmbeddingProvider:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        # Return embeddings for a batch
        ...

    async def embed_one(self, text: str) -> list[float] | None:
        # Return embedding for a single text
        ...

No inheritance required - just implement the methods with the right signatures.
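For unit tests, you can satisfy the embedding protocol with a deterministic stub (hash-based vectors - purely for tests, not real semantic search):

```python
import hashlib

class HashEmbeddingProvider:
    """Deterministic fake embeddings: same text -> same vector. Test-only."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    async def embed(self, texts: list[str]) -> list[list[float]]:
        # Batch variant: embed each text independently
        return [await self.embed_one(t) for t in texts]

    async def embed_one(self, text: str) -> list[float]:
        # First `dim` bytes of the SHA-256 digest, scaled into [0, 1]
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255.0 for b in digest[: self.dim]]
```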

Next Steps

  • Write Pipeline - Understand how facts are extracted, entities resolved, and knowledge reconciled
  • Read Pipeline - Learn how multi-signal retrieval finds the most relevant facts
  • Data Types & Schema - Database schema reference (tables, columns, types) for direct SQL queries
  • Background Jobs - Set up clustering, consolidation, and importance scoring
  • Design Philosophy - Explore the neuroscience-inspired architecture