Getting Started

This guide walks you through setting up arandu from scratch: installing dependencies, configuring PostgreSQL with pgvector, writing your first facts, and retrieving them.

Prerequisites

  • Python 3.11+
  • PostgreSQL 15+ with the pgvector extension installed
  • An OpenAI API key (or any LLM/embedding provider - see Custom Providers)

Step 1: Install

pip install arandu[openai]

This installs the core SDK plus the bundled OpenAI provider. If you're using a different LLM provider, install just the core:

pip install arandu

Step 2: Set Up PostgreSQL + pgvector

arandu stores facts, entities, and embeddings in PostgreSQL, using the pgvector extension for vector similarity search.

Option A: Docker (Recommended)

docker run -d \
  --name memory-db \
  -e POSTGRES_USER=memory \
  -e POSTGRES_PASSWORD=memory \
  -e POSTGRES_DB=memory \
  -p 5432:5432 \
  pgvector/pgvector:pg16

The pgvector/pgvector image comes with the extension pre-installed. Your connection string will be: postgresql+psycopg://memory:memory@localhost:5432/memory

psycopg vs psycopg2

Arandu uses psycopg (async driver), not psycopg2 (sync). Your connection string must start with postgresql+psycopg://, not postgresql+psycopg2://. Many Django/Flask tutorials use psycopg2 - make sure you're using the right one.
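If you're migrating from a psycopg2-based setup, a small helper like this (hypothetical, not part of the SDK) can normalize the scheme before you pass the URL to the client:

```python
def normalize_dsn(dsn: str) -> str:
    """Rewrite common PostgreSQL URL schemes to the async psycopg form."""
    for prefix in ("postgresql+psycopg2://", "postgres://", "postgresql://"):
        if dsn.startswith(prefix):
            return "postgresql+psycopg://" + dsn[len(prefix):]
    return dsn  # already in the right form (or not a PostgreSQL URL)

print(normalize_dsn("postgresql+psycopg2://memory:memory@localhost:5432/memory"))
# postgresql+psycopg://memory:memory@localhost:5432/memory
```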

Option B: Existing PostgreSQL

If you already have PostgreSQL running, enable the pgvector extension:

CREATE EXTENSION IF NOT EXISTS vector;

pgvector installation

If you don't have pgvector installed on your server, follow the pgvector installation guide.

Understanding Agent, Session, and Speaker

Every write() call uses three identifiers. Here's what each one does:

agent_id identifies whose memory this is. Think of it as a brain. One agent = one memory space. All facts, entities, and relationships live inside that agent's memory. If you have two chatbots, each one gets its own agent_id and they don't share memories.

speaker_name identifies who is talking. When someone says "I live in São Paulo", the SDK needs to know who "I" is. If the speaker is Rafael, "I" becomes "Rafael lives in São Paulo". Without speaker_name, the SDK doesn't know who "I" refers to and will raise a ValueError.

session_id tags the conversation context. It's optional (defaults to "default"). Use it when you want to track which conversation a message came from. For example, a support ticket, a therapy session, or a meeting.

Memory is NOT separated by session

Changing session_id does not create a separate memory. All facts go into the same agent memory regardless of session. When you call retrieve(), it searches everything the agent knows, across all sessions. The session_id is metadata on the event, not a partition key.

await memory.write(
    agent_id="my-assistant",        # whose memory
    message="I live in São Paulo",
    speaker_name="Rafael",          # who is talking
    session_id="support-ticket-42", # optional context tag
)

For a simple chatbot with one user, you only need agent_id and speaker_name. Add session_id when you want to track where a conversation happened.

Why three separate fields?

A memory system needs to answer three questions: whose brain stores this? (agent), who said it? (speaker), and in what context? (session). Mixing them into a single identifier breaks down when two people talk to the same agent, or the same person has multiple conversations. Separating them keeps the model clean and flexible.
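As a toy model of this separation (the dataclass below is illustrative, not an SDK type), two people can talk to the same agent in the same session and remain distinguishable:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WriteContext:
    agent_id: str                 # whose memory stores this (the brain)
    speaker_name: str             # who said it (resolves "I"/"me")
    session_id: str = "default"   # where the conversation happened (metadata)

# Two people talking to the same agent: same brain, different speakers
a = WriteContext(agent_id="support-bot", speaker_name="Rafael", session_id="ticket-42")
b = WriteContext(agent_id="support-bot", speaker_name="Ana", session_id="ticket-42")
assert a.agent_id == b.agent_id and a.speaker_name != b.speaker_name
```

Collapsing any two of these into one identifier loses exactly the distinction the assertion above relies on.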

Step 3: Initialize the Client

import asyncio
from arandu import MemoryClient, MemoryConfig
from arandu.providers.openai import OpenAIProvider

async def main():
    # Create the LLM + embedding provider
    provider = OpenAIProvider(api_key="sk-...")

    # Create the memory client
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )

    # Create tables (safe to call multiple times)
    await memory.initialize()

    print("Memory initialized!")
    await memory.close()

asyncio.run(main())

Using Anthropic (Claude) instead of OpenAI

from arandu.providers.anthropic import AnthropicProvider
from arandu.providers.openai import OpenAIProvider

llm = AnthropicProvider(api_key="sk-ant-...")  # Claude for LLM
embeddings = OpenAIProvider(api_key="sk-...")   # OpenAI for embeddings only

memory = MemoryClient(
    database_url="postgresql+psycopg://...",
    llm=llm,
    embeddings=embeddings,  # Anthropic doesn't offer embeddings
)

Install with: pip install arandu[anthropic]

Using DeepSeek, Groq, or local models

Any OpenAI-compatible provider works with OpenAIProvider. Just set base_url:

llm = OpenAIProvider(api_key="sk-...", model="deepseek-chat", base_url="https://api.deepseek.com/v1")

See the Cookbook for more examples.

initialize() creates all required tables and indexes (including pgvector HNSW indexes). It's idempotent - safe to call on every startup.

About agent_id

The agent_id is your partitioning key. Each agent_id gets its own isolated memory space - facts written for one agent are never returned for another. Think of it as a brain: one agent, one memory. Use any string (database ID, UUID, slug). The same agent_id must be used in both write() and retrieve() calls for the same agent.

About session_id

The session_id tags the conversation context (default: "default"). Think of it like a WhatsApp chat thread - same agent, different conversations. It is metadata on each event, not a partition: retrieval still searches across all sessions unless you filter by session_id. When not provided, all writes go to the "default" session.

About speaker_name

The speaker_name identifies who is speaking the message. It is a required parameter in write(). Pronouns like "I", "me", "eu", "myself" automatically resolve to the speaker entity (person:{speaker_slug}). For example, if speaker_name="Rafael" and the message says "I live in São Paulo", the fact is attributed to Rafael - not to a generic user:self. Use the speaker's real name (e.g., "Rafael", "Ana").
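The person:{speaker_slug} form can be sketched like this (the slug rule below is a guess for illustration; the SDK's actual slugging may differ):

```python
import re

def speaker_entity_key(speaker_name: str) -> str:
    """Derive a person:{slug} key of the shape the docs describe for speakers.

    Illustrative slug rule: lowercase, collapse non-word runs to underscores.
    """
    slug = re.sub(r"\W+", "_", speaker_name.strip().lower()).strip("_")
    return f"person:{slug}"

# "I live in São Paulo" spoken by Rafael is attributed to his entity:
print(speaker_entity_key("Rafael"))     # person:rafael
print(speaker_entity_key("Ana Souza"))  # person:ana_souza
```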

Step 4: Write Your First Facts

The write() method takes a natural language message and automatically:

  1. Extracts entities, facts, and relationships using an LLM
  2. Resolves entities to canonical records (deduplication)
  3. Reconciles new facts against existing knowledge
  4. Upserts the results into the database

async def write_example(memory: MemoryClient):
    # First message
    result = await memory.write(
        agent_id="user_123",
        message="My name is Rafael and I live in São Paulo. I work at Acme Corp as a backend engineer.",
        speaker_name="Rafael",
    )
    print(f"Facts added: {len(result.facts_added)}")
    for fact in result.facts_added:
        print(f"  [{fact.entity_name}] {fact.fact_text} (confidence: {fact.confidence})")
    # Output:
    #   [Rafael] Lives in São Paulo (confidence: 0.95)
    #   [Rafael] Works at Acme Corp as a backend engineer (confidence: 0.95)
    #   [Acme Corp] Rafael works at Acme Corp (confidence: 0.95)
    print(f"Entities resolved: {len(result.entities_resolved)}")
    print(f"Duration: {result.duration_ms:.0f}ms")

    # Second message — the system recognizes "Rafael" and updates knowledge
    result = await memory.write(
        agent_id="user_123",
        message="I just moved to Rio de Janeiro. Still working at Acme though.",
        speaker_name="Rafael",
        session_id="onboarding",  # optional — defaults to "default"
    )
    print(f"Facts added: {len(result.facts_added)}")
    print(f"Facts updated: {len(result.facts_updated)}")  # "lives in São Paulo" → "lives in Rio"

Understanding WriteResult

The WriteResult object tells you exactly what happened:

| Field | Type | Description |
| --- | --- | --- |
| event_id | str | Unique ID for this write event |
| facts_added | list | New facts created (ADD decisions) |
| facts_updated | list | Existing facts superseded (UPDATE decisions) |
| facts_unchanged | list | Facts confirmed but not changed (NOOP decisions) |
| facts_deleted | list | Facts retracted (DELETE decisions) |
| entities_resolved | list | Entities identified and resolved |
| duration_ms | float | Total pipeline duration |
| success | bool | Whether the pipeline completed without errors |
| error | str \| None | Error message if the pipeline failed internally |

Step 5: Retrieve Context

The retrieve() method finds facts relevant to a query using multiple signals:

async def retrieve_example(memory: MemoryClient):
    result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live and what does he do?",
    )

    # Option 1: Pre-formatted string — paste directly into your LLM prompt
    print(result.context)

    # Option 2: Individual scored facts — for programmatic access
    for fact in result.facts:
        print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.fact_text}")

    # Retrieve within a specific session (optional — omit to search all sessions)
    session_result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live?",
        session_id="onboarding",  # optional — defaults to searching all sessions
    )

    # With config adjustments (e.g., disable reranker for faster results)
    fast_result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live?",
        config_overrides={"enable_reranker": False, "topk_facts": 5},
    )

    print(f"Total candidates evaluated: {result.total_candidates}")
    print(f"Duration: {result.duration_ms:.0f}ms")

.context vs .facts

Use result.context when you just need a string to inject into an LLM prompt - it's pre-formatted with tier labels (CORE MEMORY, EXTENDED CONTEXT, etc.). Use result.facts when you need programmatic access to individual facts, scores, and metadata.

Per-request Config Overrides

You can override any MemoryConfig field for a single request without changing the client's default config:

result = await memory.retrieve(
    agent_id="user_123",
    query="where does Rafael live?",
    config_overrides={
        "enable_reranker": False,
        "topk_facts": 5,
        "spreading_activation_hops": 0,
    },
)

# config_effective shows the actual config used for this request
print(result.config_effective)

Only the keys you provide are overridden; every other field inherits its value from the client's MemoryConfig.
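The merge semantics can be pictured as a plain dict update (a sketch only; the real MemoryConfig is presumably a typed object):

```python
# Client-level defaults (subset, for illustration)
defaults = {"enable_reranker": True, "topk_facts": 30, "min_similarity": 0.25}

# Per-request overrides: only these keys change
overrides = {"enable_reranker": False, "topk_facts": 5}

# Later keys win, so overrides shadow defaults; untouched keys pass through
effective = {**defaults, **overrides}
print(effective)
# {'enable_reranker': False, 'topk_facts': 5, 'min_similarity': 0.25}
```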

Understanding RetrieveResult

| Field | Type | Description |
| --- | --- | --- |
| facts | list[ScoredFact] | Ranked facts with scores |
| context | str | Pre-formatted context string for LLM prompts |
| total_candidates | int | Total facts evaluated before ranking |
| duration_ms | float | Total pipeline duration |
| config_effective | dict | Effective config values used for this request |

Each ScoredFact contains:

| Field | Type | Description |
| --- | --- | --- |
| fact_id | str | Unique fact identifier |
| entity_name | str | Human-readable entity name |
| attribute_key | str | Fact category/attribute |
| fact_text | str | The fact content |
| score | float | Combined relevance score (0-1) |
| scores | dict | Breakdown by signal (semantic, recency, etc.) |
| speaker | str \| None | Who spoke the message this fact was extracted from |
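To illustrate how score might relate to the scores breakdown, here is a weighted sum using the default weights from Step 7 (semantic=0.70, recency=0.20, importance=0.10). This is a sketch of a linear combination; the SDK's actual formula may differ:

```python
def combine(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Linear combination of per-signal scores; missing signals count as 0."""
    return sum(weights[k] * scores.get(k, 0.0) for k in weights)

weights = {"semantic": 0.70, "recency": 0.20, "importance": 0.10}
scores = {"semantic": 0.90, "recency": 0.50, "importance": 0.40}

# 0.70*0.90 + 0.20*0.50 + 0.10*0.40 = 0.63 + 0.10 + 0.04
print(round(combine(scores, weights), 3))  # 0.77
```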

Step 6: Managing Facts

Beyond write() and retrieve(), the SDK provides CRUD operations for managing individual facts: fetching by ID, listing all facts, and deleting.

Get a specific fact

detail = await memory.get(agent_id="user_123", fact_id="some-uuid-here")
if detail:
    print(f"[{detail.entity_name}] {detail.fact_text}")
    print(f"  confidence: {detail.confidence}, importance: {detail.importance}")
    print(f"  created: {detail.created_at}")
else:
    print("Fact not found")

get() returns a FactDetail or None. It fetches any fact by ID — including facts that were soft-deleted by the reconciliation pipeline (i.e. superseded by a newer version). Use it for direct lookups when you have the ID.

List all facts

# First page (newest first)
facts = await memory.get_all(agent_id="user_123", limit=50, offset=0)
for fact in facts:
    print(f"[{fact.fact_id}] {fact.entity_name}: {fact.fact_text}")

# Next page
page2 = await memory.get_all(agent_id="user_123", limit=50, offset=50)

get_all() returns only active facts (valid_to IS NULL), ordered by created_at descending. Use limit and offset for pagination.
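A generic pagination loop over get_all() could look like this (the helper name is ours; it just applies the usual limit/offset pattern, stopping at the first short page):

```python
async def iter_all_facts(memory, agent_id: str, page_size: int = 50):
    """Yield every active fact by paging get_all() until a short page appears."""
    offset = 0
    while True:
        page = await memory.get_all(agent_id=agent_id, limit=page_size, offset=offset)
        for fact in page:
            yield fact
        if len(page) < page_size:
            break  # last page reached
        offset += page_size

# Usage:
# async for fact in iter_all_facts(memory, "user_123"):
#     print(fact.fact_text)
```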

Filtering by entity

# Only facts about Ana
ana_facts = await memory.get_all(agent_id="user_123", entity_keys=["person:ana"])

# Facts about Pedro OR Ana
facts = await memory.get_all(agent_id="user_123", entity_keys=["person:pedro", "person:ana"])

The entity_keys filter also works with retrieve():

# Semantic search scoped to facts about Ana
result = await memory.retrieve(
    agent_id="user_123",
    query="what does she do for work?",
    entity_keys=["person:ana"],
)

When entity_keys is provided, only facts linked to at least one of the specified entities are returned (OR logic). Without entity_keys, all facts are searched as before.

Aliases are resolved automatically

entity_keys accepts both canonical keys (person:pedro_menezes) and aliases (person:pedro or just pedro). The SDK resolves aliases against the memory_entity_aliases table before filtering, so you don't need to know the canonical form. Any key that does not resolve is surfaced in result.warnings - retrieve() never silently returns zero results because of a bad key.

result = await memory.retrieve(
    agent_id="user_123",
    query="what does she do?",
    entity_keys=["person:pedro", "person:unknown"],
)
# result.facts → filtered by Pedro's canonical key (alias resolved)
# result.warnings → ["entity_key 'person:unknown' not found (not canonical, no matching alias)"]

Delete a fact

deleted = await memory.delete(agent_id="user_123", fact_id="some-uuid-here")
print(f"Deleted: {deleted}")  # True if found and removed, False otherwise

delete() performs a hard delete — the row is physically removed from the database. Associated entity links are removed automatically via cascade. This is the explicit user action ("I want this gone"); the pipeline's soft-delete via valid_to is a separate mechanism for reconciliation.

Delete all facts

count = await memory.delete_all(agent_id="user_123")
print(f"Deleted {count} facts")

delete_all() removes every fact belonging to the agent. Use with caution — this is irreversible. Intended for reset/debug scenarios.

List entities

entities = await memory.entities(agent_id="user_123", limit=50)
for entity in entities:
    print(f"[{entity.entity_type}] {entity.display_name} ({entity.fact_count} facts)")
    if entity.summary_text:
        print(f"  Summary: {entity.summary_text}")

entities() returns active entities ordered by last_seen_at descending. Use it to see what the agent knows about — people, places, organizations, etc.

Understanding FactDetail

| Field | Type | Description |
| --- | --- | --- |
| fact_id | str | Unique fact identifier |
| entity_name | str | Human-readable entity name |
| entity_key | str | Canonical entity key |
| entity_type | str | Entity type (e.g. "person", "organization") |
| attribute_key | str \| None | Fact category/attribute |
| fact_text | str | The fact content |
| category | str \| None | Fact category |
| confidence | float | Confidence score (0-1) |
| importance | float | Importance score (0-1) |
| valid_from | datetime \| None | When the fact became valid |
| created_at | datetime \| None | When the fact was created |
| source_context | str \| None | Original context snippet |
| speaker | str \| None | Who spoke the message this fact was extracted from |

Understanding EntityDetail

| Field | Type | Description |
| --- | --- | --- |
| entity_id | str | Unique entity identifier |
| canonical_key | str | Canonical entity key (e.g. "person:rafael") |
| display_name | str | Human-readable entity name |
| entity_type | str | Entity type (e.g. "person", "organization") |
| summary_text | str \| None | Auto-generated entity summary |
| fact_count | int | Number of facts linked to this entity |
| importance_score | float \| None | Computed importance score |
| first_seen_at | datetime \| None | When the entity was first mentioned |
| last_seen_at | datetime \| None | When the entity was last mentioned |
| profile_text | str \| None | Consolidated entity profile |

Step 7: Configure (Optional)

Every aspect of the pipeline is configurable via MemoryConfig:

from arandu import MemoryConfig
from arandu.providers.openai import OpenAIProvider

# Single provider for all LLM operations (extraction, reranker, etc.)
llm = OpenAIProvider(api_key="sk-...", model="gpt-4o")

config = MemoryConfig(
    # Tight timeout for real-time chat
    extraction_timeout_sec=15.0,

    # Tune retrieval
    topk_facts=30,
    min_similarity=0.25,
    enable_reranker=True,

    # Custom score weights (default: semantic=0.70, recency=0.20, importance=0.10)
    score_weights={
        "semantic": 0.60,
        "recency": 0.25,
        "importance": 0.15,
    },

    # Set timezone for recency calculations
    timezone="America/Sao_Paulo",
)

memory = MemoryClient(
    database_url="postgresql+psycopg://memory:memory@localhost/memory",
    llm=llm,
    embeddings=llm,
    config=config,
)

All parameters have sensible defaults - you only need to override what matters for your use case.

Step 8: Debugging with Verbose Mode

Pass verbose=True to write() or retrieve() to get a detailed trace of every pipeline step:

result = await memory.write(agent_id="user_123", message="...", speaker_name="Rafael", verbose=True)

# Access the pipeline trace
if result.pipeline:
    for step in result.pipeline.steps:
        print(f"  {step.name}: {step.duration_ms:.1f}ms")
        print(f"    data: {step.data}")

The trace includes steps like extraction, entity_resolution, reconciliation, and upsert, each with timing and intermediate data. If the pipeline fails internally, an error step is added with the exception details - useful for diagnosing silent failures.

You can serialize the full trace with result.pipeline.to_dict().

Step 9: Cleanup

Always close the client when done to release database connections:

await memory.close()

Or wrap usage in try/finally so the client is closed even if your code raises:

memory = MemoryClient(...)
await memory.initialize()
try:
    ...  # use memory
finally:
    await memory.close()
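If you prefer async with, you can build the same lifecycle yourself with contextlib (a convenience sketch; the SDK may or may not ship its own context manager):

```python
from contextlib import asynccontextmanager

@asynccontextmanager
async def open_memory(client):
    """Initialize on entry, close on exit - even if the body raises."""
    await client.initialize()
    try:
        yield client
    finally:
        await client.close()

# Usage:
# async with open_memory(MemoryClient(...)) as memory:
#     await memory.write(...)
```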

Complete Example

Here's a full working example putting it all together:

import asyncio
from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider


async def main():
    provider = OpenAIProvider(api_key="sk-...")
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Write some facts
        await memory.write(
            agent_id="user_123",
            message="I'm a software engineer living in Berlin. I love cycling and craft coffee.",
            speaker_name="Rafael",
        )
        await memory.write(
            agent_id="user_123",
            message="My girlfriend Ana is a designer. We adopted a cat named Pixel last month.",
            speaker_name="Rafael",
        )

        # Retrieve context
        result = await memory.retrieve(agent_id="user_123", query="tell me about this person")
        print(result.context)

        # Targeted retrieval
        result = await memory.retrieve(agent_id="user_123", query="who is Ana?")
        for fact in result.facts:
            print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.fact_text}")
    finally:
        await memory.close()


asyncio.run(main())

Custom Providers

arandu uses Python protocols for dependency injection. You can bring any LLM or embedding provider by implementing two simple interfaces:

from arandu.protocols import LLMProvider, LLMResult, EmbeddingProvider

class MyLLMProvider:
    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> LLMResult:
        # Call your LLM here
        text = ...  # get response text from your LLM
        return LLMResult(text=text, usage=None)

class MyEmbeddingProvider:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        # Return embeddings for a batch
        ...

    async def embed_one(self, text: str) -> list[float] | None:
        # Return embedding for a single text
        ...

No inheritance required - just implement the methods with the right signatures.
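For unit tests, you can satisfy the embedding protocol with a deterministic stub (hash-based vectors - purely for tests, not real semantic search):

```python
import hashlib

class HashEmbeddingProvider:
    """Deterministic fake embeddings: same text -> same vector. Test-only."""

    def __init__(self, dim: int = 8):
        self.dim = dim

    async def embed(self, texts: list[str]) -> list[list[float]]:
        # Batch variant: embed each text independently
        return [await self.embed_one(t) for t in texts]

    async def embed_one(self, text: str) -> list[float]:
        # First `dim` bytes of the SHA-256 digest, scaled into [0, 1]
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255.0 for b in digest[: self.dim]]
```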

Next Steps

  • Write Pipeline - Understand how facts are extracted, entities resolved, and knowledge reconciled
  • Read Pipeline - Learn how multi-signal retrieval finds the most relevant facts
  • Data Types & Schema - Database schema reference (tables, columns, types) for direct SQL queries
  • Background Jobs - Set up clustering, consolidation, and importance scoring
  • Design Philosophy - Explore the neuroscience-inspired architecture