Getting Started¶
This guide walks you through setting up arandu from scratch: installing dependencies, configuring PostgreSQL with pgvector, writing your first facts, and retrieving them.
Prerequisites¶
- Python 3.11+
- PostgreSQL 15+ with the pgvector extension installed
- An OpenAI API key (or any LLM/embedding provider - see Custom Providers)
Step 1: Install¶
pip install "arandu[openai]"

This installs the core SDK plus the bundled OpenAI provider. If you're using a different LLM provider, install just the core:

pip install arandu
Step 2: Set Up PostgreSQL + pgvector¶
arandu stores facts, entities, and embeddings in PostgreSQL using the pgvector extension for vector similarity search.
Option A: Docker (recommended for development)¶
docker run -d \
--name memory-db \
-e POSTGRES_USER=memory \
-e POSTGRES_PASSWORD=memory \
-e POSTGRES_DB=memory \
-p 5432:5432 \
pgvector/pgvector:pg16
The pgvector/pgvector image comes with the extension pre-installed. Your connection string will be:
postgresql+psycopg://memory:memory@localhost:5432/memory
psycopg vs psycopg2
Arandu uses psycopg (async driver), not psycopg2 (sync). Your connection string must start with postgresql+psycopg://, not postgresql+psycopg2://. Many Django/Flask tutorials use psycopg2 - make sure you're using the right one.
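If you want to fail fast on the wrong URL, a small startup guard can check the scheme before it ever reaches the client. This helper is illustrative, not part of the SDK:

```python
def check_driver(url: str) -> str:
    """Reject psycopg2-style URLs early (illustrative startup check)."""
    if url.startswith("postgresql+psycopg2://"):
        raise ValueError(
            "arandu needs the async psycopg driver: use postgresql+psycopg://"
        )
    if not url.startswith("postgresql+psycopg://"):
        raise ValueError(f"unexpected scheme in {url!r}")
    return url

check_driver("postgresql+psycopg://memory:memory@localhost:5432/memory")  # passes
```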
Option B: Existing PostgreSQL¶
If you already have PostgreSQL running, enable the pgvector extension:

CREATE EXTENSION IF NOT EXISTS vector;
pgvector installation
If you don't have pgvector installed on your server, follow the pgvector installation guide.
Understanding Agent, Session, and Speaker¶
Every write() call uses three identifiers. Here's what each one does:
agent_id identifies whose memory this is. Think of it as a brain. One agent = one memory space. All facts, entities, and relationships live inside that agent's memory. If you have two chatbots, each one gets its own agent_id and they don't share memories.
speaker_name identifies who is talking. When someone says "I live in São Paulo", the SDK needs to know who "I" is. If the speaker is Rafael, "I" becomes "Rafael lives in São Paulo". Without speaker_name, the SDK doesn't know who "I" refers to and will raise a ValueError.
session_id tags the conversation context. It's optional (defaults to "default"). Use it when you want to track which conversation a message came from. For example, a support ticket, a therapy session, or a meeting.
Memory is NOT separated by session
Changing session_id does not create a separate memory. All facts go into the same agent memory regardless of session. When you call retrieve(), it searches everything the agent knows, across all sessions. The session_id is metadata on the event, not a partition key.
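The distinction can be pictured with plain dicts. This is a conceptual sketch of one agent's store, not SDK internals:

```python
# Conceptual model: session_id is a tag on the stored event, not a partition.
memory_store = []  # one store per agent_id

def write(fact_text: str, session_id: str = "default") -> None:
    memory_store.append({"fact": fact_text, "session_id": session_id})

def retrieve() -> list[str]:
    # Retrieval scans the whole agent memory, regardless of session.
    return [event["fact"] for event in memory_store]

write("Rafael lives in São Paulo", session_id="support-ticket-42")
write("Rafael works at Acme Corp", session_id="onboarding")
print(retrieve())  # both facts come back, across sessions
```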
await memory.write(
    agent_id="my-assistant",          # whose memory
    message="I live in São Paulo",
    speaker_name="Rafael",            # who is talking
    session_id="support-ticket-42",   # optional context tag
)
For a simple chatbot with one user, you only need agent_id and speaker_name. Add session_id when you want to track where a conversation happened.
Why three separate fields?
A memory system needs to answer three questions: whose brain stores this? (agent), who said it? (speaker), and in what context? (session). Mixing them into a single identifier breaks down when two people talk to the same agent, or the same person has multiple conversations. Separating them keeps the model clean and flexible.
Step 3: Initialize the Client¶
import asyncio

from arandu import MemoryClient, MemoryConfig
from arandu.providers.openai import OpenAIProvider


async def main():
    # Create the LLM + embedding provider
    provider = OpenAIProvider(api_key="sk-...")

    # Create the memory client
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )

    # Create tables (safe to call multiple times)
    await memory.initialize()
    print("Memory initialized!")

    await memory.close()

asyncio.run(main())
Using Anthropic (Claude) instead of OpenAI
pip install arandu[anthropic]

from arandu.providers.anthropic import AnthropicProvider
from arandu.providers.openai import OpenAIProvider

llm = AnthropicProvider(api_key="sk-ant-...")   # Claude for LLM
embeddings = OpenAIProvider(api_key="sk-...")   # OpenAI for embeddings only

memory = MemoryClient(
    database_url="postgresql+psycopg://...",
    llm=llm,
    embeddings=embeddings,  # Anthropic doesn't offer embeddings
)
Using DeepSeek, Groq, or local models
Any OpenAI-compatible provider works with OpenAIProvider. Just set base_url:
llm = OpenAIProvider(api_key="sk-...", model="deepseek-chat", base_url="https://api.deepseek.com/v1")
initialize() creates all required tables and indexes (including pgvector HNSW indexes). It's idempotent - safe to call on every startup.
About agent_id
The agent_id is your partitioning key. Each agent_id gets its own isolated memory space - facts written for one agent are never returned for another. Think of it as a brain: one agent, one memory. Use any string (database ID, UUID, slug). The same agent_id must be used in both write() and retrieve() calls for the same agent.
About session_id
The session_id identifies the conversation context (default: "default"). Think of it like a WhatsApp chat thread - same agent, different conversations. When not provided, all writes go to the "default" session.
About speaker_name
The speaker_name identifies who is speaking the message. It is a required parameter of write(). Pronouns like "I", "me", "eu", "myself" automatically resolve to the speaker entity (person:{speaker_slug}). For example, if speaker_name="Rafael" and the message says "I live in São Paulo", the fact is attributed to Rafael, not to a generic user:self. Use the speaker's real name (e.g., "Rafael", "Ana").
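As an illustration of the person:{speaker_slug} key a pronoun resolves to, here is a plausible slugging rule. The exact normalization the SDK applies is an assumption:

```python
import unicodedata

def speaker_entity_key(speaker_name: str) -> str:
    """Build a person:{speaker_slug} key (the slug rules here are an assumption)."""
    # Strip accents, lowercase, and replace spaces with underscores
    ascii_name = (
        unicodedata.normalize("NFKD", speaker_name)
        .encode("ascii", "ignore")
        .decode()
    )
    return "person:" + ascii_name.lower().replace(" ", "_")

print(speaker_entity_key("Rafael"))  # person:rafael
```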
Step 4: Write Your First Facts¶
The write() method takes a natural language message and automatically:
- Extracts entities, facts, and relationships using an LLM
- Resolves entities to canonical records (deduplication)
- Reconciles new facts against existing knowledge
- Upserts the results into the database
async def write_example(memory: MemoryClient):
    # First message
    result = await memory.write(
        agent_id="user_123",
        message="My name is Rafael and I live in São Paulo. I work at Acme Corp as a backend engineer.",
        speaker_name="Rafael",
    )

    print(f"Facts added: {len(result.facts_added)}")
    for fact in result.facts_added:
        print(f"  [{fact.entity_name}] {fact.fact_text} (confidence: {fact.confidence})")
    # Output:
    #   [Rafael] Lives in São Paulo (confidence: 0.95)
    #   [Rafael] Works at Acme Corp as a backend engineer (confidence: 0.95)
    #   [Acme Corp] Rafael works at Acme Corp (confidence: 0.95)

    print(f"Entities resolved: {len(result.entities_resolved)}")
    print(f"Duration: {result.duration_ms:.0f}ms")

    # Second message — the system recognizes "Rafael" and updates knowledge
    result = await memory.write(
        agent_id="user_123",
        message="I just moved to Rio de Janeiro. Still working at Acme though.",
        speaker_name="Rafael",
        session_id="onboarding",  # optional — defaults to "default"
    )

    print(f"Facts added: {len(result.facts_added)}")
    print(f"Facts updated: {len(result.facts_updated)}")  # "lives in São Paulo" → "lives in Rio"
Understanding WriteResult¶
The WriteResult object tells you exactly what happened:
| Field | Type | Description |
|---|---|---|
| event_id | str | Unique ID for this write event |
| facts_added | list | New facts created (ADD decisions) |
| facts_updated | list | Existing facts superseded (UPDATE decisions) |
| facts_unchanged | list | Facts confirmed but not changed (NOOP decisions) |
| facts_deleted | list | Facts retracted (DELETE decisions) |
| entities_resolved | list | Entities identified and resolved |
| duration_ms | float | Total pipeline duration |
| success | bool | Whether the pipeline completed without errors |
| error | str \| None | Error message if the pipeline failed internally |
Step 5: Retrieve Context¶
The retrieve() method finds facts relevant to a query using multiple signals:
async def retrieve_example(memory: MemoryClient):
    result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live and what does he do?",
    )

    # Option 1: Pre-formatted string — paste directly into your LLM prompt
    print(result.context)

    # Option 2: Individual scored facts — for programmatic access
    for fact in result.facts:
        print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.fact_text}")

    # Retrieve within a specific session (optional — omit to search all sessions)
    session_result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live?",
        session_id="onboarding",  # optional — defaults to searching all sessions
    )

    # With config adjustments (e.g., disable reranker for faster results)
    fast_result = await memory.retrieve(
        agent_id="user_123",
        query="where does Rafael live?",
        config_overrides={"enable_reranker": False, "topk_facts": 5},
    )

    print(f"Total candidates evaluated: {result.total_candidates}")
    print(f"Duration: {result.duration_ms:.0f}ms")
print(f"Total candidates evaluated: {result.total_candidates}")
print(f"Duration: {result.duration_ms:.0f}ms")
.context vs .facts
Use result.context when you just need a string to inject into an LLM prompt - it's pre-formatted with tier labels (CORE MEMORY, EXTENDED CONTEXT, etc.). Use result.facts when you need programmatic access to individual facts, scores, and metadata.
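For example, injecting the pre-formatted string into a chat prompt. The message shapes follow the common OpenAI chat format, and retrieved_context here is a stand-in for result.context:

```python
# Stand-in for result.context; the real string comes from memory.retrieve()
retrieved_context = (
    "CORE MEMORY:\n"
    "- Rafael lives in Rio de Janeiro\n"
    "EXTENDED CONTEXT:\n"
    "- Rafael works at Acme Corp"
)

# Inject the retrieved context into the system prompt of a chat call
messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant.\n\n"
                   "What you remember about the user:\n" + retrieved_context,
    },
    {"role": "user", "content": "Where do I live these days?"},
]
# messages is now ready to pass to your chat-completion call
```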
Per-request Config Overrides¶
You can override any MemoryConfig field for a single request without changing the client's default config:
result = await memory.retrieve(
    agent_id="user_123",
    query="where does Rafael live?",
    config_overrides={
        "enable_reranker": False,
        "topk_facts": 5,
        "spreading_activation_hops": 0,
    },
)

# config_effective shows the actual config used for this request
print(result.config_effective)
Only the provided keys are overridden; all other fields inherit from the client's MemoryConfig.
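The override semantics amount to a shallow dict merge, sketched here with field names from this guide; the SDK's internal merge may differ:

```python
# Client-level MemoryConfig defaults (values from this guide)
defaults = {"enable_reranker": True, "topk_facts": 30, "min_similarity": 0.25}

# Per-request overrides: only the provided keys change
overrides = {"enable_reranker": False, "topk_facts": 5}

effective = {**defaults, **overrides}
print(effective)
# {'enable_reranker': False, 'topk_facts': 5, 'min_similarity': 0.25}
```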
Understanding RetrieveResult¶
| Field | Type | Description |
|---|---|---|
| facts | list[ScoredFact] | Ranked facts with scores |
| context | str | Pre-formatted context string for LLM prompts |
| total_candidates | int | Total facts evaluated before ranking |
| duration_ms | float | Total pipeline duration |
| config_effective | dict | Effective config values used for this request |
Each ScoredFact contains:
| Field | Type | Description |
|---|---|---|
| fact_id | str | Unique fact identifier |
| entity_name | str | Human-readable entity name |
| attribute_key | str | Fact category/attribute |
| fact_text | str | The fact content |
| score | float | Combined relevance score (0-1) |
| scores | dict | Breakdown by signal (semantic, recency, etc.) |
| speaker | str \| None | Who spoke the message this fact was extracted from |
Step 6: Managing Facts¶
Beyond write() and retrieve(), the SDK provides CRUD operations for managing individual facts: fetching by ID, listing all facts, and deleting.
Get a specific fact¶
detail = await memory.get(agent_id="user_123", fact_id="some-uuid-here")

if detail:
    print(f"[{detail.entity_name}] {detail.fact_text}")
    print(f"  confidence: {detail.confidence}, importance: {detail.importance}")
    print(f"  created: {detail.created_at}")
else:
    print("Fact not found")
get() returns a FactDetail or None. It fetches any fact by ID — including facts that were soft-deleted by the reconciliation pipeline (i.e. superseded by a newer version). Use it for direct lookups when you have the ID.
List all facts¶
# First page (newest first)
facts = await memory.get_all(agent_id="user_123", limit=50, offset=0)
for fact in facts:
    print(f"[{fact.fact_id}] {fact.entity_name}: {fact.fact_text}")

# Next page
page2 = await memory.get_all(agent_id="user_123", limit=50, offset=50)
get_all() returns only active facts (valid_to IS NULL), ordered by created_at descending. Use limit and offset for pagination.
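The active-fact filter and pagination can be sketched with plain dicts. This mirrors the valid_to IS NULL rule described above, not the SDK's actual SQL:

```python
from datetime import datetime

facts = [
    # Superseded fact: valid_to is set by the reconciliation pipeline
    {"fact": "Lives in São Paulo", "created_at": datetime(2024, 1, 1),
     "valid_to": datetime(2024, 6, 1)},
    # Active fact: valid_to is None
    {"fact": "Lives in Rio de Janeiro", "created_at": datetime(2024, 6, 1),
     "valid_to": None},
]

def get_all(limit: int = 50, offset: int = 0) -> list[dict]:
    active = [f for f in facts if f["valid_to"] is None]      # valid_to IS NULL
    active.sort(key=lambda f: f["created_at"], reverse=True)  # newest first
    return active[offset : offset + limit]

print([f["fact"] for f in get_all()])  # ['Lives in Rio de Janeiro']
```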
Filtering by entity¶
# Only facts about Ana
ana_facts = await memory.get_all(agent_id="user_123", entity_keys=["person:ana"])
# Facts about Pedro OR Ana
facts = await memory.get_all(agent_id="user_123", entity_keys=["person:pedro", "person:ana"])
The entity_keys filter also works with retrieve():
# Semantic search scoped to facts about Ana
result = await memory.retrieve(
    agent_id="user_123",
    query="what does she do for work?",
    entity_keys=["person:ana"],
)
When entity_keys is provided, only facts linked to at least one of the specified entities are returned (OR logic). Without entity_keys, all facts are searched as before.
Aliases are resolved automatically
entity_keys accepts both canonical keys (person:pedro_menezes) and aliases (person:pedro or just pedro). The SDK resolves aliases against the memory_entity_aliases table before filtering, so you don't need to know the canonical form. Any key that does not resolve is surfaced in result.warnings — retrieve() never returns "zero silently" because of a bad key.
result = await memory.retrieve(
    agent_id="user_123",
    query="what does she do?",
    entity_keys=["person:pedro", "person:unknown"],
)
# result.facts → filtered by Pedro's canonical key (alias resolved)
# result.warnings → ["entity_key 'person:unknown' not found (not canonical, no matching alias)"]
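Conceptually, the resolution step behaves like the sketch below. The alias data is invented; the real lookup runs against the memory_entity_aliases table:

```python
canonical = {"person:pedro_menezes", "person:ana_souza"}
aliases = {
    "person:pedro": "person:pedro_menezes", "pedro": "person:pedro_menezes",
    "person:ana": "person:ana_souza", "ana": "person:ana_souza",
}  # stand-in for the memory_entity_aliases table

def resolve(entity_keys: list[str]) -> tuple[set[str], list[str]]:
    resolved, warnings = set(), []
    for key in entity_keys:
        if key in canonical:
            resolved.add(key)                # already canonical
        elif key in aliases:
            resolved.add(aliases[key])       # alias → canonical
        else:
            warnings.append(
                f"entity_key {key!r} not found (not canonical, no matching alias)"
            )
    return resolved, warnings

keys, warnings = resolve(["person:pedro", "person:unknown"])
print(keys)      # {'person:pedro_menezes'}
print(warnings)  # ["entity_key 'person:unknown' not found (not canonical, no matching alias)"]
```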
Delete a fact¶
deleted = await memory.delete(agent_id="user_123", fact_id="some-uuid-here")
print(f"Deleted: {deleted}") # True if found and removed, False otherwise
delete() performs a hard delete — the row is physically removed from the database. Associated entity links are removed automatically via cascade. This is the explicit user action ("I want this gone"); the pipeline's soft-delete via valid_to is a separate mechanism for reconciliation.
Delete all facts¶
delete_all() removes every fact belonging to the agent. Use with caution — this is irreversible. Intended for reset/debug scenarios.
List entities¶
entities = await memory.entities(agent_id="user_123", limit=50)
for entity in entities:
    print(f"[{entity.entity_type}] {entity.display_name} ({entity.fact_count} facts)")
    if entity.summary_text:
        print(f"  Summary: {entity.summary_text}")
entities() returns active entities ordered by last_seen_at descending. Use it to see what the agent knows about — people, places, organizations, etc.
Understanding FactDetail¶
| Field | Type | Description |
|---|---|---|
| fact_id | str | Unique fact identifier |
| entity_name | str | Human-readable entity name |
| entity_key | str | Canonical entity key |
| entity_type | str | Entity type (e.g. "person", "organization") |
| attribute_key | str \| None | Fact category/attribute |
| fact_text | str | The fact content |
| category | str \| None | Fact category |
| confidence | float | Confidence score (0-1) |
| importance | float | Importance score (0-1) |
| valid_from | datetime \| None | When the fact became valid |
| created_at | datetime \| None | When the fact was created |
| source_context | str \| None | Original context snippet |
| speaker | str \| None | Who spoke the message this fact was extracted from |
Understanding EntityDetail¶
| Field | Type | Description |
|---|---|---|
| entity_id | str | Unique entity identifier |
| canonical_key | str | Canonical entity key (e.g. "person:rafael") |
| display_name | str | Human-readable entity name |
| entity_type | str | Entity type (e.g. "person", "organization") |
| summary_text | str \| None | Auto-generated entity summary |
| fact_count | int | Number of facts linked to this entity |
| importance_score | float \| None | Computed importance score |
| first_seen_at | datetime \| None | When the entity was first mentioned |
| last_seen_at | datetime \| None | When the entity was last mentioned |
| profile_text | str \| None | Consolidated entity profile |
Step 7: Configure (Optional)¶
Every aspect of the pipeline is configurable via MemoryConfig:
from arandu import MemoryConfig
from arandu.providers.openai import OpenAIProvider

# Single provider for all LLM operations (extraction, reranker, etc.)
llm = OpenAIProvider(api_key="sk-...", model="gpt-4o")

config = MemoryConfig(
    # Tight timeout for real-time chat
    extraction_timeout_sec=15.0,
    # Tune retrieval
    topk_facts=30,
    min_similarity=0.25,
    enable_reranker=True,
    # Custom score weights (default: semantic=0.70, recency=0.20, importance=0.10)
    score_weights={
        "semantic": 0.60,
        "recency": 0.25,
        "importance": 0.15,
    },
    # Set timezone for recency calculations
    timezone="America/Sao_Paulo",
)

memory = MemoryClient(
    database_url="postgresql+psycopg://memory:memory@localhost/memory",
    llm=llm,
    embeddings=llm,
    config=config,
)
All parameters have sensible defaults - you only need to override what matters for your use case.
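To make the score_weights concrete: assuming the signals combine as a simple weighted sum (the exact formula is not documented in this guide), the arithmetic looks like this, with invented per-signal scores:

```python
# Weights from the config example above
weights = {"semantic": 0.60, "recency": 0.25, "importance": 0.15}

# Per-signal scores for one fact (invented for illustration)
signals = {"semantic": 0.90, "recency": 0.50, "importance": 0.80}

# Combined relevance score, assuming a linear blend
score = sum(weights[name] * signals[name] for name in weights)
print(round(score, 3))  # 0.785
```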
Step 8: Debugging with Verbose Mode¶
Pass verbose=True to write() or retrieve() to get a detailed trace of every pipeline step:
result = await memory.write(agent_id="user_123", message="...", speaker_name="Rafael", verbose=True)

# Access the pipeline trace
if result.pipeline:
    for step in result.pipeline.steps:
        print(f"  {step.name}: {step.duration_ms:.1f}ms")
        print(f"    data: {step.data}")
The trace includes steps like extraction, entity_resolution, reconciliation, and upsert, each with timing and intermediate data. If the pipeline fails internally, an error step is added with the exception details - useful for diagnosing silent failures.
You can serialize the full trace with result.pipeline.to_dict().
Step 9: Cleanup¶
Always close the client when done to release database connections:

await memory.close()

Or wrap usage in a try/finally so the client is closed even when an error occurs:

memory = MemoryClient(...)
await memory.initialize()
try:
    # ... use memory
finally:
    await memory.close()
Complete Example¶
Here's a full working example putting it all together:
import asyncio

from arandu import MemoryClient
from arandu.providers.openai import OpenAIProvider


async def main():
    provider = OpenAIProvider(api_key="sk-...")
    memory = MemoryClient(
        database_url="postgresql+psycopg://memory:memory@localhost:5432/memory",
        llm=provider,
        embeddings=provider,
    )
    await memory.initialize()

    try:
        # Write some facts
        await memory.write(
            agent_id="user_123",
            message="I'm a software engineer living in Berlin. I love cycling and craft coffee.",
            speaker_name="Rafael",
        )
        await memory.write(
            agent_id="user_123",
            message="My girlfriend Ana is a designer. We adopted a cat named Pixel last month.",
            speaker_name="Rafael",
        )

        # Retrieve context
        result = await memory.retrieve(agent_id="user_123", query="tell me about this person")
        print(result.context)

        # Targeted retrieval
        result = await memory.retrieve(agent_id="user_123", query="who is Ana?")
        for fact in result.facts:
            print(f"  [{fact.score:.2f}] {fact.entity_name}: {fact.fact_text}")
    finally:
        await memory.close()

asyncio.run(main())
Custom Providers¶
arandu uses Python protocols for dependency injection. You can bring any LLM or embedding provider by implementing two simple interfaces:
from arandu.protocols import LLMProvider, LLMResult, EmbeddingProvider


class MyLLMProvider:
    async def complete(
        self,
        messages: list[dict],
        temperature: float = 0,
        response_format: dict | None = None,
        max_tokens: int | None = None,
    ) -> LLMResult:
        # Call your LLM here
        text = ...  # get response text from your LLM
        return LLMResult(text=text, usage=None)


class MyEmbeddingProvider:
    async def embed(self, texts: list[str]) -> list[list[float]]:
        # Return embeddings for a batch
        ...

    async def embed_one(self, text: str) -> list[float] | None:
        # Return embedding for a single text
        ...
No inheritance required - just implement the methods with the right signatures.
Next Steps¶
- Write Pipeline - Understand how facts are extracted, entities resolved, and knowledge reconciled
- Read Pipeline - Learn how multi-signal retrieval finds the most relevant facts
- Data Types & Schema - Database schema reference (tables, columns, types) for direct SQL queries
- Background Jobs - Set up clustering, consolidation, and importance scoring
- Design Philosophy - Explore the neuroscience-inspired architecture