Data Types Reference¶
This page documents the dataclasses, enums, and result types used across the write pipeline, read pipeline, and background jobs that the main API Reference does not cover.
Write Pipeline Types¶
InputType¶
Input text classification types, determined by heuristics in classify_input().
| Value | Description |
|---|---|
| `SHORT` | Less than 500 characters. |
| `MEDIUM` | 500-2000 characters, unstructured. |
| `LONG` | More than 2000 characters, unstructured. |
| `STRUCTURED` | More than 500 characters with headers, bullets, or tables. |
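The thresholds in the table above can be illustrated with a minimal sketch. This is not the library's `classify_input()`: the enum values, the `classify_length` name, the structure-detecting regex, and the handling of the exact 500 and 2000 boundaries are all assumptions made for illustration.

```python
import re
from enum import Enum


class InputType(Enum):
    # Member names follow the table; the string values are assumed.
    SHORT = "short"
    MEDIUM = "medium"
    LONG = "long"
    STRUCTURED = "structured"


# Hypothetical detector: a Markdown header, bullet, or table row at line start.
_STRUCTURE_RE = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\|.+\|)", re.MULTILINE)


def classify_length(text: str) -> InputType:
    """Apply the documented length thresholds (boundary handling is assumed)."""
    n = len(text)
    if n < 500:
        return InputType.SHORT
    if _STRUCTURE_RE.search(text):
        return InputType.STRUCTURED
    if n <= 2000:
        return InputType.MEDIUM
    return InputType.LONG
```

The real heuristics may weigh structure differently, so treat this only as a reading aid for the table.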
ExtractionMode¶
| Value | Description |
|---|---|
| `SINGLE_SHOT` | Single LLM call for extraction. |
| `CHUNKED` | Input is split into chunks, each processed separately. |
InputClassification¶
Result of classify_input(). See Write Pipeline API for full field reference.
ExtractionStrategy¶
Result of select_strategy(). See Write Pipeline API for full field reference.
CorrectionResult¶
Result of correction detection.
| Field | Type | Default | Description |
|---|---|---|---|
| `corrections_detected` | `int` | `0` | Number of corrections found. |
| `corrected_keys` | `list[str]` | `[]` | Attribute keys that were corrected. |
| `facts_corrected_ids` | `list[str]` | `[]` | IDs of old facts that were corrected. |
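A `[]` default in these tables cannot be written as a literal in a standard-library dataclass, because mutable defaults are rejected; `field(default_factory=list)` is the usual equivalent. The sketch below assumes the result types are plain dataclasses (the page calls them dataclasses but does not show their definitions), and the class name is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CorrectionResultSketch:
    """Illustrative stand-in mirroring the fields in the table above."""

    corrections_detected: int = 0
    corrected_keys: list[str] = field(default_factory=list)
    facts_corrected_ids: list[str] = field(default_factory=list)


a = CorrectionResultSketch()
b = CorrectionResultSketch()
a.corrected_keys.append("occupation")
# Each instance owns its own list, so b.corrected_keys is still empty.
```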
Read Pipeline Types¶
ExpandedQuery¶
Result of query expansion (entity priming).
| Field | Type | Description |
|---|---|---|
| `primed_entities` | `list[str]` | Entity keys discovered via alias + KG priming. |
| `temporal_range` | `tuple[datetime, datetime]` \| `None` | Resolved date range. |
| `expanded_terms` | `list[str]` | Additional context terms from entity facts. |
PatternQuery¶
A pattern-based query for keyword signal matching.
| Field | Type | Default | Description |
|---|---|---|---|
| `entity_pattern` | `str` | -- | SQL LIKE pattern for entity_key matching. |
| `attribute_filter` | `str` \| `None` | `None` | Optional attribute key filter. |
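To show how an `entity_pattern` and an optional `attribute_filter` might translate into a query, here is a small sketch against an in-memory SQLite database. The `facts` table, its columns, and the `match` helper are hypothetical; the library's own query builder is not shown on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (entity_key TEXT, attribute_key TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("person:ana", "occupation"),
        ("person:clara_rezende", "hobby"),
        ("organization:acme", "industry"),
    ],
)


def match(entity_pattern, attribute_filter=None):
    """Return entity keys matching a LIKE pattern and optional attribute key."""
    sql = "SELECT entity_key FROM facts WHERE entity_key LIKE ?"
    params = [entity_pattern]
    if attribute_filter is not None:
        sql += " AND attribute_key = ?"
        params.append(attribute_filter)
    sql += " ORDER BY entity_key"
    return [row[0] for row in conn.execute(sql, params)]


people = match("person:%")
hobbyists = match("person:%", "hobby")
```

In LIKE patterns `%` matches any run of characters and `_` matches exactly one character, so a literal underscore inside a pattern also acts as a wildcard unless it is escaped.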
RetrievalPlan¶
Output of the retrieval agent LLM planner. See Read Pipeline API for full field reference.
GraphRetrievalResult¶
Result of graph-based BFS 2-hop retrieval.
| Field | Type | Default | Description |
|---|---|---|---|
| `facts` | `list[dict[str, Any]]` | `[]` | Scored fact dicts with source="graph". |
| `neighbor_keys` | `list[str]` | `[]` | Entity keys discovered via BFS. |
| `edges_traversed` | `int` | `0` | Total edges examined during BFS. |
| `edges` | `list[dict[str, Any]]` | `[]` | Deduplicated edge dicts with display names. |
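As a rough illustration of what a 2-hop BFS yields for `neighbor_keys` and `edges_traversed`, the sketch below walks a plain adjacency dict. The graph representation and the `bfs_two_hop` function are assumptions; the actual retrieval code reads relationships from the database and also scores facts, which is omitted here.

```python
from collections import deque


def bfs_two_hop(graph, seeds, max_hops=2):
    """Bounded BFS returning (neighbor_keys, edges_traversed)."""
    visited = set(seeds)
    neighbor_keys = []
    edges_traversed = 0
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        key, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(key, []):
            edges_traversed += 1  # count every edge examined, even to visited nodes
            if neighbor not in visited:
                visited.add(neighbor)
                neighbor_keys.append(neighbor)
                queue.append((neighbor, depth + 1))
    return neighbor_keys, edges_traversed


graph = {
    "person:ana": ["organization:acme"],
    "organization:acme": ["person:bob", "person:ana"],
    "person:bob": ["city:lisbon"],
}
neighbors, edges = bfs_two_hop(graph, ["person:ana"])
```

Here `city:lisbon` is three hops from the seed, so it is never reached.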
SpreadingActivationResult¶
Result of spreading activation expansion.
| Field | Type | Default | Description |
|---|---|---|---|
| `candidates` | `list[RetrievalCandidate]` | `[]` | Expanded candidates from hop 1-2. |
| `meta_observations` | `list[Any]` | `[]` | Relevant meta-observations referencing seed facts. |
| `entities_explored` | `list[str]` | `[]` | Entity keys explored during spreading. |
| `clusters_explored` | `list[str]` | `[]` | Cluster IDs explored during spreading. |
| `hop1_count` | `int` | `0` | Number of facts found in hop 1. |
| `hop2_count` | `int` | `0` | Number of facts found in hop 2. |
| `kg_relationships_explored` | `int` | `0` | Number of KG relationships traversed. |
CompressedContext¶
Result of context compression (tiered hot/warm/cold).
| Field | Type | Default | Description |
|---|---|---|---|
| `context_text` | `str` | `""` | Final prompt-ready context string. |
| `hot_count` | `int` | `0` | Number of facts in hot tier (Tier 1). |
| `warm_count` | `int` | `0` | Number of facts in warm tier (Tier 2). |
| `cold_count` | `int` | `0` | Number of items in cold tier (Tier 3). |
| `total_tokens` | `int` | `0` | Estimated token count of context_text. |
EmotionalTrendsResult¶
Result of emotional trend materialization.
| Field | Type | Default | Description |
|---|---|---|---|
| `emotion_counts` | `dict[str, int]` | `{}` | Mapping of emotion to occurrence count. |
| `trend_direction` | `str` | `"stable"` | "increasing", "decreasing", or "stable". |
| `dominant_emotion` | `str` \| `None` | `None` | Most frequent emotion. |
| `trigger_keywords` | `list[str]` | `[]` | Top keywords from high-intensity events. |
| `avg_intensity` | `float` | `0.0` | Average emotion intensity. |
| `dominant_intensity` | `float` | `0.0` | Average intensity of the dominant emotion. |
| `dominant_energy` | `str` | `"medium"` | Predominant energy level. |
| `events_analyzed` | `int` | `0` | Number of events analyzed. |
| `observation_created` | `bool` | `False` | Whether a meta-observation was created/updated. |
| `observation_id` | `str` \| `None` | `None` | ID of the created/updated observation. |
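Several of these fields are simple aggregates, which a short sketch can make concrete. The event dict shape below is an assumption, and how the library derives `trend_direction` or `trigger_keywords` is not documented here, so those are left out.

```python
from collections import Counter

# Hypothetical tagged events; only "emotion" and "intensity" are used.
events = [
    {"emotion": "joy", "intensity": 0.8},
    {"emotion": "joy", "intensity": 0.6},
    {"emotion": "sadness", "intensity": 0.4},
]

emotion_counts = dict(Counter(e["emotion"] for e in events))
dominant_emotion = (
    max(emotion_counts, key=emotion_counts.get) if emotion_counts else None
)
avg_intensity = (
    sum(e["intensity"] for e in events) / len(events) if events else 0.0
)
# Average intensity restricted to events carrying the dominant emotion.
dominant = [e["intensity"] for e in events if e["emotion"] == dominant_emotion]
dominant_intensity = sum(dominant) / len(dominant) if dominant else 0.0
events_analyzed = len(events)
```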
DirectiveBlock¶
Result of directive generation (procedural memory).
| Field | Type | Default | Description |
|---|---|---|---|
| `text` | `str` | `""` | Cohesive behavioral instructions block. |
| `directive_count` | `int` | `0` | Number of active directives used. |
| `cache_hit` | `bool` | `False` | Whether this was served from cache. |
ContradictionResult¶
Result of contradiction check between directives.
| Field | Type | Default | Description |
|---|---|---|---|
| `has_contradiction` | `bool` | `False` | Whether a contradiction was found. |
| `conflicting_directive` | `str` \| `None` | `None` | Title of the conflicting directive. |
| `resolution` | `str` \| `None` | `None` | Explanation of the resolution. |
Background Job Result Types¶
ClusteringResult¶
Result of fact clustering.
| Field | Type | Default | Description |
|---|---|---|---|
| `clusters_created` | `int` | `0` | Number of new clusters created. |
| `clusters_reinforced` | `int` | `0` | Number of existing clusters updated. |
| `summaries_generated` | `int` | `0` | Number of cluster summaries generated via LLM. |
| `facts_assigned` | `int` | `0` | Number of facts assigned to clusters. |
CommunityDetectionResult¶
Result of cross-entity community detection.
| Field | Type | Default | Description |
|---|---|---|---|
| `communities_created` | `int` | `0` | New community observations created. |
| `communities_reinforced` | `int` | `0` | Existing community observations reinforced. |
| `clusters_in_communities` | `int` | `0` | Total clusters assigned to communities. |
| `skipped` | `bool` | `False` | Whether detection was skipped. |
| `skip_reason` | `str` \| `None` | `None` | Reason for skipping. |
ConsolidationResult¶
Result of L2/L3 consolidation.
| Field | Type | Default | Description |
|---|---|---|---|
| `events_processed` | `int` | `0` | Number of events analyzed. |
| `observations_created` | `int` | `0` | New meta-observations created. |
| `observations_reinforced` | `int` | `0` | Existing observations reinforced. |
| `skipped` | `bool` | `False` | Whether consolidation was skipped. |
| `skip_reason` | `str` \| `None` | `None` | Reason for skipping. |
MemifyResult¶
Result of the memify pipeline (vitality scoring, staleness marking, edge management).
| Field | Type | Default | Description |
|---|---|---|---|
| `facts_scored` | `int` | `0` | Number of facts scored for vitality. |
| `facts_marked_stale` | `int` | `0` | Number of facts marked as stale. |
| `edges_reinforced` | `int` | `0` | Number of KG edges reinforced. |
| `merges_executed` | `int` | `0` | Number of entity merges executed. |
EntityImportanceResult¶
Result of entity importance scoring (sleep-time compute).
| Field | Type | Default | Description |
|---|---|---|---|
| `entities_scored` | `int` | `0` | Number of entities scored. |
| `top_entities` | `list[tuple[str, float]]` | `[]` | Top entities by score (key, score) pairs. |
SummaryRefreshResult¶
Result of entity summary refresh (sleep-time compute).
| Field | Type | Default | Description |
|---|---|---|---|
| `summaries_refreshed` | `int` | `0` | Number of summaries generated. |
| `summaries_skipped` | `int` | `0` | Number of entities skipped. |
Background Functions¶
tag_event_emotion¶
Infer emotion, intensity, and energy from event text via LLM.
| Parameter | Type | Description |
|---|---|---|
| `event_text` | `str` | Text to analyze. |
| `llm` | `LLMProvider` | Injected LLM provider. |
Returns: Dict with emotion, intensity, energy keys, or None on failure.
from arandu.background import tag_event_emotion
result = await tag_event_emotion("I'm so happy today!", llm)
# {"emotion": "joy", "intensity": 0.85, "energy": "high"}
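Because the function returns `None` on failure, callers need a fallback path. The helper below is a hypothetical sketch of one way to turn the result into values for the `emotion_primary`, `emotion_intensity`, and `energy_level` columns of `MemoryEvent`; the key-to-column mapping and the clamping of intensity to the documented 0-1 range are assumptions, not confirmed library behavior.

```python
from typing import Any, Optional


def emotion_columns(result: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Map a tag_event_emotion()-style result to MemoryEvent-style columns."""
    if result is None:
        # Tagging failed: leave the nullable emotion columns empty.
        return {
            "emotion_primary": None,
            "emotion_intensity": None,
            "energy_level": None,
        }
    return {
        "emotion_primary": result["emotion"],
        # Clamp into [0, 1] in case the LLM returns an out-of-range value.
        "emotion_intensity": min(1.0, max(0.0, float(result["intensity"]))),
        "energy_level": result["energy"],
    }
```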
Database Models¶
The SQLAlchemy models below define the persistence layer. They live in arandu.models and are useful for advanced queries executed directly against the database.
erDiagram
MemoryEvent ||--o{ MemoryFact : "source_event_id"
MemoryFact ||--o{ MemoryFactEntityLink : "fact_id"
MemoryFact ||--o| MemoryCluster : "cluster_id"
MemoryFact ||--o| MemoryFact : "supersedes_fact_id"
MemoryEntity ||--o{ MemoryEntityAlias : "canonical_entity_key"
MemoryEntity ||--o{ MemoryFactEntityLink : "entity_key"
MemoryEntity ||--o{ MemoryEntityRelationship : "source/target"
MemoryEntityRelationship ||--o| MemoryFact : "evidence_fact_id"
MemoryFact }o--o| MemoryAttributeRegistry : "attribute_key"
MemoryIntention }o--|| MemoryEvent : "agent_id"
SessionObservation }o--|| MemoryEvent : "agent_id"
MemoryEvent {
UUID id PK
Text agent_id
DateTime occurred_at
Text text
Vector embedding_vec
}
MemoryFact {
UUID id PK
Text agent_id
String entity_key
Text fact_text
Float confidence
DateTime valid_from
DateTime valid_to
Vector embedding_vec
}
MemoryEntity {
UUID id PK
Text agent_id
String canonical_key UK
String display_name
String entity_type
Text summary_text
Float importance_score
}
MemoryFactEntityLink {
UUID id PK
UUID fact_id FK
String entity_key
Boolean is_primary
Text agent_id
}
MemoryEntityRelationship {
UUID id PK
Text agent_id
String source_entity_key
String target_entity_key
String rel_type
Float strength
UUID evidence_fact_id FK
}
MemoryEntityAlias {
UUID id PK
Text agent_id
String alias UK
String canonical_entity_key
}
MemoryCluster {
UUID id PK
Text agent_id
String label
Text summary_text
Vector embedding_vec
}
MemoryMetaObservation {
UUID id PK
Text agent_id
String observation_type
Text text
Float confidence
}
MemoryAttributeRegistry {
UUID id PK
String key UK
String status
String value_type
Integer seen_count
}
MemoryIntention {
UUID id PK
Text agent_id
String trigger_type
Text trigger_condition
Text intended_action
String status
}
SessionObservation {
UUID id PK
Text agent_id
Text content
String topic
Boolean is_active
}
MemoryFact¶
Versioned fact ledger - stores structured facts with validity windows.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `entity_type` | `String` | Entity type (e.g. "person"). |
| `entity_key` | `String` | Canonical entity key (e.g. "person:ana"). |
| `entity_name` | `String?` | Human-readable entity name. |
| `attribute_key` | `String?` | Attribute key (e.g. "occupation"). |
| `fact_text` | `Text` | Natural-language fact sentence. |
| `category` | `String(50)?` | Fact category. |
| `confidence` | `Float` | Confidence score (default 0.8). |
| `importance` | `Float` | Importance score (default 0.5). |
| `is_sensitive` | `Boolean` | Whether the fact contains sensitive data. |
| `valid_from` | `DateTime` | Start of the validity window. |
| `valid_to` | `DateTime?` | End of validity (NULL = currently active). |
| `ttl_days` | `Integer?` | Optional time-to-live in days. |
| `source_event_id` | `UUID?` | FK to MemoryEvent. |
| `supersedes_fact_id` | `UUID?` | ID of the fact this one replaces. |
| `embedding_vec` | `Vector(1536)` | pgvector embedding for semantic search. |
| `vitality_score` | `Float?` | Sleep-time vitality score. |
| `is_stale` | `Boolean` | Whether marked stale by memify. |
| `cluster_id` | `UUID?` | FK to MemoryCluster. |
| `value_json` | `JSONB?` | Structured value (JSON) for the attribute. |
| `needs_confirmation` | `Boolean` | Whether the fact requires user confirmation. |
| `last_confirmed_at` | `DateTime?` | When the fact was last confirmed by the user. |
| `times_retrieved` | `Integer` | How many times this fact was retrieved. |
| `last_retrieved_at` | `DateTime?` | When the fact was last retrieved. |
| `user_correction_count` | `Integer` | Number of times the user corrected this fact. |
| `source_context` | `String(512)?` | Source context of the original input. |
| `agent_annotation` | `Text?` | Free-text annotation added by the agent. |
| `embedding` | `JSONB?` | Raw embedding as JSON (non-pgvector fallback). |
| `search_vector` | `TSVECTOR` | Full-text search index column. |
| `created_at` | `DateTime` | Row creation timestamp. |
| `ingested_at` | `DateTime` | Bi-temporal ingestion timestamp. |
| `invalidated_at` | `DateTime?` | When the fact was invalidated (bi-temporal). |
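Since `valid_to` is NULL for currently active facts, a direct query for the live version of an attribute can filter on that column. The sketch below uses an in-memory SQLite table with a handful of the columns listed above; the table name `memory_facts` and the sample rows are hypothetical, and a real deployment would query the PostgreSQL tables behind `arandu.models`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memory_facts ("
    "id TEXT PRIMARY KEY, entity_key TEXT, attribute_key TEXT, "
    "fact_text TEXT, valid_to TEXT, supersedes_fact_id TEXT)"
)
conn.executemany(
    "INSERT INTO memory_facts VALUES (?, ?, ?, ?, ?, ?)",
    [
        # The older fact has a closed validity window.
        ("f1", "person:ana", "occupation", "Ana is a nurse.", "2024-03-01", None),
        # The newer fact is still active and points back at the one it replaces.
        ("f2", "person:ana", "occupation", "Ana is a doctor.", None, "f1"),
    ],
)

active = conn.execute(
    "SELECT id, fact_text FROM memory_facts "
    "WHERE entity_key = ? AND valid_to IS NULL",
    ("person:ana",),
).fetchall()
```

Following `supersedes_fact_id` from the active row recovers the history of earlier versions.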
MemoryEntity¶
First-class entity node in the knowledge graph.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `canonical_key` | `String(128)` | Unique canonical key (e.g. "person:ana"). |
| `display_name` | `String(256)?` | Human-readable display name. |
| `entity_type` | `String(32)` | Type ("person", "organization", etc.). |
| `summary_text` | `Text?` | LLM-generated entity summary. |
| `embedding_vec` | `Vector(1536)` | Entity embedding. |
| `fact_count` | `Integer` | Number of linked facts. |
| `importance_score` | `Float?` | Sleep-time importance score. |
| `is_active` | `Boolean` | Whether the entity is active. |
| `first_seen_at` | `DateTime` | When the entity was first observed. |
| `last_seen_at` | `DateTime` | When the entity was last observed. |
| `summary_refreshed_at` | `DateTime?` | When the entity summary was last refreshed. |
| `created_at` | `DateTime` | Row creation timestamp. |

Unique constraint: (agent_id, canonical_key).
MemoryEntityAlias¶
Maps alias names to canonical entity keys for entity resolution.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `alias` | `String` | Alias text (e.g. "Ana"). |
| `canonical_entity_key` | `String` | Target canonical key. |
| `canonical_entity_type` | `String` | Target entity type. |
| `created_at` | `DateTime` | Row creation timestamp. |

Unique constraint: (agent_id, alias).
MemoryEntityRelationship¶
Directed edge between two entities in the knowledge graph.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `source_entity_key` | `String(128)` | Source entity canonical key. |
| `target_entity_key` | `String(128)` | Target entity canonical key. |
| `rel_type` | `String(64)` | Relationship type (e.g. "works_at", "mentored_by"). |
| `strength` | `Float` | Edge strength (default 0.8). |
| `evidence_fact_id` | `UUID?` | FK to the fact that evidences this edge. |
| `provenance` | `String(16)` | How the edge was created ("rule", "llm"). |
| `valid_from` | `DateTime` | Start of validity. |
| `valid_to` | `DateTime?` | End of validity (NULL = active). |
| `created_at` | `DateTime` | Row creation timestamp. |
| `updated_at` | `DateTime` | Last update timestamp. |
| `last_used_at` | `DateTime?` | When the relationship was last used in retrieval. |
| `invalidated_at` | `DateTime?` | When the relationship was invalidated. |

Unique constraint: (agent_id, source_entity_key, target_entity_key, rel_type).
Dynamic Relationship Types¶
The rel_type field accepts any short, descriptive snake_case string - it is not restricted to a fixed set. The extraction pipeline instructs the LLM to choose the most descriptive type for each relationship.
Common types (used as examples in the extraction prompt, not as restrictions):
works_at, manages, reports_to, family_of, friend_of, partner_of, owns, lives_in, member_of, studies_at, works_with
The LLM may also produce types like mentored_by, inspired_by, competed_with, or any other descriptive type.
Normalization: All relationship types are normalized via normalize_rel_type() before persistence:
- Lowercase + underscores (e.g. "Mentored By" → "mentored_by")
- Known aliases are mapped to common types (e.g. "boss" → "reports_to", "spouse" → "partner_of")
- Unknown types pass through after sanitization
The CANONICAL_REL_TYPES set in arandu.constants is available as a reference for consumers who want to filter by known types, but it is not used as a validation filter.
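The normalization rules described above can be approximated in a few lines. This is a hypothetical re-implementation for illustration, not the library's `normalize_rel_type()`: the function name, the sanitization regex, and the alias table are assumptions, with only the two alias mappings taken from the examples in this section.

```python
import re

# Only these two mappings come from the documentation; the real alias
# table is presumably larger.
_ALIASES = {"boss": "reports_to", "spouse": "partner_of"}


def normalize_rel_type_sketch(raw: str) -> str:
    """Lowercase, collapse non-alphanumerics to underscores, then map aliases."""
    cleaned = re.sub(r"[^a-z0-9]+", "_", raw.strip().lower()).strip("_")
    # Unknown types pass through unchanged after sanitization.
    return _ALIASES.get(cleaned, cleaned)
```

A consumer that wants only known types could check the normalized value against a set such as `CANONICAL_REL_TYPES` after this step.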
See Evidence Linkage & Cascade Invalidation for how relationships are linked to supporting facts and automatically cleaned up when facts change.
MemoryEvent¶
Immutable event log - stores all user messages with embeddings.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `occurred_at` | `DateTime` | When the event happened. |
| `text` | `Text` | Event text content. |
| `source` | `String` | Origin (default "api"). |
| `importance` | `Float` | Importance score (default 0.5). |
| `embedding_vec` | `Vector(1536)` | Event embedding for retrieval. |
| `embedding` | `JSONB?` | Raw embedding as JSON (non-pgvector fallback). |
| `trace_json` | `JSONB?` | Trace/debug metadata for the event. |
| `created_at` | `DateTime` | Row creation timestamp. |
| `emotion_primary` | `String(32)?` | Primary emotion label. |
| `emotion_intensity` | `Float?` | Emotion intensity (0-1). |
| `energy_level` | `String(16)?` | Energy level ("low", "medium", "high"). |
MemoryCluster¶
Semantic cluster grouping related facts for richer context.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `label` | `String(128)` | Cluster label. |
| `summary_text` | `Text?` | LLM-generated cluster summary. |
| `cluster_type` | `String(32)` | Cluster type (default "auto"). |
| `fact_count` | `Integer` | Number of facts in the cluster. |
| `importance` | `Float` | Cluster importance (default 0.5). |
| `embedding_vec` | `Vector(1536)` | Cluster embedding. |
| `is_active` | `Boolean` | Whether the cluster is active. |
| `last_updated_at` | `DateTime` | When the cluster was last updated. |
| `created_at` | `DateTime` | Row creation timestamp. |
MemoryMetaObservation¶
Meta-observations derived from consolidation - patterns, insights, trends.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `observation_type` | `String(32)` | Type ("pattern", "trend", "community", etc.). |
| `title` | `String(256)` | Short title. |
| `text` | `Text` | Full observation text. |
| `supporting_event_ids` | `JSONB` | List of supporting event UUIDs. |
| `supporting_fact_ids` | `JSONB` | List of supporting fact UUIDs. |
| `confidence` | `Float` | Confidence (default 0.7). |
| `importance` | `Float` | Importance (default 0.5). |
| `times_reinforced` | `Integer` | How many times this observation was reinforced. |
| `is_active` | `Boolean` | Whether the observation is active. |
| `embedding_vec` | `Vector(1536)` | Observation embedding. |
| `first_detected_at` | `DateTime` | When the observation was first detected. |
| `last_reinforced_at` | `DateTime` | When the observation was last reinforced. |
| `created_at` | `DateTime` | Row creation timestamp. |
MemoryFactEntityLink¶
Cross-reference table linking each fact to ALL entities it mentions - enables cross-entity retrieval without fact duplication.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `fact_id` | `UUID` | FK to MemoryFact. |
| `entity_key` | `String` | Entity canonical key (e.g. "person:clara_rezende"). |
| `is_primary` | `Boolean` | Whether this entity is the fact's primary subject. |
| `agent_id` | `Text` | Owner agent ID. |

Unique constraint: (fact_id, entity_key).
Indexes: (agent_id, entity_key) for retrieval, (fact_id) for cascade deletes.
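To see why a link table avoids duplicating facts, consider the small in-memory sketch below. The tuples stand in for `MemoryFactEntityLink` rows and the IDs are made up; one fact that mentions two entities is stored once and reachable from both.

```python
from collections import defaultdict

# Hypothetical link rows: (fact_id, entity_key, is_primary).
links = [
    ("f1", "person:ana", True),
    ("f1", "organization:acme", False),
    ("f2", "person:bob", True),
]

facts_by_entity = defaultdict(set)
primary_entity = {}
for fact_id, entity_key, is_primary in links:
    # A set mirrors the (fact_id, entity_key) unique constraint.
    facts_by_entity[entity_key].add(fact_id)
    if is_primary:
        primary_entity[fact_id] = entity_key

# Fact f1 is found from the secondary entity without being copied.
acme_facts = facts_by_entity["organization:acme"]
```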
MemoryAttributeRegistry¶
Registry for managing attribute keys - tracks proposed vs active keys.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `key` | `String(64)` | Unique attribute key. |
| `status` | `String(20)` | "proposed" or "active". |
| `value_type` | `String(20)` | Expected value type (default "string"). |
| `conflict_policy` | `String(20)` | How to handle conflicts (default "supersede"). |
| `ttl_days` | `Integer?` | Optional default TTL for facts with this key. |
| `seen_count` | `Integer` | How many times this key has been seen. |
| `proposed_by` | `String(20)` | Who proposed the key ("llm", "user"). |
| `reason` | `Text?` | Why the key was proposed. |
| `first_seen_at` | `DateTime` | When the attribute key was first seen. |
| `last_seen_at` | `DateTime` | When the attribute key was last seen. |
| `example_raw_key` | `String(128)?` | Example of the raw key before normalization. |
| `created_at` | `DateTime` | Row creation timestamp. |
| `updated_at` | `DateTime?` | Last update timestamp. |
MemoryIntention¶
Prospective memory -- future intentions with time-based or event-based triggers.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `trigger_type` | `String(16)` | Trigger type ("time", "event", etc.). |
| `trigger_condition` | `Text` | Natural-language description of the trigger condition. |
| `intended_action` | `Text` | What the agent should do when triggered. |
| `due_date` | `DateTime?` | Optional due date for time-based triggers. |
| `status` | `String(16)` | Status ("pending", "triggered", "fulfilled", "expired"). Default "pending". |
| `trigger_embedding_vec` | `Vector(1536)` | Embedding of the trigger condition for semantic matching. |
| `source_context` | `String(32)?` | Source context identifier. |
| `outcome_note` | `Text?` | Note about the outcome after fulfillment. |
| `created_at` | `DateTime` | Row creation timestamp. |
| `triggered_at` | `DateTime?` | When the intention was triggered. |
| `fulfilled_at` | `DateTime?` | When the intention was fulfilled. |
SessionObservation¶
Persistent session observations created by the LLM-driven Observer -- captures in-session context that persists across turns.
| Column | Type | Description |
|---|---|---|
| `id` | `UUID` | Primary key. |
| `agent_id` | `Text` | Owner agent ID (any string: UUID, slug, numeric, etc.). |
| `content` | `Text` | Observation content text. |
| `topic` | `String(64)?` | Topic tag for the observation. |
| `entities_mentioned` | `JSONB` | List of entity keys mentioned in the observation. |
| `created_at` | `DateTime` | Row creation timestamp. |
| `referenced_at` | `DateTime?` | When the observation was last referenced. |
| `relative_offset` | `String(64)?` | Relative time offset descriptor (e.g. "2 messages ago"). |
| `source_message_ids` | `JSONB` | List of source message IDs that originated this observation. |
| `is_active` | `Boolean` | Whether the observation is active. |
| `merged_into_id` | `UUID?` | ID of the observation this one was merged into. |
| `emotion_label` | `String(32)?` | Detected emotion label for the session context. |
| `embedding_vec` | `Vector(1536)` | Observation embedding for semantic retrieval. |