Data Types Reference

This page documents the dataclasses, enums, and result types used across the write pipeline, read pipeline, and background jobs that are not covered in the main API Reference. It also covers the tag_event_emotion background function and the SQLAlchemy database models.


Write Pipeline Types

InputType

class InputType(str, Enum)

Input text classification types, determined by heuristics in classify_input().

Value Description
SHORT Less than 500 characters.
MEDIUM 500-2000 characters, unstructured.
LONG More than 2000 characters, unstructured.
STRUCTURED More than 500 characters with headers, bullets, or tables.
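
The thresholds above can be illustrated with a minimal sketch. This is not the library's classify_input(): the function name sketch_classify, the lowercase enum values, and the regular expression used to detect headers, bullets, and table rows are all assumptions made for illustration. The table does not say how structured text of exactly 500 characters is classified, so the sketch treats it as MEDIUM.

```python
import re
from enum import Enum


class InputType(str, Enum):
    # The string values are assumed; the reference only lists the member names.
    SHORT = "short"
    MEDIUM = "medium"
    LONG = "long"
    STRUCTURED = "structured"


# Hypothetical structure detector: markdown-style headers, bullets, or table rows.
_STRUCTURE_RE = re.compile(r"^\s*(#{1,6}\s|[-*•]\s|\|.*\|)", re.MULTILINE)


def sketch_classify(text: str) -> InputType:
    """Apply the length thresholds from the table above."""
    n = len(text)
    if n < 500:
        return InputType.SHORT
    if n > 500 and _STRUCTURE_RE.search(text):
        return InputType.STRUCTURED
    if n <= 2000:
        return InputType.MEDIUM
    return InputType.LONG


short_kind = sketch_classify("I moved to Lisbon last week.")
structured_kind = sketch_classify("# Notes\n" + "- item\n" * 100)
```

Because InputType subclasses str, members compare equal to their string values, which is convenient when the classification is serialized or logged.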

ExtractionMode

class ExtractionMode(str, Enum)

Value Description
SINGLE_SHOT Single LLM call for extraction.
CHUNKED Input is split into chunks, each processed separately.

InputClassification

@dataclass
class InputClassification

Result of classify_input(). See Write Pipeline API for full field reference.

ExtractionStrategy

@dataclass
class ExtractionStrategy

Result of select_strategy(). See Write Pipeline API for full field reference.

CorrectionResult

@dataclass
class CorrectionResult

Result of correction detection.

Field Type Default Description
corrections_detected int 0 Number of corrections found.
corrected_keys list[str] [] Attribute keys that were corrected.
facts_corrected_ids list[str] [] IDs of old facts that were corrected.
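
The table lists [] as the default for the list fields. A Python dataclass rejects a bare list literal as a field default, so a definition matching this table would presumably use default_factory. The sketch below is reconstructed from the table only and is not copied from the library source.

```python
from dataclasses import dataclass, field


@dataclass
class CorrectionResult:
    # Reconstructed from the field table; the real class may differ.
    corrections_detected: int = 0
    corrected_keys: list[str] = field(default_factory=list)
    facts_corrected_ids: list[str] = field(default_factory=list)


first = CorrectionResult()
second = CorrectionResult()
first.corrected_keys.append("occupation")
# Each instance gets its own list, so `second` is unaffected.
```

With default_factory, mutating the list on one result never leaks into another instance. The same pattern applies to the other list and dict defaults on this page.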

Read Pipeline Types

ExpandedQuery

@dataclass
class ExpandedQuery

Result of query expansion (entity priming).

Field Type Description
primed_entities list[str] Entity keys discovered via alias + KG priming.
temporal_range tuple[datetime, datetime] | None Resolved date range.
expanded_terms list[str] Additional context terms from entity facts.

PatternQuery

@dataclass
class PatternQuery

A pattern-based query for keyword signal matching.

Field Type Default Description
entity_pattern str -- SQL LIKE pattern for entity_key matching.
attribute_filter str | None None Optional attribute key filter.
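
As an illustration of how a SQL LIKE pattern selects entity keys, the sketch below runs a PatternQuery-shaped object against an in-memory SQLite table. The table name facts, the helper run_pattern, and the exact-match treatment of attribute_filter are assumptions; the library's own query builder is not shown in this reference.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatternQuery:
    entity_pattern: str
    attribute_filter: Optional[str] = None


def run_pattern(conn: sqlite3.Connection, query: PatternQuery) -> list:
    """Hypothetical executor: LIKE on entity_key, optional equality on attribute_key."""
    sql = "SELECT entity_key, attribute_key FROM facts WHERE entity_key LIKE ?"
    params = [query.entity_pattern]
    if query.attribute_filter is not None:
        sql += " AND attribute_key = ?"
        params.append(query.attribute_filter)
    sql += " ORDER BY entity_key, attribute_key"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (entity_key TEXT, attribute_key TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("person:ana", "occupation"),
        ("person:ana", "city"),
        ("organization:acme", "industry"),
    ],
)

all_people = run_pattern(conn, PatternQuery("person:%"))
occupations = run_pattern(conn, PatternQuery("person:%", "occupation"))
```

The % wildcard matches any suffix, so "person:%" selects every key in the person namespace while leaving organization keys out.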

RetrievalPlan

@dataclass
class RetrievalPlan

Output of the retrieval agent LLM planner. See Read Pipeline API for full field reference.

GraphRetrievalResult

@dataclass
class GraphRetrievalResult

Result of graph-based BFS 2-hop retrieval.

Field Type Default Description
facts list[dict[str, Any]] [] Scored fact dicts with source="graph".
neighbor_keys list[str] [] Entity keys discovered via BFS.
edges_traversed int 0 Total edges examined during BFS.
edges list[dict[str, Any]] [] Deduplicated edge dicts with display names.
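
To show how neighbor_keys and edges_traversed relate, here is a minimal breadth-first traversal limited to two hops. The adjacency-dict input and the function name bfs_two_hop are illustrative assumptions; the library's retrieval code also scores facts and deduplicates edges, which this sketch omits.

```python
from collections import deque


def bfs_two_hop(adjacency, seeds, max_hops=2):
    """Return (neighbor_keys, edges_traversed) for nodes within max_hops of the seeds."""
    visited = set(seeds)
    neighbor_keys = []
    edges_traversed = 0
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        key, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(key, []):
            edges_traversed += 1  # every examined edge counts, even to visited nodes
            if nxt not in visited:
                visited.add(nxt)
                neighbor_keys.append(nxt)
                queue.append((nxt, depth + 1))
    return neighbor_keys, edges_traversed


graph = {
    "person:ana": ["organization:acme", "person:bruno"],
    "organization:acme": ["city:lisbon"],
    "city:lisbon": ["country:portugal"],
}
neighbors, edges = bfs_two_hop(graph, ["person:ana"])
```

In this example "country:portugal" is three hops from the seed, so it is not returned, and the edge leading to it is never examined.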

SpreadingActivationResult

@dataclass
class SpreadingActivationResult

Result of spreading activation expansion.

Field Type Default Description
candidates list[RetrievalCandidate] [] Expanded candidates from hop 1-2.
meta_observations list[Any] [] Relevant meta-observations referencing seed facts.
entities_explored list[str] [] Entity keys explored during spreading.
clusters_explored list[str] [] Cluster IDs explored during spreading.
hop1_count int 0 Number of facts found in hop 1.
hop2_count int 0 Number of facts found in hop 2.
kg_relationships_explored int 0 Number of KG relationships traversed.

CompressedContext

@dataclass
class CompressedContext

Result of context compression (tiered hot/warm/cold).

Field Type Default Description
context_text str "" Final prompt-ready context string.
hot_count int 0 Number of facts in hot tier (Tier 1).
warm_count int 0 Number of facts in warm tier (Tier 2).
cold_count int 0 Number of items in cold tier (Tier 3).
total_tokens int 0 Estimated token count of context_text.

EmotionalTrendsResult

@dataclass
class EmotionalTrendsResult

Result of emotional trend materialization.

Field Type Default Description
emotion_counts dict[str, int] {} Mapping of emotion to occurrence count.
trend_direction str "stable" One of "increasing", "decreasing", or "stable".
dominant_emotion str | None None Most frequent emotion.
trigger_keywords list[str] [] Top keywords from high-intensity events.
avg_intensity float 0.0 Average emotion intensity.
dominant_intensity float 0.0 Average intensity of the dominant emotion.
dominant_energy str "medium" Predominant energy level.
events_analyzed int 0 Number of events analyzed.
observation_created bool False Whether a meta-observation was created/updated.
observation_id str | None None ID of the created/updated observation.
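
The counting and averaging fields above can be derived from per-event emotion tags roughly as follows. This is a sketch under assumptions: the function name summarize_emotions and the input shape (dicts with emotion and intensity keys) are hypothetical, and trend_direction, trigger_keywords, and dominant_energy are left out because this page does not specify how they are computed.

```python
from collections import Counter
from typing import Any


def summarize_emotions(events: list[dict[str, Any]]) -> dict[str, Any]:
    """Aggregate emotion tags into a few EmotionalTrendsResult-style fields."""
    if not events:
        # Mirror the documented defaults when there is nothing to analyze.
        return {
            "emotion_counts": {},
            "dominant_emotion": None,
            "avg_intensity": 0.0,
            "dominant_intensity": 0.0,
            "events_analyzed": 0,
        }
    counts = Counter(event["emotion"] for event in events)
    dominant = counts.most_common(1)[0][0]
    dominant_values = [e["intensity"] for e in events if e["emotion"] == dominant]
    return {
        "emotion_counts": dict(counts),
        "dominant_emotion": dominant,
        "avg_intensity": sum(e["intensity"] for e in events) / len(events),
        "dominant_intensity": sum(dominant_values) / len(dominant_values),
        "events_analyzed": len(events),
    }


summary = summarize_emotions(
    [
        {"emotion": "joy", "intensity": 1.0},
        {"emotion": "joy", "intensity": 0.5},
        {"emotion": "joy", "intensity": 0.75},
        {"emotion": "sadness", "intensity": 0.25},
    ]
)
```

Note that avg_intensity is taken over all events, while dominant_intensity only averages the events tagged with the dominant emotion.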

DirectiveBlock

@dataclass
class DirectiveBlock

Result of directive generation (procedural memory).

Field Type Default Description
text str "" Cohesive behavioral instructions block.
directive_count int 0 Number of active directives used.
cache_hit bool False Whether this was served from cache.

ContradictionResult

@dataclass
class ContradictionResult

Result of contradiction check between directives.

Field Type Default Description
has_contradiction bool False Whether a contradiction was found.
conflicting_directive str | None None Title of the conflicting directive.
resolution str | None None Explanation of the resolution.

Background Job Result Types

ClusteringResult

@dataclass
class ClusteringResult

Result of fact clustering.

Field Type Default Description
clusters_created int 0 Number of new clusters created.
clusters_reinforced int 0 Number of existing clusters updated.
summaries_generated int 0 Number of cluster summaries generated via LLM.
facts_assigned int 0 Number of facts assigned to clusters.

CommunityDetectionResult

@dataclass
class CommunityDetectionResult

Result of cross-entity community detection.

Field Type Default Description
communities_created int 0 New community observations created.
communities_reinforced int 0 Existing community observations reinforced.
clusters_in_communities int 0 Total clusters assigned to communities.
skipped bool False Whether detection was skipped.
skip_reason str | None None Reason for skipping.

ConsolidationResult

@dataclass
class ConsolidationResult

Result of L2/L3 consolidation.

Field Type Default Description
events_processed int 0 Number of events analyzed.
observations_created int 0 New meta-observations created.
observations_reinforced int 0 Existing observations reinforced.
skipped bool False Whether consolidation was skipped.
skip_reason str | None None Reason for skipping.

MemifyResult

@dataclass
class MemifyResult

Result of the memify pipeline (vitality scoring, staleness marking, edge management).

Field Type Default Description
facts_scored int 0 Number of facts scored for vitality.
facts_marked_stale int 0 Number of facts marked as stale.
edges_reinforced int 0 Number of KG edges reinforced.
merges_executed int 0 Number of entity merges executed.

EntityImportanceResult

@dataclass
class EntityImportanceResult

Result of entity importance scoring (sleep-time compute).

Field Type Default Description
entities_scored int 0 Number of entities scored.
top_entities list[tuple[str, float]] [] Top entities by score (key, score) pairs.

SummaryRefreshResult

@dataclass
class SummaryRefreshResult

Result of entity summary refresh (sleep-time compute).

Field Type Default Description
summaries_refreshed int 0 Number of summaries generated.
summaries_skipped int 0 Number of entities skipped.

Background Functions

tag_event_emotion

Infer emotion, intensity, and energy from event text via LLM.

async def tag_event_emotion(
    event_text: str,
    llm: LLMProvider,
) -> dict[str, Any] | None

Parameter Type Description
event_text str Text to analyze.
llm LLMProvider Injected LLM provider.

Returns: Dict with emotion, intensity, energy keys, or None on failure.

from arandu.background import tag_event_emotion

result = await tag_event_emotion("I'm so happy today!", llm)
# {"emotion": "joy", "intensity": 0.85, "energy": "high"}
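
Because the function returns None on failure, callers need to handle both outcomes. The helper below is a hypothetical sketch of one way to do that: it maps the result onto the nullable emotion_primary, emotion_intensity, and energy_level columns of MemoryEvent. The mapping itself is an assumption and is not taken from the library.

```python
from typing import Any, Optional


def to_event_columns(result: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Hypothetical helper: convert a tag_event_emotion() result to column values."""
    if result is None:
        # Tagging failed; leave all three nullable columns empty.
        return {
            "emotion_primary": None,
            "emotion_intensity": None,
            "energy_level": None,
        }
    return {
        "emotion_primary": result["emotion"],
        "emotion_intensity": result["intensity"],
        "energy_level": result["energy"],
    }


tagged = to_event_columns({"emotion": "joy", "intensity": 0.85, "energy": "high"})
untagged = to_event_columns(None)
```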

Database Models

The SQLAlchemy models below define the persistence layer. They live in arandu.models and are useful for advanced queries executed directly against the database.

erDiagram
    MemoryEvent ||--o{ MemoryFact : "source_event_id"
    MemoryFact ||--o{ MemoryFactEntityLink : "fact_id"
    MemoryFact ||--o| MemoryCluster : "cluster_id"
    MemoryFact ||--o| MemoryFact : "supersedes_fact_id"
    MemoryEntity ||--o{ MemoryEntityAlias : "canonical_entity_key"
    MemoryEntity ||--o{ MemoryFactEntityLink : "entity_key"
    MemoryEntity ||--o{ MemoryEntityRelationship : "source/target"
    MemoryEntityRelationship ||--o| MemoryFact : "evidence_fact_id"
    MemoryFact }o--o| MemoryAttributeRegistry : "attribute_key"
    MemoryIntention }o--|| MemoryEvent : "agent_id"
    SessionObservation }o--|| MemoryEvent : "agent_id"

    MemoryEvent {
        UUID id PK
        Text agent_id
        DateTime occurred_at
        Text text
        Vector embedding_vec
    }
    MemoryFact {
        UUID id PK
        Text agent_id
        String entity_key
        Text fact_text
        Float confidence
        DateTime valid_from
        DateTime valid_to
        Vector embedding_vec
    }
    MemoryEntity {
        UUID id PK
        Text agent_id
        String canonical_key UK
        String display_name
        String entity_type
        Text summary_text
        Float importance_score
    }
    MemoryFactEntityLink {
        UUID id PK
        UUID fact_id FK
        String entity_key
        Boolean is_primary
        Text agent_id
    }
    MemoryEntityRelationship {
        UUID id PK
        Text agent_id
        String source_entity_key
        String target_entity_key
        String rel_type
        Float strength
        UUID evidence_fact_id FK
    }
    MemoryEntityAlias {
        UUID id PK
        Text agent_id
        String alias UK
        String canonical_entity_key
    }
    MemoryCluster {
        UUID id PK
        Text agent_id
        String label
        Text summary_text
        Vector embedding_vec
    }
    MemoryMetaObservation {
        UUID id PK
        Text agent_id
        String observation_type
        Text text
        Float confidence
    }
    MemoryAttributeRegistry {
        UUID id PK
        String key UK
        String status
        String value_type
        Integer seen_count
    }
    MemoryIntention {
        UUID id PK
        Text agent_id
        String trigger_type
        Text trigger_condition
        Text intended_action
        String status
    }
    SessionObservation {
        UUID id PK
        Text agent_id
        Text content
        String topic
        Boolean is_active
    }

MemoryFact

Versioned fact ledger - stores structured facts with validity windows.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
entity_type String Entity type (e.g. "person").
entity_key String Canonical entity key (e.g. "person:ana").
entity_name String? Human-readable entity name.
attribute_key String? Attribute key (e.g. "occupation").
fact_text Text Natural-language fact sentence.
category String(50)? Fact category.
confidence Float Confidence score (default 0.8).
importance Float Importance score (default 0.5).
is_sensitive Boolean Whether the fact contains sensitive data.
valid_from DateTime Start of the validity window.
valid_to DateTime? End of validity (NULL = currently active).
ttl_days Integer? Optional time-to-live in days.
source_event_id UUID? FK to MemoryEvent.
supersedes_fact_id UUID? ID of the fact this one replaces.
embedding_vec Vector(1536) pgvector embedding for semantic search.
vitality_score Float? Sleep-time vitality score.
is_stale Boolean Whether marked stale by memify.
cluster_id UUID? FK to MemoryCluster.
value_json JSONB? Structured value (JSON) for the attribute.
needs_confirmation Boolean Whether the fact requires user confirmation.
last_confirmed_at DateTime? When the fact was last confirmed by the user.
times_retrieved Integer How many times this fact was retrieved.
last_retrieved_at DateTime? When the fact was last retrieved.
user_correction_count Integer Number of times the user corrected this fact.
source_context String(512)? Source context of the original input.
agent_annotation Text? Free-text annotation added by the agent.
embedding JSONB? Raw embedding as JSON (non-pgvector fallback).
search_vector TSVECTOR Full-text search index column.
created_at DateTime Row creation timestamp.
ingested_at DateTime Bi-temporal ingestion timestamp.
invalidated_at DateTime? When the fact was invalidated (bi-temporal).
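
The validity window means that a fact's current version is the row whose valid_to is NULL. The sketch below demonstrates that rule with an in-memory SQLite table instead of the real PostgreSQL schema. The table name memory_fact, the reduced column set, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE memory_fact (
        id TEXT PRIMARY KEY,
        agent_id TEXT NOT NULL,
        entity_key TEXT NOT NULL,
        attribute_key TEXT,
        fact_text TEXT NOT NULL,
        valid_from TEXT NOT NULL,
        valid_to TEXT,              -- NULL means currently active
        supersedes_fact_id TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO memory_fact VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        # Older version: its validity window has been closed.
        ("f1", "agent-1", "person:ana", "occupation", "Ana is a nurse.",
         "2023-01-01", "2024-06-01", None),
        # Newer version: still open, and it points back at the fact it replaces.
        ("f2", "agent-1", "person:ana", "occupation", "Ana is a doctor.",
         "2024-06-01", None, "f1"),
    ],
)


def active_facts(conn, agent_id, entity_key):
    """Return currently valid facts for one entity of one agent."""
    return conn.execute(
        "SELECT id, fact_text, supersedes_fact_id FROM memory_fact "
        "WHERE agent_id = ? AND entity_key = ? AND valid_to IS NULL "
        "ORDER BY valid_from",
        (agent_id, entity_key),
    ).fetchall()


current = active_facts(conn, "agent-1", "person:ana")
```

The superseded row stays in the ledger, so earlier versions remain queryable by following supersedes_fact_id.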

MemoryEntity

First-class entity node in the knowledge graph.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
canonical_key String(128) Unique canonical key (e.g. "person:ana").
display_name String(256)? Human-readable display name.
entity_type String(32) Type ("person", "organization", etc.).
summary_text Text? LLM-generated entity summary.
embedding_vec Vector(1536) Entity embedding.
fact_count Integer Number of linked facts.
importance_score Float? Sleep-time importance score.
is_active Boolean Whether the entity is active.
first_seen_at DateTime When the entity was first observed.
last_seen_at DateTime When the entity was last observed.
summary_refreshed_at DateTime? When the entity summary was last refreshed.
created_at DateTime Row creation timestamp.

Unique constraint: (agent_id, canonical_key).

MemoryEntityAlias

Maps alias names to canonical entity keys for entity resolution.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
alias String Alias text (e.g. "Ana").
canonical_entity_key String Target canonical key.
canonical_entity_type String Target entity type.
created_at DateTime Row creation timestamp.

Unique constraint: (agent_id, alias).
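
The unique constraint scopes aliases per agent: the same alias may resolve to different entities for different agents, but only once within a single agent. A minimal sketch using in-memory SQLite follows; the table name entity_alias, the helper resolve_alias, and the exact-match lookup are assumptions, not the library's resolution logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entity_alias (
        agent_id TEXT NOT NULL,
        alias TEXT NOT NULL,
        canonical_entity_key TEXT NOT NULL,
        UNIQUE (agent_id, alias)
    )
    """
)
conn.executemany(
    "INSERT INTO entity_alias VALUES (?, ?, ?)",
    [
        ("agent-1", "Ana", "person:ana"),
        ("agent-2", "Ana", "person:ana_souza"),  # same alias, different agent
    ],
)


def resolve_alias(conn, agent_id, alias):
    """Return the canonical key for an alias, or None when it is unknown."""
    row = conn.execute(
        "SELECT canonical_entity_key FROM entity_alias "
        "WHERE agent_id = ? AND alias = ?",
        (agent_id, alias),
    ).fetchone()
    return row[0] if row else None


# A second mapping for the same (agent_id, alias) pair violates the constraint.
try:
    conn.execute("INSERT INTO entity_alias VALUES ('agent-1', 'Ana', 'person:other')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```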

MemoryEntityRelationship

Directed edge between two entities in the knowledge graph.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
source_entity_key String(128) Source entity canonical key.
target_entity_key String(128) Target entity canonical key.
rel_type String(64) Relationship type (e.g. "works_at", "mentored_by").
strength Float Edge strength (default 0.8).
evidence_fact_id UUID? FK to the fact that evidences this edge.
provenance String(16) How the edge was created ("rule", "llm").
valid_from DateTime Start of validity.
valid_to DateTime? End of validity (NULL = active).
created_at DateTime Row creation timestamp.
updated_at DateTime Last update timestamp.
last_used_at DateTime? When the relationship was last used in retrieval.
invalidated_at DateTime? When the relationship was invalidated.

Unique constraint: (agent_id, source_entity_key, target_entity_key, rel_type).

Dynamic Relationship Types

The rel_type field accepts any short, descriptive snake_case string - it is not restricted to a fixed set. The extraction pipeline instructs the LLM to choose the most descriptive type for each relationship.

Common types (used as examples in the extraction prompt, not as restrictions):

works_at, manages, reports_to, family_of, friend_of, partner_of, owns, lives_in, member_of, studies_at, works_with

The LLM may also produce types like mentored_by, inspired_by, competed_with, or any other descriptive type.

Normalization: All relationship types are normalized via normalize_rel_type() before persistence:

  • Lowercase + underscores (e.g. "Mentored By" → "mentored_by")
  • Known aliases are mapped to common types (e.g. "boss" → "reports_to", "spouse" → "partner_of")
  • Unknown types pass through after sanitization
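
The three normalization rules above can be sketched as follows. This is not the source of normalize_rel_type(): the function name sketch_normalize_rel_type, the sanitization regex, and the two-entry alias map (taken from the examples in the list) are assumptions, and the real alias table is likely larger.

```python
import re

# Only the aliases named in this reference; the real mapping is not shown here.
_ALIASES = {
    "boss": "reports_to",
    "spouse": "partner_of",
}


def sketch_normalize_rel_type(raw: str) -> str:
    """Lowercase, collapse non-alphanumerics to underscores, then apply aliases."""
    cleaned = re.sub(r"[^a-z0-9]+", "_", raw.strip().lower()).strip("_")
    # Unknown types pass through unchanged after sanitization.
    return _ALIASES.get(cleaned, cleaned)


examples = {
    raw: sketch_normalize_rel_type(raw)
    for raw in ["Mentored By", "boss", "spouse", "inspired-by"]
}
```

Applying the alias lookup after sanitization means that variants such as " Boss " and "BOSS" map to the same common type.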

The CANONICAL_REL_TYPES set in arandu.constants is available as a reference for consumers who want to filter by known types, but it is not used as a validation filter.

See Evidence Linkage & Cascade Invalidation for how relationships are linked to supporting facts and automatically cleaned up when facts change.

MemoryEvent

Immutable event log - stores all user messages with embeddings.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
occurred_at DateTime When the event happened.
text Text Event text content.
source String Origin (default "api").
importance Float Importance score (default 0.5).
embedding_vec Vector(1536) Event embedding for retrieval.
embedding JSONB? Raw embedding as JSON (non-pgvector fallback).
trace_json JSONB? Trace/debug metadata for the event.
created_at DateTime Row creation timestamp.
emotion_primary String(32)? Primary emotion label.
emotion_intensity Float? Emotion intensity (0-1).
energy_level String(16)? Energy level ("low", "medium", "high").

MemoryCluster

Semantic cluster grouping related facts for richer context.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
label String(128) Cluster label.
summary_text Text? LLM-generated cluster summary.
cluster_type String(32) Cluster type (default "auto").
fact_count Integer Number of facts in the cluster.
importance Float Cluster importance (default 0.5).
embedding_vec Vector(1536) Cluster embedding.
is_active Boolean Whether the cluster is active.
last_updated_at DateTime When the cluster was last updated.
created_at DateTime Row creation timestamp.

MemoryMetaObservation

Meta-observations derived from consolidation - patterns, insights, trends.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
observation_type String(32) Type ("pattern", "trend", "community", etc.).
title String(256) Short title.
text Text Full observation text.
supporting_event_ids JSONB List of supporting event UUIDs.
supporting_fact_ids JSONB List of supporting fact UUIDs.
confidence Float Confidence (default 0.7).
importance Float Importance (default 0.5).
times_reinforced Integer How many times this observation was reinforced.
is_active Boolean Whether the observation is active.
embedding_vec Vector(1536) Observation embedding.
first_detected_at DateTime When the observation was first detected.
last_reinforced_at DateTime When the observation was last reinforced.
created_at DateTime Row creation timestamp.

MemoryFactEntityLink

Cross-reference table linking each fact to ALL entities it mentions - enables cross-entity retrieval without fact duplication.

Column Type Description
id UUID Primary key.
fact_id UUID FK to MemoryFact.
entity_key String Entity canonical key (e.g. "person:clara_rezende").
is_primary Boolean Whether this entity is the fact's primary subject.
agent_id Text Owner agent ID.

Unique constraint: (fact_id, entity_key).

Indexes: (agent_id, entity_key) for retrieval, (fact_id) for cascade deletes.
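
To illustrate cross-entity retrieval without fact duplication, the sketch below stores one fact, links it to two entities, and retrieves it through either entity key. It uses in-memory SQLite, and the table names memory_fact and fact_entity_link, the index name, and the helper facts_mentioning are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memory_fact (id TEXT PRIMARY KEY, fact_text TEXT NOT NULL);
    CREATE TABLE fact_entity_link (
        fact_id TEXT NOT NULL REFERENCES memory_fact (id),
        entity_key TEXT NOT NULL,
        is_primary INTEGER NOT NULL,
        agent_id TEXT NOT NULL,
        UNIQUE (fact_id, entity_key)
    );
    CREATE INDEX ix_link_agent_entity ON fact_entity_link (agent_id, entity_key);
    """
)
conn.execute("INSERT INTO memory_fact VALUES ('f1', 'Clara works at Acme.')")
conn.executemany(
    "INSERT INTO fact_entity_link VALUES (?, ?, ?, ?)",
    [
        ("f1", "person:clara_rezende", 1, "agent-1"),  # primary subject
        ("f1", "organization:acme", 0, "agent-1"),     # also mentioned
    ],
)


def facts_mentioning(conn, agent_id, entity_key):
    """Return (id, fact_text, is_primary) for facts linked to an entity."""
    return conn.execute(
        "SELECT f.id, f.fact_text, l.is_primary "
        "FROM fact_entity_link AS l JOIN memory_fact AS f ON f.id = l.fact_id "
        "WHERE l.agent_id = ? AND l.entity_key = ? ORDER BY f.id",
        (agent_id, entity_key),
    ).fetchall()


via_org = facts_mentioning(conn, "agent-1", "organization:acme")
via_person = facts_mentioning(conn, "agent-1", "person:clara_rezende")
```

The fact row exists once; only the link rows differ, and is_primary tells the caller whether the queried entity is the fact's main subject.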

MemoryAttributeRegistry

Registry for managing attribute keys - tracks proposed vs active keys.

Column Type Description
id UUID Primary key.
key String(64) Unique attribute key.
status String(20) "proposed" or "active".
value_type String(20) Expected value type (default "string").
conflict_policy String(20) How to handle conflicts (default "supersede").
ttl_days Integer? Optional default TTL for facts with this key.
seen_count Integer How many times this key has been seen.
proposed_by String(20) Who proposed the key ("llm", "user").
reason Text? Why the key was proposed.
first_seen_at DateTime When the attribute key was first seen.
last_seen_at DateTime When the attribute key was last seen.
example_raw_key String(128)? Example of the raw key before normalization.
created_at DateTime Row creation timestamp.
updated_at DateTime? Last update timestamp.

MemoryIntention

Prospective memory -- future intentions with time-based or event-based triggers.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
trigger_type String(16) Trigger type ("time", "event", etc.).
trigger_condition Text Natural-language description of the trigger condition.
intended_action Text What the agent should do when triggered.
due_date DateTime? Optional due date for time-based triggers.
status String(16) Status ("pending", "triggered", "fulfilled", "expired"). Default "pending".
trigger_embedding_vec Vector(1536) Embedding of the trigger condition for semantic matching.
source_context String(32)? Source context identifier.
outcome_note Text? Note about the outcome after fulfillment.
created_at DateTime Row creation timestamp.
triggered_at DateTime? When the intention was triggered.
fulfilled_at DateTime? When the intention was fulfilled.

SessionObservation

Persistent session observations created by the LLM-driven Observer -- captures in-session context that persists across turns.

Column Type Description
id UUID Primary key.
agent_id Text Owner agent ID (any string: UUID, slug, numeric, etc.).
content Text Observation content text.
topic String(64)? Topic tag for the observation.
entities_mentioned JSONB List of entity keys mentioned in the observation.
created_at DateTime Row creation timestamp.
referenced_at DateTime? When the observation was last referenced.
relative_offset String(64)? Relative time offset descriptor (e.g. "2 messages ago").
source_message_ids JSONB List of source message IDs that originated this observation.
is_active Boolean Whether the observation is active.
merged_into_id UUID? ID of the observation this one was merged into.
emotion_label String(32)? Detected emotion label for the session context.
embedding_vec Vector(1536) Observation embedding for semantic retrieval.