Data Types Reference¶

This page documents the dataclasses, enums, and result types that are not covered in the main API Reference. They are used across the write pipeline, the read pipeline, and background jobs.

Write Pipeline Types¶

InputType¶

class InputType(str, Enum)

Input text classification types, determined by heuristics in classify_input().

Value	Description
`SHORT`	Fewer than 500 characters.
`MEDIUM`	500-2000 characters, unstructured.
`LONG`	More than 2000 characters, unstructured.
`STRUCTURED`	More than 500 characters with headers, bullets, or tables.
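
The thresholds in the table can be read as a small decision procedure. The sketch below is not the library's classify_input(). It is a hypothetical re-implementation that assumes lowercase enum values, Markdown-style structure markers, and that structure is checked before the MEDIUM/LONG split.

```python
import re
from enum import Enum


class InputType(str, Enum):
    # Member values are assumed; the page only lists the member names.
    SHORT = "short"
    MEDIUM = "medium"
    LONG = "long"
    STRUCTURED = "structured"


# Assumed structure markers: Markdown headers, bullets, or table rows.
_STRUCTURE_RE = re.compile(r"^\s*(#{1,6}\s|[-*+]\s|\|.*\|)", re.MULTILINE)


def sketch_classify(text: str) -> InputType:
    """Apply the length thresholds from the table above (hypothetical helper)."""
    length = len(text)
    if length < 500:
        return InputType.SHORT
    # STRUCTURED requires strictly more than 500 characters plus a marker.
    if length > 500 and _STRUCTURE_RE.search(text):
        return InputType.STRUCTURED
    if length <= 2000:
        return InputType.MEDIUM
    return InputType.LONG
```

Under these assumptions, a 500-character input is always MEDIUM, because STRUCTURED needs more than 500 characters.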

ExtractionMode¶

class ExtractionMode(str, Enum)

Value	Description
`SINGLE_SHOT`	Single LLM call for extraction.
`CHUNKED`	Input is split into chunks, each processed separately.

InputClassification¶

@dataclass
class InputClassification

Result of classify_input(). See Write Pipeline API for the full field reference.

ExtractionStrategy¶

@dataclass
class ExtractionStrategy

Result of select_strategy(). See Write Pipeline API for the full field reference.

CorrectionResult¶

@dataclass
class CorrectionResult

Result of correction detection.

Field	Type	Default	Description
`corrections_detected`	`int`	`0`	Number of corrections found.
`corrected_keys`	`list[str]`	`[]`	Attribute keys that were corrected.
`facts_corrected_ids`	`list[str]`	`[]`	IDs of old facts that were corrected.
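
The `[]` and `{}` defaults shown in the tables on this page cannot be written as literal defaults in a Python dataclass, because the dataclasses module rejects mutable defaults. The sketch below mirrors the three fields above under a hypothetical class name and assumes the library uses default_factory, the standard idiom, so that each instance gets its own list.

```python
from dataclasses import dataclass, field


@dataclass
class CorrectionResultSketch:
    # Hypothetical mirror of CorrectionResult; field names come from the table.
    corrections_detected: int = 0
    corrected_keys: list[str] = field(default_factory=list)
    facts_corrected_ids: list[str] = field(default_factory=list)


first = CorrectionResultSketch()
second = CorrectionResultSketch()

# Mutating one instance's list does not leak into another instance.
first.corrected_keys.append("occupation")
```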

Read Pipeline Types¶

ExpandedQuery¶

@dataclass
class ExpandedQuery

Result of query expansion (entity priming).

Field	Type	Description
`primed_entities`	`list[str]`	Entity keys discovered via alias + KG priming.
`temporal_range`	`tuple[datetime, datetime] \| None`	Resolved date range.
`expanded_terms`	`list[str]`	Additional context terms from entity facts.

PatternQuery¶

@dataclass
class PatternQuery

A pattern-based query for keyword signal matching.

Field	Type	Default	Description
`entity_pattern`	`str`	--	SQL LIKE pattern for entity_key matching.
`attribute_filter`	`str \| None`	`None`	Optional attribute key filter.
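
To illustrate how a SQL LIKE pattern combines with an optional attribute filter, here is a minimal sketch using an in-memory SQLite table. The table name, its columns, and the match() helper are hypothetical stand-ins, and the real query the library issues is not shown on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (entity_key TEXT, attribute_key TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?)",
    [
        ("person:ana", "occupation"),
        ("person:clara_rezende", "city"),
        ("organization:acme", "industry"),
    ],
)


def match(entity_pattern, attribute_filter=None):
    """Return entity keys matching the LIKE pattern and optional attribute."""
    sql = "SELECT entity_key FROM facts WHERE entity_key LIKE ?"
    params = [entity_pattern]
    if attribute_filter is not None:
        sql += " AND attribute_key = ?"
        params.append(attribute_filter)
    sql += " ORDER BY entity_key"
    return [row[0] for row in conn.execute(sql, params)]


people = match("person:%")
people_with_occupation = match("person:%", "occupation")
```

In LIKE patterns `%` matches any run of characters and `_` matches exactly one character, so an underscore inside an entity key acts as a wildcard unless it is escaped.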

RetrievalPlan¶

@dataclass
class RetrievalPlan

Output of the retrieval agent LLM planner. See Read Pipeline API for the full field reference.

GraphRetrievalResult¶

@dataclass
class GraphRetrievalResult

Result of graph-based retrieval using a two-hop breadth-first search (BFS).

Field	Type	Default	Description
`facts`	`list[dict[str, Any]]`	`[]`	Scored fact dicts with `source="graph"`.
`neighbor_keys`	`list[str]`	`[]`	Entity keys discovered via BFS.
`edges_traversed`	`int`	`0`	Total edges examined during BFS.
`edges`	`list[dict[str, Any]]`	`[]`	Deduplicated edge dicts with display names.
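
As a rough illustration of how neighbor_keys and edges_traversed relate, the sketch below runs a depth-limited BFS over a plain adjacency dict. The function name, the graph representation, and the choice to follow edges only in their stored direction are assumptions. The library's traversal and scoring are not described here.

```python
from collections import deque


def two_hop_neighbors(graph, seeds, max_hops=2):
    """Return (neighbor_keys, edges_traversed) for a BFS limited to max_hops."""
    visited = set(seeds)
    neighbor_keys = []
    edges_traversed = 0
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        key, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes that are already max_hops away
        for target in graph.get(key, []):
            edges_traversed += 1  # count every edge examined, even to seen nodes
            if target not in visited:
                visited.add(target)
                neighbor_keys.append(target)
                queue.append((target, depth + 1))
    return neighbor_keys, edges_traversed


graph = {
    "person:ana": ["organization:acme"],
    "organization:acme": ["person:bruno", "person:ana"],
    "person:bruno": ["city:recife"],
}
neighbors, edges = two_hop_neighbors(graph, ["person:ana"])
```

Here `city:recife` is three hops from the seed, so it is never reached, while the edge back to `person:ana` is counted as examined but adds no new neighbor.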

SpreadingActivationResult¶

@dataclass
class SpreadingActivationResult

Result of spreading activation expansion.

Field	Type	Default	Description
`candidates`	`list[RetrievalCandidate]`	`[]`	Expanded candidates from hops 1 and 2.
`meta_observations`	`list[Any]`	`[]`	Relevant meta-observations referencing seed facts.
`entities_explored`	`list[str]`	`[]`	Entity keys explored during spreading.
`clusters_explored`	`list[str]`	`[]`	Cluster IDs explored during spreading.
`hop1_count`	`int`	`0`	Number of facts found in hop 1.
`hop2_count`	`int`	`0`	Number of facts found in hop 2.
`kg_relationships_explored`	`int`	`0`	Number of KG relationships traversed.

CompressedContext¶

@dataclass
class CompressedContext

Result of context compression (tiered hot/warm/cold).

Field	Type	Default	Description
`context_text`	`str`	`""`	Final prompt-ready context string.
`hot_count`	`int`	`0`	Number of facts in hot tier (Tier 1).
`warm_count`	`int`	`0`	Number of facts in warm tier (Tier 2).
`cold_count`	`int`	`0`	Number of items in cold tier (Tier 3).
`total_tokens`	`int`	`0`	Estimated token count of context_text.

EmotionalTrendsResult¶

@dataclass
class EmotionalTrendsResult

Result of emotional trend materialization.

Field	Type	Default	Description
`emotion_counts`	`dict[str, int]`	`{}`	Mapping of emotion to occurrence count.
`trend_direction`	`str`	`"stable"`	One of `"increasing"`, `"decreasing"`, or `"stable"`.
`dominant_emotion`	`str \| None`	`None`	Most frequent emotion.
`trigger_keywords`	`list[str]`	`[]`	Top keywords from high-intensity events.
`avg_intensity`	`float`	`0.0`	Average emotion intensity.
`dominant_intensity`	`float`	`0.0`	Average intensity of the dominant emotion.
`dominant_energy`	`str`	`"medium"`	Predominant energy level.
`events_analyzed`	`int`	`0`	Number of events analyzed.
`observation_created`	`bool`	`False`	Whether a meta-observation was created/updated.
`observation_id`	`str \| None`	`None`	ID of the created/updated observation.
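
To make the relationship between emotion_counts, dominant_emotion, avg_intensity, and dominant_intensity concrete, here is a minimal sketch that aggregates (emotion, intensity) pairs. The helper name and input shape are hypothetical. How the library derives trend_direction and trigger_keywords is not described on this page, so those fields are left out.

```python
from collections import Counter


def summarize_emotions(events):
    """Aggregate (emotion, intensity) pairs into a subset of the fields above."""
    if not events:
        # Mirror the documented defaults when there is nothing to analyze.
        return {
            "emotion_counts": {},
            "dominant_emotion": None,
            "avg_intensity": 0.0,
            "dominant_intensity": 0.0,
            "events_analyzed": 0,
        }
    counts = Counter(emotion for emotion, _ in events)
    dominant = counts.most_common(1)[0][0]
    dominant_values = [value for emotion, value in events if emotion == dominant]
    return {
        "emotion_counts": dict(counts),
        "dominant_emotion": dominant,
        "avg_intensity": sum(value for _, value in events) / len(events),
        "dominant_intensity": sum(dominant_values) / len(dominant_values),
        "events_analyzed": len(events),
    }


summary = summarize_emotions([("joy", 0.75), ("joy", 0.25), ("anger", 0.5)])
```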

DirectiveBlock¶

@dataclass
class DirectiveBlock

Result of directive generation (procedural memory).

Field	Type	Default	Description
`text`	`str`	`""`	Cohesive behavioral instructions block.
`directive_count`	`int`	`0`	Number of active directives used.
`cache_hit`	`bool`	`False`	Whether this was served from cache.

ContradictionResult¶

@dataclass
class ContradictionResult

Result of a contradiction check between directives.

Field	Type	Default	Description
`has_contradiction`	`bool`	`False`	Whether a contradiction was found.
`conflicting_directive`	`str \| None`	`None`	Title of the conflicting directive.
`resolution`	`str \| None`	`None`	Explanation of the resolution.

Background Job Result Types¶

ClusteringResult¶

@dataclass
class ClusteringResult

Result of fact clustering.

Field	Type	Default	Description
`clusters_created`	`int`	`0`	Number of new clusters created.
`clusters_reinforced`	`int`	`0`	Number of existing clusters updated.
`summaries_generated`	`int`	`0`	Number of cluster summaries generated via LLM.
`facts_assigned`	`int`	`0`	Number of facts assigned to clusters.

CommunityDetectionResult¶

@dataclass
class CommunityDetectionResult

Result of cross-entity community detection.

Field	Type	Default	Description
`communities_created`	`int`	`0`	New community observations created.
`communities_reinforced`	`int`	`0`	Existing community observations reinforced.
`clusters_in_communities`	`int`	`0`	Total clusters assigned to communities.
`skipped`	`bool`	`False`	Whether detection was skipped.
`skip_reason`	`str \| None`	`None`	Reason for skipping.

ConsolidationResult¶

@dataclass
class ConsolidationResult

Result of L2/L3 consolidation.

Field	Type	Default	Description
`events_processed`	`int`	`0`	Number of events analyzed.
`observations_created`	`int`	`0`	New meta-observations created.
`observations_reinforced`	`int`	`0`	Existing observations reinforced.
`skipped`	`bool`	`False`	Whether consolidation was skipped.
`skip_reason`	`str \| None`	`None`	Reason for skipping.

MemifyResult¶

@dataclass
class MemifyResult

Result of the memify pipeline (vitality scoring, staleness marking, edge management).

Field	Type	Default	Description
`facts_scored`	`int`	`0`	Number of facts scored for vitality.
`facts_marked_stale`	`int`	`0`	Number of facts marked as stale.
`edges_reinforced`	`int`	`0`	Number of KG edges reinforced.
`merges_executed`	`int`	`0`	Number of entity merges executed.

EntityImportanceResult¶

@dataclass
class EntityImportanceResult

Result of entity importance scoring (sleep-time compute).

Field	Type	Default	Description
`entities_scored`	`int`	`0`	Number of entities scored.
`top_entities`	`list[tuple[str, float]]`	`[]`	Top entities by score, as (key, score) pairs.

SummaryRefreshResult¶

@dataclass
class SummaryRefreshResult

Result of entity summary refresh (sleep-time compute).

Field	Type	Default	Description
`summaries_refreshed`	`int`	`0`	Number of summaries generated.
`summaries_skipped`	`int`	`0`	Number of entities skipped.

Background Functions¶

tag_event_emotion¶

Infer emotion, intensity, and energy from event text via LLM.

async def tag_event_emotion(
    event_text: str,
    llm: LLMProvider,
) -> dict[str, Any] | None

Parameter	Type	Description
`event_text`	`str`	Text to analyze.
`llm`	`LLMProvider`	Injected LLM provider.

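Returns: A dict with emotion, intensity, and energy keys, or None on failure.

from arandu.background import tag_event_emotion

# `llm` is an already-configured LLMProvider; run this inside an async function.
result = await tag_event_emotion("I'm so happy today!", llm)
# Example result: {"emotion": "joy", "intensity": 0.85, "energy": "high"}
# The result is None if tagging failed, so check before indexing into it.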
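
Because the function can return None, callers need a defined fallback. The sketch below maps a result of this shape onto the three nullable emotion columns listed in the MemoryEvent table on this page. The helper is hypothetical, and whether the library persists the result this way is an assumption.

```python
from typing import Any, Optional


def emotion_columns(result: Optional[dict[str, Any]]) -> dict[str, Any]:
    """Map a tag_event_emotion()-style result to MemoryEvent column values."""
    if result is None:
        # Tagging failed: leave all three nullable columns unset.
        return {
            "emotion_primary": None,
            "emotion_intensity": None,
            "energy_level": None,
        }
    return {
        "emotion_primary": result.get("emotion"),
        "emotion_intensity": result.get("intensity"),
        "energy_level": result.get("energy"),
    }


tagged = emotion_columns({"emotion": "joy", "intensity": 0.85, "energy": "high"})
failed = emotion_columns(None)
```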

Database Models¶

The SQLAlchemy models below define the persistence layer. They live in arandu.models and are useful for advanced queries executed directly against the database.

erDiagram
    MemoryEvent ||--o{ MemoryFact : "source_event_id"
    MemoryFact ||--o{ MemoryFactEntityLink : "fact_id"
    MemoryFact ||--o| MemoryCluster : "cluster_id"
    MemoryFact ||--o| MemoryFact : "supersedes_fact_id"
    MemoryEntity ||--o{ MemoryEntityAlias : "canonical_entity_key"
    MemoryEntity ||--o{ MemoryFactEntityLink : "entity_key"
    MemoryEntity ||--o{ MemoryEntityRelationship : "source/target"
    MemoryEntityRelationship ||--o| MemoryFact : "evidence_fact_id"
    MemoryFact }o--o| MemoryAttributeRegistry : "attribute_key"
    MemoryIntention }o--|| MemoryEvent : "agent_id"
    SessionObservation }o--|| MemoryEvent : "agent_id"

    MemoryEvent {
        UUID id PK
        Text agent_id
        DateTime occurred_at
        Text text
        Vector embedding_vec
    }
    MemoryFact {
        UUID id PK
        Text agent_id
        String entity_key
        Text fact_text
        Float confidence
        DateTime valid_from
        DateTime valid_to
        Vector embedding_vec
    }
    MemoryEntity {
        UUID id PK
        Text agent_id
        String canonical_key UK
        String display_name
        String entity_type
        Text summary_text
        Float importance_score
    }
    MemoryFactEntityLink {
        UUID id PK
        UUID fact_id FK
        String entity_key
        Boolean is_primary
        Text agent_id
    }
    MemoryEntityRelationship {
        UUID id PK
        Text agent_id
        String source_entity_key
        String target_entity_key
        String rel_type
        Float strength
        UUID evidence_fact_id FK
    }
    MemoryEntityAlias {
        UUID id PK
        Text agent_id
        String alias UK
        String canonical_entity_key
    }
    MemoryCluster {
        UUID id PK
        Text agent_id
        String label
        Text summary_text
        Vector embedding_vec
    }
    MemoryMetaObservation {
        UUID id PK
        Text agent_id
        String observation_type
        Text text
        Float confidence
    }
    MemoryAttributeRegistry {
        UUID id PK
        String key UK
        String status
        String value_type
        Integer seen_count
    }
    MemoryIntention {
        UUID id PK
        Text agent_id
        String trigger_type
        Text trigger_condition
        Text intended_action
        String status
    }
    SessionObservation {
        UUID id PK
        Text agent_id
        Text content
        String topic
        Boolean is_active
    }

MemoryFact¶

Versioned fact ledger - stores structured facts with validity windows.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`entity_type`	`String`	Entity type (e.g. `"person"`).
`entity_key`	`String`	Canonical entity key (e.g. `"person:ana"`).
`entity_name`	`String?`	Human-readable entity name.
`attribute_key`	`String?`	Attribute key (e.g. `"occupation"`).
`fact_text`	`Text`	Natural-language fact sentence.
`category`	`String(50)?`	Fact category.
`confidence`	`Float`	Confidence score (default 0.8).
`importance`	`Float`	Importance score (default 0.5).
`is_sensitive`	`Boolean`	Whether the fact contains sensitive data.
`valid_from`	`DateTime`	Start of the validity window.
`valid_to`	`DateTime?`	End of validity (`NULL` = currently active).
`ttl_days`	`Integer?`	Optional time-to-live in days.
`source_event_id`	`UUID?`	FK to `MemoryEvent`.
`supersedes_fact_id`	`UUID?`	ID of the fact this one replaces.
`embedding_vec`	`Vector(1536)`	pgvector embedding for semantic search.
`vitality_score`	`Float?`	Sleep-time vitality score.
`is_stale`	`Boolean`	Whether marked stale by memify.
`cluster_id`	`UUID?`	FK to `MemoryCluster`.
`value_json`	`JSONB?`	Structured value (JSON) for the attribute.
`needs_confirmation`	`Boolean`	Whether the fact requires user confirmation.
`last_confirmed_at`	`DateTime?`	When the fact was last confirmed by the user.
`times_retrieved`	`Integer`	How many times this fact was retrieved.
`last_retrieved_at`	`DateTime?`	When the fact was last retrieved.
`user_correction_count`	`Integer`	Number of times the user corrected this fact.
`source_context`	`String(512)?`	Source context of the original input.
`agent_annotation`	`Text?`	Free-text annotation added by the agent.
`embedding`	`JSONB?`	Raw embedding as JSON (non-pgvector fallback).
`search_vector`	`TSVECTOR`	Full-text search index column.
`created_at`	`DateTime`	Row creation timestamp.
`ingested_at`	`DateTime`	Bi-temporal ingestion timestamp.
`invalidated_at`	`DateTime?`	When the fact was invalidated (bi-temporal).
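
The valid_from and valid_to columns define a validity window in which a NULL valid_to means the fact is still active. A minimal sketch of a point-in-time check follows. The helper name is hypothetical, and treating the window as half-open (valid_to exclusive) is an assumption rather than documented behavior.

```python
from datetime import datetime, timezone
from typing import Optional


def is_valid_at(valid_from: datetime, valid_to: Optional[datetime], at: datetime) -> bool:
    """True if `at` falls inside the window; valid_to=None means open-ended."""
    if at < valid_from:
        return False
    # Assumed half-open interval: the fact stops being valid at valid_to itself.
    return valid_to is None or at < valid_to


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
end = datetime(2024, 6, 1, tzinfo=timezone.utc)

still_active = is_valid_at(start, None, datetime(2025, 1, 1, tzinfo=timezone.utc))
inside_window = is_valid_at(start, end, datetime(2024, 3, 1, tzinfo=timezone.utc))
at_boundary = is_valid_at(start, end, end)
```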

MemoryEntity¶

First-class entity node in the knowledge graph.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`canonical_key`	`String(128)`	Unique canonical key (e.g. `"person:ana"`).
`display_name`	`String(256)?`	Human-readable display name.
`entity_type`	`String(32)`	Type (`"person"`, `"organization"`, etc.).
`summary_text`	`Text?`	LLM-generated entity summary.
`embedding_vec`	`Vector(1536)`	Entity embedding.
`fact_count`	`Integer`	Number of linked facts.
`importance_score`	`Float?`	Sleep-time importance score.
`is_active`	`Boolean`	Whether the entity is active.
`first_seen_at`	`DateTime`	When the entity was first observed.
`last_seen_at`	`DateTime`	When the entity was last observed.
`summary_refreshed_at`	`DateTime?`	When the entity summary was last refreshed.
`created_at`	`DateTime`	Row creation timestamp.

Unique constraint: (agent_id, canonical_key).

MemoryEntityAlias¶

Maps alias names to canonical entity keys for entity resolution.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`alias`	`String`	Alias text (e.g. `"Ana"`).
`canonical_entity_key`	`String`	Target canonical key.
`canonical_entity_type`	`String`	Target entity type.
`created_at`	`DateTime`	Row creation timestamp.

Unique constraint: (agent_id, alias).

MemoryEntityRelationship¶

Directed edge between two entities in the knowledge graph.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`source_entity_key`	`String(128)`	Source entity canonical key.
`target_entity_key`	`String(128)`	Target entity canonical key.
`rel_type`	`String(64)`	Relationship type (e.g. `"works_at"`, `"mentored_by"`).
`strength`	`Float`	Edge strength (default 0.8).
`evidence_fact_id`	`UUID?`	FK to the fact that evidences this edge.
`provenance`	`String(16)`	How the edge was created (`"rule"`, `"llm"`).
`valid_from`	`DateTime`	Start of validity.
`valid_to`	`DateTime?`	End of validity (`NULL` = active).
`created_at`	`DateTime`	Row creation timestamp.
`updated_at`	`DateTime`	Last update timestamp.
`last_used_at`	`DateTime?`	When the relationship was last used in retrieval.
`invalidated_at`	`DateTime?`	When the relationship was invalidated.

Unique constraint: (agent_id, source_entity_key, target_entity_key, rel_type).

Dynamic Relationship Types¶

The rel_type field accepts any short, descriptive snake_case string - it is not restricted to a fixed set. The extraction pipeline instructs the LLM to choose the most descriptive type for each relationship.

Common types (used as examples in the extraction prompt, not as restrictions):

works_at, manages, reports_to, family_of, friend_of, partner_of, owns, lives_in, member_of, studies_at, works_with

The LLM may also produce types like mentored_by, inspired_by, competed_with, or any other descriptive type.

Normalization: All relationship types are normalized via normalize_rel_type() before persistence:

Lowercase + underscores (e.g. "Mentored By" → "mentored_by")
Known aliases are mapped to common types (e.g. "boss" → "reports_to", "spouse" → "partner_of")
Unknown types pass through after sanitization
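
The three normalization steps above can be sketched as follows. This is not the library's normalize_rel_type(). It is a hypothetical approximation whose alias table holds only the two aliases named above, and its sanitization rule (collapsing non-alphanumeric runs to underscores) is an assumption.

```python
import re

# Only the two aliases mentioned in the text; the real alias table is not shown.
_ALIASES = {"boss": "reports_to", "spouse": "partner_of"}


def sketch_normalize_rel_type(raw: str) -> str:
    """Lowercase, convert separators to underscores, then apply known aliases."""
    # Collapse any run of non-alphanumeric characters into a single underscore.
    cleaned = re.sub(r"[^a-z0-9]+", "_", raw.strip().lower()).strip("_")
    # Unknown types pass through unchanged after sanitization.
    return _ALIASES.get(cleaned, cleaned)
```

For example, `sketch_normalize_rel_type("Mentored By")` yields `"mentored_by"`, and `"boss"` maps to `"reports_to"`.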

The CANONICAL_REL_TYPES set in arandu.constants is available as a reference for consumers who want to filter by known types, but it is not used as a validation filter.

See Evidence Linkage & Cascade Invalidation for how relationships are linked to supporting facts and automatically cleaned up when facts change.

MemoryEvent¶

Immutable event log - stores all user messages with embeddings.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`occurred_at`	`DateTime`	When the event happened.
`text`	`Text`	Event text content.
`source`	`String`	Origin (default `"api"`).
`importance`	`Float`	Importance score (default 0.5).
`embedding_vec`	`Vector(1536)`	Event embedding for retrieval.
`embedding`	`JSONB?`	Raw embedding as JSON (non-pgvector fallback).
`trace_json`	`JSONB?`	Trace/debug metadata for the event.
`created_at`	`DateTime`	Row creation timestamp.
`emotion_primary`	`String(32)?`	Primary emotion label.
`emotion_intensity`	`Float?`	Emotion intensity (0-1).
`energy_level`	`String(16)?`	Energy level (`"low"`, `"medium"`, `"high"`).

MemoryCluster¶

Semantic cluster grouping related facts for richer context.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`label`	`String(128)`	Cluster label.
`summary_text`	`Text?`	LLM-generated cluster summary.
`cluster_type`	`String(32)`	Cluster type (default `"auto"`).
`fact_count`	`Integer`	Number of facts in the cluster.
`importance`	`Float`	Cluster importance (default 0.5).
`embedding_vec`	`Vector(1536)`	Cluster embedding.
`is_active`	`Boolean`	Whether the cluster is active.
`last_updated_at`	`DateTime`	When the cluster was last updated.
`created_at`	`DateTime`	Row creation timestamp.

MemoryMetaObservation¶

Meta-observations derived from consolidation - patterns, insights, trends.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`observation_type`	`String(32)`	Type (`"pattern"`, `"trend"`, `"community"`, etc.).
`title`	`String(256)`	Short title.
`text`	`Text`	Full observation text.
`supporting_event_ids`	`JSONB`	List of supporting event UUIDs.
`supporting_fact_ids`	`JSONB`	List of supporting fact UUIDs.
`confidence`	`Float`	Confidence (default 0.7).
`importance`	`Float`	Importance (default 0.5).
`times_reinforced`	`Integer`	How many times this observation was reinforced.
`is_active`	`Boolean`	Whether the observation is active.
`embedding_vec`	`Vector(1536)`	Observation embedding.
`first_detected_at`	`DateTime`	When the observation was first detected.
`last_reinforced_at`	`DateTime`	When the observation was last reinforced.
`created_at`	`DateTime`	Row creation timestamp.

MemoryFactEntityLink¶

Cross-reference table linking each fact to ALL entities it mentions - enables cross-entity retrieval without fact duplication.

Column	Type	Description
`id`	`UUID`	Primary key.
`fact_id`	`UUID`	FK to `MemoryFact`.
`entity_key`	`String`	Entity canonical key (e.g. `"person:clara_rezende"`).
`is_primary`	`Boolean`	Whether this entity is the fact's primary subject.
`agent_id`	`Text`	Owner agent ID.

Unique constraint: (fact_id, entity_key).

Indexes: (agent_id, entity_key) for retrieval, (fact_id) for cascade deletes.
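
To show how the link table supports cross-entity retrieval without duplicating facts, here is a minimal sketch using in-memory SQLite. The table names and the facts_for_entity() helper are hypothetical, and the columns are reduced to those needed for the join. The real models live in arandu.models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE memory_facts (id TEXT PRIMARY KEY, agent_id TEXT, fact_text TEXT);
    CREATE TABLE memory_fact_entity_links (
        fact_id TEXT, entity_key TEXT, is_primary INTEGER, agent_id TEXT,
        UNIQUE (fact_id, entity_key)
    );
    INSERT INTO memory_facts VALUES ('f1', 'agent-1', 'Ana works with Clara.');
    INSERT INTO memory_fact_entity_links VALUES
        ('f1', 'person:ana', 1, 'agent-1'),
        ('f1', 'person:clara_rezende', 0, 'agent-1');
    """
)


def facts_for_entity(agent_id, entity_key):
    """Return (fact_text, is_primary) for every fact linked to the entity."""
    rows = conn.execute(
        """
        SELECT f.fact_text, l.is_primary
        FROM memory_fact_entity_links AS l
        JOIN memory_facts AS f ON f.id = l.fact_id
        WHERE l.agent_id = ? AND l.entity_key = ?
        """,
        (agent_id, entity_key),
    )
    return [(text, bool(primary)) for text, primary in rows]
```

The single stored fact is reachable from both entities; only the is_primary flag differs between the two links.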

MemoryAttributeRegistry¶

Registry for managing attribute keys - tracks proposed vs active keys.

Column	Type	Description
`id`	`UUID`	Primary key.
`key`	`String(64)`	Unique attribute key.
`status`	`String(20)`	`"proposed"` or `"active"`.
`value_type`	`String(20)`	Expected value type (default `"string"`).
`conflict_policy`	`String(20)`	How to handle conflicts (default `"supersede"`).
`ttl_days`	`Integer?`	Optional default TTL for facts with this key.
`seen_count`	`Integer`	How many times this key has been seen.
`proposed_by`	`String(20)`	Who proposed the key (`"llm"`, `"user"`).
`reason`	`Text?`	Why the key was proposed.
`first_seen_at`	`DateTime`	When the attribute key was first seen.
`last_seen_at`	`DateTime`	When the attribute key was last seen.
`example_raw_key`	`String(128)?`	Example of the raw key before normalization.
`created_at`	`DateTime`	Row creation timestamp.
`updated_at`	`DateTime?`	Last update timestamp.

MemoryIntention¶

Prospective memory - future intentions with time-based or event-based triggers.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`trigger_type`	`String(16)`	Trigger type (`"time"`, `"event"`, etc.).
`trigger_condition`	`Text`	Natural-language description of the trigger condition.
`intended_action`	`Text`	What the agent should do when triggered.
`due_date`	`DateTime?`	Optional due date for time-based triggers.
`status`	`String(16)`	Status (`"pending"`, `"triggered"`, `"fulfilled"`, `"expired"`). Default `"pending"`.
`trigger_embedding_vec`	`Vector(1536)`	Embedding of the trigger condition for semantic matching.
`source_context`	`String(32)?`	Source context identifier.
`outcome_note`	`Text?`	Note about the outcome after fulfillment.
`created_at`	`DateTime`	Row creation timestamp.
`triggered_at`	`DateTime?`	When the intention was triggered.
`fulfilled_at`	`DateTime?`	When the intention was fulfilled.

SessionObservation¶

Persistent session observations created by the LLM-driven Observer - captures in-session context that persists across turns.

Column	Type	Description
`id`	`UUID`	Primary key.
`agent_id`	`Text`	Owner agent ID (any string: UUID, slug, numeric, etc.).
`content`	`Text`	Observation content text.
`topic`	`String(64)?`	Topic tag for the observation.
`entities_mentioned`	`JSONB`	List of entity keys mentioned in the observation.
`created_at`	`DateTime`	Row creation timestamp.
`referenced_at`	`DateTime?`	When the observation was last referenced.
`relative_offset`	`String(64)?`	Relative time offset descriptor (e.g. `"2 messages ago"`).
`source_message_ids`	`JSONB`	List of source message IDs that originated this observation.
`is_active`	`Boolean`	Whether the observation is active.
`merged_into_id`	`UUID?`	ID of the observation this one was merged into.
`emotion_label`	`String(32)?`	Detected emotion label for the session context.
`embedding_vec`	`Vector(1536)`	Observation embedding for semantic retrieval.