Database Architecture¶
Overview¶
Aegis uses PostgreSQL 16 with the pgvector extension as its primary relational database, providing structured storage, vector similarity search, and full-text search capabilities.
Database Stack¶
PostgreSQL 16¶
Installation: Native (apt package, not containerized)
Host: localhost (LXC host)
Port: 5432
Database: aegis
User: agent
Version Info:
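A quick way to confirm the running server version and installed extensions (sketch):
-- Server version and installed extensions
SELECT version();
SELECT extname, extversion FROM pg_extension ORDER BY extname;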
Why Not Containerized?
- Better performance (direct filesystem access)
- Easier backups (native pg_dump)
- Simpler resource management
- Persistent across container restarts
pgvector Extension¶
Version: 0.5.1+
Purpose: Vector similarity search for embeddings
Installation:
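A typical installation on a Debian/Ubuntu host; the package name postgresql-16-pgvector is the usual PGDG name and is assumed here:
# Install the packaged extension for PostgreSQL 16, then enable it in the database
sudo apt install postgresql-16-pgvector
sudo -u postgres psql -d aegis -c 'CREATE EXTENSION IF NOT EXISTS vector;'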
Capabilities:
- Store embeddings as vector(N) type
- Cosine similarity search (<=> operator)
- Euclidean distance search (<-> operator)
- Inner product search (<#> operator)
- HNSW indexing for fast approximate nearest neighbor search
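A minimal demonstration of the three distance operators (values are illustrative):
-- Cosine distance, Euclidean (L2) distance, and negative inner product
SELECT '[1,2,3]'::vector <=> '[2,3,4]'::vector AS cosine_distance,
       '[1,2,3]'::vector <-> '[2,3,4]'::vector AS l2_distance,
       '[1,2,3]'::vector <#> '[2,3,4]'::vector AS neg_inner_product;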
Database Size¶
| Metric | Value |
|---|---|
| Total Database Size | ~150 MB |
| Largest Table | container_metrics (58 MB) |
| Total Rows | ~50K across all tables |
| Indexes | ~80 indexes |
| Extensions | pgvector, uuid-ossp, pg_stat_statements |
Schema Organization¶
Memory Tables¶
episodic_memory¶
Purpose: Event logs, interactions, experiences with temporal ordering
Schema:
CREATE TABLE episodic_memory (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMPTZ DEFAULT NOW(),
event_type VARCHAR(100) NOT NULL,
summary TEXT NOT NULL,
details JSONB,
context JSONB,
importance memory_importance DEFAULT 'medium',
embedding vector(1536),
tags TEXT[],
related_memories INTEGER[],
created_at TIMESTAMPTZ DEFAULT NOW(),
decision TEXT,
metadata JSONB,
graphiti_synced_at TIMESTAMPTZ
);
Indexes:
- Primary key: id
- HNSW vector index: embedding vector_cosine_ops (for semantic search)
- B-tree: timestamp DESC (recent events)
- B-tree: event_type (filter by type)
- B-tree: importance (priority queries)
- GIN: tags (tag-based search)
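A sketch of the DDL behind these indexes; idx_episodic_embedding matches the index name shown in the query plan later on this page, while the other names are illustrative:
-- Semantic search (HNSW, cosine distance)
CREATE INDEX idx_episodic_embedding ON episodic_memory USING hnsw (embedding vector_cosine_ops);
-- Time, type, and importance filters
CREATE INDEX idx_episodic_timestamp ON episodic_memory (timestamp DESC);
CREATE INDEX idx_episodic_event_type ON episodic_memory (event_type);
CREATE INDEX idx_episodic_importance ON episodic_memory (importance);
-- Tag search
CREATE INDEX idx_episodic_tags ON episodic_memory USING gin (tags);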
Row Count: ~18,682 records
Disk Size: ~10 MB
Common Event Types:
- task_complete: Task completion events
- decision: Decision tracking (linked to decision_outcomes)
- error: Error occurrences
- learning: Lessons learned
- milestone: Significant achievements
Importance Enum:
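memory_importance is a custom enum type; only the 'medium' default is confirmed by the schema above, so the remaining labels in this sketch are assumed:
-- Assumed labels apart from the 'medium' default
CREATE TYPE memory_importance AS ENUM ('low', 'medium', 'high', 'critical');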
semantic_memory¶
Purpose: Factual knowledge, concepts, learnings
Schema:
CREATE TABLE semantic_memory (
id SERIAL PRIMARY KEY,
concept VARCHAR(200) NOT NULL,
content TEXT NOT NULL,
category VARCHAR(100),
source VARCHAR(200),
confidence FLOAT DEFAULT 1.0,
metadata JSONB,
tags TEXT[],
embedding vector(1536),
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
Indexes:
- Primary key: id
- HNSW vector index: embedding
- B-tree: concept (exact lookup)
- B-tree: category (categorical queries)
- B-tree: updated_at DESC (recent updates)
Row Count: ~500 records
Disk Size: ~240 KB
decision_outcomes¶
Purpose: Track decisions with alternatives and outcomes
Schema:
CREATE TABLE decision_outcomes (
id SERIAL PRIMARY KEY,
decision_id INTEGER REFERENCES episodic_memory(id),
task_context TEXT NOT NULL,
alternatives JSONB NOT NULL,
chosen_option TEXT NOT NULL,
reasoning TEXT NOT NULL,
outcome_status VARCHAR(20) DEFAULT 'pending',
outcome_details JSONB,
lessons_learned TEXT,
user_feedback VARCHAR(20),
created_at TIMESTAMPTZ DEFAULT NOW(),
resolved_at TIMESTAMPTZ
);
Outcome Status Values: pending, success, partial, failure
User Feedback Values: positive, negative, neutral
API & Monetization Tables¶
api_keys¶
Purpose: API key management for customer access
Schema:
CREATE TABLE api_keys (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
customer_id VARCHAR(100) NOT NULL,
key_hash TEXT NOT NULL UNIQUE,
key_prefix VARCHAR(20) NOT NULL,
name VARCHAR(200),
tier VARCHAR(50) DEFAULT 'free',
rate_limit INTEGER DEFAULT 60,
monthly_quota INTEGER DEFAULT 1000,
enabled BOOLEAN DEFAULT TRUE,
expires_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW(),
last_used_at TIMESTAMPTZ
);
Indexes:
- Primary key: id
- Unique: key_hash (fast API key lookup)
- B-tree: customer_id (customer queries)
- B-tree: enabled (active keys only)
Disk Size: ~2.3 MB
Tiers: free, developer, pro, enterprise
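A sketch of the lookup an authenticating request performs against this table; how key_hash is derived from the raw key happens in the application and is not specified here:
-- Resolve an incoming key by its precomputed hash ($1); only active, unexpired keys match
SELECT id, customer_id, tier, rate_limit, monthly_quota
FROM api_keys
WHERE key_hash = $1
  AND enabled
  AND (expires_at IS NULL OR expires_at > NOW());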
api_usage_events¶
Purpose: Track API usage for billing and rate limiting
Schema:
CREATE TABLE api_usage_events (
id BIGSERIAL PRIMARY KEY,
api_key_id UUID REFERENCES api_keys(id),
customer_id VARCHAR(100) NOT NULL,
endpoint VARCHAR(200) NOT NULL,
method VARCHAR(10) NOT NULL,
status_code INTEGER NOT NULL,
response_time_ms INTEGER,
tokens_used INTEGER DEFAULT 0,
cost_usd DECIMAL(10, 6) DEFAULT 0,
created_at TIMESTAMPTZ DEFAULT NOW()
);
Partitioning: By month (future optimization)
Retention: 13 months (rolling)
Indexes:
- Primary key: id
- B-tree: api_key_id (per-key metrics)
- B-tree: customer_id (per-customer billing)
- B-tree: created_at DESC (time-series queries)
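A sketch of a per-customer rollup for the current month, using only columns defined above:
-- Monthly request count, token usage, and cost per customer
SELECT customer_id,
       COUNT(*)         AS requests,
       SUM(tokens_used) AS tokens,
       SUM(cost_usd)    AS cost_usd
FROM api_usage_events
WHERE created_at >= date_trunc('month', NOW())
GROUP BY customer_id
ORDER BY cost_usd DESC;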
billing_events¶
Purpose: Track billable events and charges
Schema:
CREATE TABLE billing_events (
id BIGSERIAL PRIMARY KEY,
customer_id VARCHAR(100) NOT NULL,
event_type VARCHAR(50) NOT NULL,
amount_usd DECIMAL(10, 2) NOT NULL,
currency VARCHAR(3) DEFAULT 'USD',
description TEXT,
metadata JSONB,
stripe_invoice_id VARCHAR(100),
created_at TIMESTAMPTZ DEFAULT NOW()
);
Event Types: api_usage, subscription, overage, credit
System Tables¶
claude_sessions¶
Purpose: Track Claude Code sessions and commands
Schema:
CREATE TABLE claude_sessions (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
session_id VARCHAR(100) UNIQUE NOT NULL,
user_id VARCHAR(100),
command TEXT,
status VARCHAR(20) DEFAULT 'running',
started_at TIMESTAMPTZ DEFAULT NOW(),
ended_at TIMESTAMPTZ,
metadata JSONB
);
Disk Size: ~232 KB
container_metrics¶
Purpose: Docker container resource usage over time
Schema:
CREATE TABLE container_metrics (
id BIGSERIAL PRIMARY KEY,
container_name VARCHAR(100) NOT NULL,
cpu_percent FLOAT,
mem_usage_mb FLOAT,
mem_percent FLOAT,
net_io_mb FLOAT,
block_io_mb FLOAT,
timestamp TIMESTAMPTZ DEFAULT NOW()
);
Disk Size: ~58 MB (largest table)
Row Count: ~500K records
Retention: 30 days (rolling)
Indexes:
- Primary key: id
- B-tree: container_name, timestamp DESC (time-series per container)
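A sketch of the 30-day rolling retention cleanup (how and where it is scheduled is not shown here):
-- Drop metrics older than the retention window
DELETE FROM container_metrics
WHERE timestamp < NOW() - INTERVAL '30 days';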
infra_anomaly_alerts¶
Purpose: Infrastructure anomaly detection alerts; tracks unusual system behavior (CPU spikes, memory leaks, disk issues)
Disk Size: ~11 MB
Workflow Tables¶
workflows¶
Purpose: Store workflow definitions and execution state
Schema:
CREATE TABLE workflows (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name VARCHAR(100) NOT NULL,
graph JSONB NOT NULL,
status VARCHAR(20) DEFAULT 'active',
created_at TIMESTAMPTZ DEFAULT NOW(),
updated_at TIMESTAMPTZ DEFAULT NOW()
);
workflow_executions¶
Purpose: Track workflow execution history
Schema:
CREATE TABLE workflow_executions (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
workflow_id UUID REFERENCES workflows(id),
context JSONB NOT NULL,
current_node VARCHAR(100),
status VARCHAR(20) DEFAULT 'running',
error TEXT,
started_at TIMESTAMPTZ DEFAULT NOW(),
completed_at TIMESTAMPTZ
);
Status Values: running, completed, failed, interrupted
Research & Intel Tables¶
deep_briefing_jobs¶
Purpose: Track Deep Research jobs (Gemini/Perplexity integration)
Schema:
CREATE TABLE deep_briefing_jobs (
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
query TEXT NOT NULL,
status VARCHAR(20) DEFAULT 'pending',
result JSONB,
provider VARCHAR(50),
cost_usd DECIMAL(10, 4),
created_at TIMESTAMPTZ DEFAULT NOW(),
completed_at TIMESTAMPTZ
);
Providers: gemini, perplexity, claude
competitors¶
Purpose: Track competitor websites for monitoring
Schema:
CREATE TABLE competitors (
id SERIAL PRIMARY KEY,
name VARCHAR(200) NOT NULL,
domain VARCHAR(200) NOT NULL,
category VARCHAR(100),
last_checked_at TIMESTAMPTZ,
enabled BOOLEAN DEFAULT TRUE,
created_at TIMESTAMPTZ DEFAULT NOW()
);
competitor_pages¶
Purpose: Track specific competitor pages (pricing, features, etc.)
Schema:
CREATE TABLE competitor_pages (
id SERIAL PRIMARY KEY,
competitor_id INTEGER REFERENCES competitors(id),
url TEXT NOT NULL,
page_type VARCHAR(50),
content_hash VARCHAR(64),
screenshot_url TEXT,
last_checked_at TIMESTAMPTZ,
created_at TIMESTAMPTZ DEFAULT NOW()
);
Page Types: pricing, features, homepage, docs, blog
competitor_changes¶
Purpose: Track detected changes to competitor pages
Schema:
CREATE TABLE competitor_changes (
id SERIAL PRIMARY KEY,
page_id INTEGER REFERENCES competitor_pages(id),
change_type VARCHAR(50),
old_value TEXT,
new_value TEXT,
diff TEXT,
detected_at TIMESTAMPTZ DEFAULT NOW()
);
Performance Optimization¶
Indexing Strategy¶
HNSW Vector Indexes:
- Used for approximate nearest neighbor search
- 10x faster than a sequential scan for similarity queries
- Trade-off: approximate results (typically 90%+ recall) rather than the 100% recall of an exact scan; IVFFlat is the alternative pgvector index type
B-tree Indexes:
- Primary keys and unique constraints (created automatically); foreign key columns are indexed explicitly
- Timestamp columns (time-series queries)
- Frequently filtered columns (event_type, customer_id)
GIN Indexes:
- Array columns (tags)
- JSONB columns (when querying specific keys)
Partial Indexes (future optimization):
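An illustrative example of the kind of partial index planned, restricting the api_keys lookup index to keys that can still authenticate:
-- Hypothetical partial index: smaller and faster than also indexing disabled keys
CREATE INDEX idx_api_keys_active ON api_keys (key_hash) WHERE enabled;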
Query Optimization¶
Explain Analyze Example:
EXPLAIN ANALYZE
SELECT * FROM episodic_memory
WHERE embedding <=> '[0.1, 0.2, ...]'::vector < 0.5
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;
Typical Plan:
Limit (cost=10.23..12.45 rows=10 width=...)
-> Index Scan using idx_episodic_embedding on episodic_memory
Order By: embedding <=> '...'::vector
Filter: (embedding <=> '...'::vector) < 0.5
Slow Query Logging:
-- Enable slow query log (queries > 1 second)
ALTER DATABASE aegis SET log_min_duration_statement = 1000;
Connection Pooling¶
pgBouncer: Not currently used
Application-Level Pooling: psycopg2 connection pool
- Min connections: 2
- Max connections: 10
- Idle timeout: 300 seconds
Configuration (Python):
from psycopg2 import pool
db_pool = pool.ThreadedConnectionPool(
minconn=2,
maxconn=10,
host='localhost',
port=5432,
database='aegis',
user='agent',
password='agent'
)
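Typical usage of the pool (sketch; error handling kept minimal):
conn = db_pool.getconn()
try:
    with conn.cursor() as cur:
        cur.execute("SELECT COUNT(*) FROM episodic_memory;")
        print(cur.fetchone()[0])
    conn.commit()
finally:
    # Always return the connection to the pool
    db_pool.putconn(conn)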
Backup Strategy¶
Nightly Backups¶
Schedule: Daily at 02:00 UTC
Script (~/scripts/backup_db.sh):
#!/bin/bash
DATE=$(date +%Y-%m-%d)
BACKUP_DIR="/home/agent/backups/postgres"
mkdir -p "$BACKUP_DIR"
# Compressed custom-format dump (includes large objects)
pg_dump -h localhost -U agent -d aegis \
  -F c -b -v -f "$BACKUP_DIR/aegis_$DATE.dump"
# Retain last 14 days of local backups
find "$BACKUP_DIR" -name "aegis_*.dump" -mtime +14 -delete
Backup Format: Custom format (compressed)
Typical Size: ~30 MB compressed
Retention: 14 days local, 90 days remote (Hetzner Storage Box)
Point-in-Time Recovery¶
WAL Archiving: Not currently enabled
Future Enhancement: Continuous WAL archiving to S3-compatible storage
Restore Procedure¶
Full Restore:
# Drop and recreate database
dropdb -h localhost -U agent aegis
createdb -h localhost -U agent aegis
# Restore from backup
pg_restore -h localhost -U agent -d aegis \
/home/agent/backups/postgres/aegis_2024-01-25.dump
# Verify
psql -h localhost -U agent -d aegis -c "SELECT COUNT(*) FROM episodic_memory;"
Selective Restore:
# Restore single table
pg_restore -h localhost -U agent -d aegis \
-t episodic_memory \
/home/agent/backups/postgres/aegis_2024-01-25.dump
Security¶
Authentication¶
Method: Peer authentication on the local Unix socket; md5 password authentication on loopback TCP. No remote connections are accepted.
pg_hba.conf:
# TYPE DATABASE USER ADDRESS METHOD
local all agent peer
host all agent 127.0.0.1/32 md5
host all agent ::1/128 md5
Why No Remote Access?
- Database on same host as application
- Reduced attack surface
- Better performance (no network overhead)
Encryption¶
At Rest: Not enabled at the database level (covered by LXC filesystem encryption on the host)
In Transit: Not applicable (local-only connections)
Future: Enable pgcrypto for sensitive column encryption
Permissions¶
Database Owner: agent
Application User: agent (same user, simplified security model)
Schema Permissions:
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO agent;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA public TO agent;
Audit Logging¶
pg_stat_statements Extension:
CREATE EXTENSION pg_stat_statements;
-- View top 10 slowest queries
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
Maintenance¶
Vacuum and Analyze¶
Auto Vacuum: Enabled (default PostgreSQL settings)
Manual Vacuum (weekly):
Full Vacuum (monthly):
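Standard invocations for both (sketch; VACUUM FULL takes ACCESS EXCLUSIVE locks, so it belongs in a maintenance window):
-- Weekly: reclaim dead tuples and refresh planner statistics
VACUUM ANALYZE;
-- Monthly: rewrite tables to return space to the OS
VACUUM FULL;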
Reindexing¶
Monthly Reindex (to reduce index bloat):
Concurrent Reindex (zero downtime):
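Standard forms for both (sketch; REINDEX ... CONCURRENTLY requires PostgreSQL 12 or later):
-- Monthly: rebuild all indexes in the database (locks each table while its indexes rebuild)
REINDEX DATABASE aegis;
-- Zero-downtime alternative for a single table
REINDEX TABLE CONCURRENTLY episodic_memory;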
Statistics Updates¶
Analyze After Bulk Inserts:
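For example (table name illustrative):
-- Refresh planner statistics after a large batch load
ANALYZE episodic_memory;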
Migration Management¶
Schema Versioning¶
Tool: Alembic (SQLAlchemy migrations)
Location: /home/agent/projects/aegis-core/alembic/
Current Version: Check with:
Create Migration:
Apply Migration:
Rollback Migration:
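The standard Alembic invocations for the steps above, run from the project directory (the revision message is illustrative):
cd /home/agent/projects/aegis-core
alembic current                                   # show current schema version
alembic revision --autogenerate -m "add table"    # create a migration
alembic upgrade head                              # apply pending migrations
alembic downgrade -1                              # roll back one revision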
Manual Schema Changes¶
Always Use Migrations: Never apply manual DDL in production
Emergency Rollback Procedure:
# Restore from backup
pg_restore -h localhost -U agent -d aegis \
--clean --if-exists \
/home/agent/backups/postgres/aegis_BEFORE_MIGRATION.dump
Monitoring¶
Key Metrics¶
Database Size:
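For example:
-- Total on-disk size of the aegis database
SELECT pg_size_pretty(pg_database_size('aegis'));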
Table Sizes:
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
Active Connections:
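For example:
-- Connections currently open against aegis, grouped by state
SELECT state, COUNT(*)
FROM pg_stat_activity
WHERE datname = 'aegis'
GROUP BY state;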
Cache Hit Ratio:
SELECT
sum(heap_blks_hit) / nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) AS cache_hit_ratio
FROM pg_statio_user_tables;
Target: > 0.99 (99% cache hit rate)
Alerts¶
Trigger Conditions:
- Connection count > 80% of max_connections
- Cache hit ratio < 95%
- Slow queries > 5 seconds
- Database size > 80% of allocated storage
- Replication lag > 1 minute (when replication enabled)
Future Database Improvements¶
Q1 2026: Performance¶
- Enable connection pooling (pgBouncer)
- Implement table partitioning (api_usage_events by month)
- Add materialized views for common aggregations
- Optimize vector indexes (tune HNSW parameters)
Q2 2026: High Availability¶
- Set up streaming replication (read replica)
- Enable WAL archiving (point-in-time recovery)
- Implement automatic failover (Patroni)
- Multi-region replication (future)
Q3 2026: Advanced Features¶
- Full-text search with pg_search
- Time-series optimization (TimescaleDB extension)
- Graph queries (Apache AGE extension)
- Column-level encryption (pgcrypto)
Q4 2026: Data Warehouse¶
- Separate OLAP database (ClickHouse/Snowflake)
- ETL pipeline for analytics
- Data lake (S3 + Parquet)
- Business intelligence dashboards