Storage engines are managed by Synap Cloud. You do not need to provision or manage any infrastructure. Each Instance gets isolated namespaces in both engines, configured automatically during Instance provisioning.

Overview

[Diagram: vector store and graph store working together to feed retrieval results]
The two engines serve fundamentally different query patterns:
  • Vector store: “Find memories that are semantically similar to this query”
  • Graph store: “Find all memories connected to this entity, topic, or relationship chain”
In practice, the most effective retrieval combines both: the vector store provides fast, relevance-ranked candidates, and the graph store enriches those candidates with connected context that pure similarity search would miss.

Vector store configuration

The vector store converts each extracted memory into a numerical embedding (a high-dimensional vector) and stores it in a searchable index. When your agent queries for context, the query is also converted to an embedding, and the vector store returns the most semantically similar memories.

How it works

1. Embedding generation

Each extracted memory (fact, preference, episode, emotion, temporal event) is converted into a vector embedding using the configured embedding model. The embedding captures the semantic meaning of the memory in numerical form.

2. Indexed storage

Embeddings are stored in a namespace-isolated vector index, organized by scope. Each embedding is associated with the full memory metadata (content, type, confidence, source, entities, timestamps).

3. Similarity search

When a retrieval query arrives, the query text is converted to an embedding using the same model. The vector store computes cosine similarity between the query embedding and all stored embeddings, returning the closest matches ranked by similarity score.
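The similarity search step can be illustrated with a minimal sketch. This is not Synap's implementation: the tiny three-dimensional embeddings, the `scope` values, and the `search` function are hypothetical stand-ins chosen only to show scope filtering followed by cosine-similarity ranking.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(index, query_embedding, scope, top_k=2):
    """Filter by scope first, then rank the remaining memories by similarity."""
    candidates = [m for m in index if m["scope"] == scope]
    scored = [
        (cosine_similarity(query_embedding, m["embedding"]), m["content"])
        for m in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Hypothetical toy index; real embeddings have 1536 or 3072 dimensions.
INDEX = [
    {"content": "Reduced expenses by 30%", "scope": "user", "embedding": [0.9, 0.1, 0.0]},
    {"content": "Prefers email over phone", "scope": "user", "embedding": [0.0, 0.2, 0.9]},
    {"content": "Company-wide budget freeze", "scope": "org", "embedding": [0.8, 0.3, 0.1]},
]

# Pretend this vector is the embedding of the query "cost savings".
results = search(INDEX, [1.0, 0.0, 0.0], scope="user")
for score, content in results:
    print(f"{score:.3f}  {content}")
```

The memory in the `org` scope never reaches the scoring step, which mirrors the idea of filtering by scope before ranking rather than after.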

Embedding models in use

Synap currently embeds memories with text-embedding-3-small (1536 dimensions) by default, which gives the best balance of speed, quality, and cost for typical agent workloads. For instances that need higher retrieval precision, Synap may select text-embedding-3-large (3072 dimensions) based on your use-case file. The selection is automatic and the embedding model is not a user-tunable setting.
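To get a feel for what the two dimension counts mean for raw vector size, here is a rough back-of-the-envelope calculation. It assumes each dimension is stored as a 4-byte float32, which is an assumption and not a documented Synap detail, and it ignores index structures and metadata.

```python
BYTES_PER_FLOAT32 = 4  # assumption: one 4-byte float per dimension

def raw_embedding_bytes(dimensions, num_memories):
    """Raw vector payload only; excludes index and metadata overhead."""
    return dimensions * BYTES_PER_FLOAT32 * num_memories

small = raw_embedding_bytes(1536, 1_000_000)  # text-embedding-3-small
large = raw_embedding_bytes(3072, 1_000_000)  # text-embedding-3-large

print(f"small: {small / 1e9:.2f} GB per million memories")
print(f"large: {large / 1e9:.2f} GB per million memories")
```

Under these assumptions the larger model doubles the raw vector payload per memory.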

Vector store characteristics

Characteristic | Detail
Search type | Semantic similarity (cosine distance)
Typical latency | ~50ms for a query returning 10-20 results
Namespace isolation | Each Instance has its own vector namespace
Scope filtering | Queries are filtered by scope at the index level (no post-filtering overhead)
Used in | Both fast and accurate retrieval modes
Best for | Finding content that is semantically related to a query, even when exact terms do not match

When vector search excels

Vector search is particularly effective when:
  • The query uses different words than the stored memory (“cost savings” matches “reduced expenses by 30%”)
  • You need fast results with low latency
  • The user asks broad, open-ended questions
  • You want to find topically related memories across different conversations

Graph store configuration

The graph store persists entity relationships and structured connections between memories. Rather than treating each memory as an independent document (as the vector store does), the graph store models the web of relationships: who is connected to whom, which facts are about which entities, and how topics and events are linked.

How it works

1. Entity and relationship extraction

During ingestion, the pipeline extracts not just entities but the relationships between them. “John Smith manages the engineering team at Acme Corp” produces three entities (John Smith, engineering team, Acme Corp) and two relationships (manages, part of).

2. Graph construction

Entities become nodes and relationships become edges in the Instance’s graph. Each node and edge carries metadata: type, scope, confidence, source, and timestamps. The graph grows organically as new content is ingested.

3. Relationship traversal

When a retrieval query targets a specific entity or topic, the graph store traverses connected nodes to find related memories. A query about “John Smith” might traverse to “engineering team”, then to “microservices migration”, surfacing memories that pure semantic search would not find.
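The graph construction and traversal steps can be sketched with a small in-memory adjacency list and a hop-limited breadth-first search. This is an illustration, not Synap's graph engine: the `build_graph` and `traverse` helpers are hypothetical, and the "works on" edge is invented here to connect the engineering team to the microservices migration.

```python
from collections import deque

# (source entity, relationship, target entity)
EDGES = [
    ("John Smith", "manages", "engineering team"),
    ("engineering team", "part of", "Acme Corp"),
    ("engineering team", "works on", "microservices migration"),  # hypothetical edge
]

def build_graph(edges):
    """Entities become nodes; each relationship is stored in both directions."""
    graph = {}
    for source, relation, target in edges:
        graph.setdefault(source, []).append((relation, target))
        graph.setdefault(target, []).append((relation, source))
    return graph

def traverse(graph, start, max_hops=2):
    """Breadth-first walk returning each reachable entity and its hop distance."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops

reachable = traverse(build_graph(EDGES), "John Smith", max_hops=2)
for entity, distance in reachable.items():
    print(distance, entity)
```

With `max_hops=1` the walk stops at "engineering team"; raising the limit to 2 is what surfaces "microservices migration", which is the kind of indirect connection a similarity search on "John Smith" alone would not return.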

Graph store characteristics

Characteristic | Detail
Search type | Relationship traversal (graph walk)
Typical latency | ~200ms for a query with 2-3 hops of traversal
Namespace isolation | Each Instance has its own graph namespace
Scope filtering | Relationships are scope-aware; traversal respects scope boundaries
Used in | Accurate retrieval mode only
Best for | Finding connected context, understanding relationships, building entity-centric views

When graph search excels

Graph search is particularly effective when:
  • You need to understand the relationships around a specific entity (“Tell me everything about Project Atlas”)
  • Multiple related pieces of information are spread across different conversations
  • The user asks about connections (“Who is involved in the Q4 planning?”)
  • You want to build a comprehensive picture from fragmented mentions

Comparing the engines

Feature | Vector store | Graph store
Search type | Semantic similarity | Relationship traversal
Speed | Fast (~50ms) | Moderate (~200ms)
Best for | Finding similar content | Finding connected content
Used in | Fast and accurate modes | Accurate mode only
Handles ambiguity | Well (semantic matching) | Moderate (needs entity resolution)
Discovers indirect connections | No | Yes (multi-hop traversal)
Storage overhead | Moderate (one embedding per memory) | Higher (nodes + edges + metadata)
Scales with | Number of memories | Number of relationships

How both engines feed retrieval

The two engines work together during retrieval, especially in accurate mode:
[Diagram: vector store and graph store results merging into ranked retrieval output]

Fast mode

In fast retrieval mode, only the vector store is queried. This provides sub-100ms response times with semantically relevant results. Use this when latency is critical and you do not need deep relationship context.
Query -> Vector Store -> Rank by similarity -> Return results

Accurate mode

In accurate retrieval mode, both engines are queried and their results are merged:
Query -> Vector Store -> Candidates (semantic similarity)
      -> Graph Store  -> Candidates (relationship traversal)
      -> Merge & Deduplicate
      -> Cross-engine ranking (recency + relevance + confidence)
      -> Return results
Convergent evidence from two different search strategies is a strong signal of relevance: memories that appear in both engines’ results are boosted in the merged ranking.
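The merge, deduplicate, and boost steps can be sketched as follows. This is a simplified illustration under stated assumptions: the memory IDs, the scores, the `boost` value, and the `merge_results` function are all hypothetical, and the sketch leaves out the recency and confidence signals that the cross-engine ranking also uses.

```python
def merge_results(vector_hits, graph_hits, boost=0.2):
    """Merge two candidate lists by memory ID and boost memories found by both engines."""
    merged = {}
    for memory_id, score in vector_hits:
        merged[memory_id] = {"score": score, "engines": {"vector"}}
    for memory_id, score in graph_hits:
        if memory_id in merged:
            # Duplicate: keep the better score and add the convergence boost.
            entry = merged[memory_id]
            entry["score"] = max(entry["score"], score) + boost
            entry["engines"].add("graph")
        else:
            merged[memory_id] = {"score": score, "engines": {"graph"}}
    return sorted(merged.items(), key=lambda item: item[1]["score"], reverse=True)

# Hypothetical candidates: (memory ID, relevance score)
vector_hits = [("m1", 0.82), ("m2", 0.75)]
graph_hits = [("m2", 0.60), ("m3", 0.70)]

ranked = [memory_id for memory_id, _ in merge_results(vector_hits, graph_hits)]
print(ranked)
```

Here `m2` ranks first even though neither engine scored it highest on its own, because it appears in both candidate lists.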

Retention and storage lifecycle

Memories stored in both engines follow a lifecycle:
Stage | Description
Active | Memory is stored, indexed, and available for retrieval. This is the default state for all newly ingested memories.
Stale | Memory has exceeded its retention window but has not been evicted yet. Still retrievable but deprioritized in ranking.
Archived | Memory moved to cold storage. Not immediately retrievable but can be restored.
Evicted | Memory permanently removed. Irreversible.
Retention behavior is set by Synap based on your agent’s use-case file: frequently accessed memories are retained longer, and rarely accessed ones are archived or evicted sooner. See Customized Memory Architectures for how this is configured.
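If your application needs to reason about these stages, for example when interpreting the Dashboard's retention counts, the lifecycle can be modeled as a small enum. The names below (`Stage`, `is_retrievable`, `can_restore`) are hypothetical and not part of any Synap SDK; they only encode the properties described in the table.

```python
from enum import Enum

class Stage(Enum):
    ACTIVE = "active"
    STALE = "stale"
    ARCHIVED = "archived"
    EVICTED = "evicted"

# Active and stale memories can be returned by retrieval
# (stale ones are deprioritized in ranking).
RETRIEVABLE = {Stage.ACTIVE, Stage.STALE}

# Only archived memories can be brought back; eviction is irreversible.
RESTORABLE = {Stage.ARCHIVED}

def is_retrievable(stage):
    return stage in RETRIEVABLE

def can_restore(stage):
    return stage in RESTORABLE

for stage in Stage:
    print(stage.value, is_retrievable(stage), can_restore(stage))
```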

Monitoring storage

You can monitor storage usage and health in the Synap Dashboard. Navigate to Instances > select your Instance > Storage. The storage view shows:
  • Total memories stored in each engine
  • Storage size and growth trends
  • Query latency percentiles (p50, p95, p99)
  • Namespace utilization
  • Retention policy status (how many memories are stale, archived, or evicted)
Programmatic access to storage metrics is on the roadmap. Email support@maximem.ai if you need to export storage metrics to your own monitoring system.

Next steps

  • Customized Memory Architectures: How Synap auto-generates the memory configuration for your Instance from your use-case file.
  • Memories & Context: Understand how memories flow from ingestion through storage to retrieval.
  • Entity Resolution: Learn how entities power the graph store’s relationship model.
  • Context Compaction: Compress context to reduce token costs and optimize retrieval budgets.