FAQ - Maximem Synap

General

What is Synap?

Synap is a managed memory platform for AI agents. It provides a complete pipeline for ingesting conversations and documents, extracting structured knowledge (facts, preferences, episodes, emotions, temporal events), resolving entities across conversations, and retrieving relevant context when your agent needs it.Instead of building and maintaining your own vector database, retrieval pipeline, and entity resolution system, you integrate the Synap SDK into your application and let the platform handle the rest. Your agent gets long-term, structured memory with a few lines of code.

How is Synap different from RAG (Retrieval-Augmented Generation)?

Traditional RAG systems retrieve raw document chunks based on similarity search. Synap goes several steps further:

Structured extraction: Synap does not just store chunks. It extracts typed knowledge — facts, preferences, episodes, emotions, and temporal events — with confidence scores.
Entity resolution: Mentions of the same entity across conversations (e.g., “John”, “my manager”, “John Smith”) are linked to a single canonical entity.
Scoped retrieval: Memories are scoped to users, customers, and organizations. Each user gets their own memory without manual isolation logic.
Context compaction: Long conversation histories are automatically summarized while preserving key information, reducing token usage.
Managed pipeline: No vector databases to deploy, no embedding models to tune, no retrieval pipelines to build.

How does Synap compare to Mem0, Zep, Letta, and SuperMemory?

All of these are memory layers for AI agents, and at a 30,000-ft view they overlap. The differences that matter in practice:

Capability	Synap	Mem0	Zep	Letta	SuperMemory
Typed memories (facts / preferences / episodes / emotions / temporal)	Native, per-type retrieval	Single “memory” type	Facts + episodes	Single “memory” type	Single “memory” type
Entity resolution across conversations	Yes (graph store, automatic)	Limited	Yes (graph store)	No	No
Multi-scope (user / customer / client / world)	Native scope chain	User only	User only	User only	User only
Customized Memory Architecture (MACA)	Yes — generated from a Use-Case Markdown spec	Manual prompt tuning	Manual config	Manual schema	Manual config
Context compaction (auto-summarize long history)	Built-in (`context.compact`)	No	Limited	No	No
Self-host option	Cloud only (managed)	Self-host + cloud	Self-host + cloud	Self-host	Cloud only
Anticipation cache (gRPC stream of likely-needed memories)	Yes	No	No	No	No
B2B-native (customer/org isolation, MACA-per-instance)	Yes	No (user-only)	No (user-only)	No (user-only)	No (user-only)

When Synap is the right choice: you’re building a B2B agent product where each customer org has shared context (policies, runbooks, product data) on top of per-user memory; you want typed extraction so the LLM can reason over preferences vs. facts vs. temporal events distinctly; you don’t want to run a vector DB.When another tool is a better fit: you need self-host today (Mem0 / Zep / Letta); you only have a single-user consumer app and don’t need scope chain or entity graph (Mem0 / SuperMemory are simpler); you want agent state + memory in one SDK (Letta).

When should I build my own memory layer instead of using Synap?

Build your own if any of these apply:

You have strict data residency or air-gap requirements that managed cloud can’t meet, and your prospective scale doesn’t justify Synap’s self-hosted licensing.
Your memory model is highly domain-specific (e.g., medical records with regulated taxonomies) and you’d end up reimplementing extraction anyway.
You’re at single-digit MAU and a pgvector table + a few prompt-engineered extraction calls is genuinely cheaper than the integration overhead.

Don’t build your own if you’re just worried about “lock-in” or “wanting control.” The honest cost of running a production memory pipeline — embeddings, vector store, graph store, entity resolution, compaction, eviction, observability — is a multi-engineer-quarter project that no agent team has gotten right on a side budget.

Is my data secure?

Yes. Synap is designed with a zero-trust security model:

Encryption in transit: All connections use TLS 1.3.
Encryption at rest: All stored data is encrypted at rest using AES-256.
Instance isolation: Each instance has its own storage namespace. Memories from one instance are never accessible from another.
Scope isolation: Within an instance, memories are scoped to users and customers. A user can only access memories in their scope chain.
Credential management: API keys are hashed (SHA-256) before storage. Plaintext keys are never stored on the server.

See Authentication for details on the credential lifecycle.

What regions is Synap available in?

Synap Cloud is currently available in US East (Virginia) and EU West (Frankfurt). Additional regions are planned based on demand. Contact sales@maximem.ai for region-specific requirements or data residency needs.

SDK

Which programming languages are supported?

The official Synap SDK is available for Python 3.11+. It is fully async, built on asyncio, and available via pip:

pip install maximem-synap

A JavaScript/TypeScript SDK is available (@maximem/synap-js-sdk) — note it requires a Python 3.11+ runtime on the host since it wraps the Python SDK as a subprocess. Native TypeScript and Go SDKs are on the roadmap. Check the Changelog for updates on new language support.

Is the SDK async-only?

Yes. All SDK methods are async and must be called with await inside an async context. This design ensures your application never blocks on network I/O.If you need to call the SDK from synchronous code, use asyncio.run():

import asyncio
from maximem_synap import MaximemSynapSDK

sdk = MaximemSynapSDK(api_key="synap_your_key_here")

# From synchronous code
result = asyncio.run(sdk.memories.create(
    document="User prefers dark mode.",
    document_type="ai-chat-conversation",
    user_id="user_123",
    customer_id="acme_corp",
    mode="fast",
))

How do I handle SDK errors?

The SDK raises typed exceptions that map to HTTP error codes. Catch specific exceptions for fine-grained error handling:

import uuid
from maximem_synap import (
    AuthenticationError,       # 401, 403
    ContextNotFoundError,      # 404
    RateLimitError,            # 429
    ServiceUnavailableError,   # 500, 503
    InvalidInputError,         # 400
)

try:
    context = await sdk.conversation.context.fetch(
        conversation_id=str(uuid.uuid4()),
        search_query=["user preferences"],
    )
except RateLimitError as e:
    # Automatic retry with backoff is built into the SDK.
    # This exception is raised only after all retries are exhausted.
    print(f"Retry after {e.retry_after}s")
except ContextNotFoundError:
    print("Conversation not found")

The SDK automatically retries 429, 500, and 503 errors with exponential backoff. See Error Handling for the full reference.

Can I use Synap with LangChain, LlamaIndex, or other frameworks?

Yes. Synap is framework-agnostic. The SDK operates independently of your LLM orchestration layer. Common integration patterns:

LangChain: Use sdk.conversation.context.fetch() in a custom retriever, then pass the context to your chain.
LlamaIndex: Use sdk.conversation.context.compacted() with format="system_prompt" and inject it into your query engine.
Direct: Call the SDK from your application code and pass context to any LLM API.

See the First Integration guide for detailed examples.

Memory

How long are memories stored?

Each Instance has a retention policy that Synap chooses automatically based on your use-case file. Compliance-sensitive agents get longer retention with archive-on-expiry; consumer agents get shorter retention with automatic eviction. Frequently accessed memories are kept longer; rarely accessed ones are aged out sooner.You can also delete individual memories at any time via the Delete Memory endpoint, regardless of the retention policy.

Can I delete specific memories?

Yes. Use the DELETE /v1/memories/{memory_id} endpoint to permanently delete a specific memory. Deletion removes the memory from both the vector store and graph store. Entity references are updated but the entities themselves are not deleted, as they may be referenced by other memories.

await sdk.memories.delete("mem_a1b2c3d4e5f67890")

Deletion is permanent and cannot be undone. See the Memory API for details.

What is the difference between fast and long-range / accurate mode?

The mode parameter controls a speed-quality tradeoff. Ingestion and retrieval use distinct mode value sets:Ingestion (sdk.memories.create()) — values: "fast" or "long-range" (default).

Mode	Speed	Quality	Best For
`fast`	Highest	Good	Real-time chat ingestion, high-volume streams
`long-range`	Moderate	Highest	Important documents, support tickets, onboarding conversations

Retrieval (sdk.conversation.context.fetch()) — values: "fast" (default) or "accurate".

Mode	Latency	Method	Best For
`fast`	~50-100ms	Vector similarity only	Real-time chat, single-topic queries
`accurate`	~200-500ms	Vector + graph + re-ranking	Relationship-aware queries, multi-entity context

The two value spaces are not interchangeable. Passing "accurate" to memories.create() or "long-range" to context.fetch() will be rejected.

What happens if I ingest the same document twice?

If you provide a document_id in the create memory request, Synap checks for duplicates. If a document with the same ID has already been ingested, the request is rejected with a 409 Conflict error.If you do not provide a document_id, the document is ingested as a new record. The extraction pipeline may produce duplicate memories if the content overlaps with previously ingested documents. Entity resolution helps by linking entities across documents, but the memories themselves are stored independently.For production use, we recommend always providing a document_id for deduplication.

Configuration

How do I change my Instance's memory behavior?

Synap auto-generates each Instance’s memory configuration from the Use-Case Markdown file you upload. To change behavior — enable different memory categories, shift the primary scope, update retention guidance — re-upload an updated use-case file in the Dashboard. Synap re-evaluates and applies the new configuration. The previous version is retained so you can roll back if needed.

Does updating the configuration cause downtime?

No. Configuration updates are zero-downtime: in-flight requests complete on the previous configuration and new requests pick up the new one. There is no traffic interruption.

What happens to existing memories when configuration changes?

Existing memories keep their original scope and category assignments. The updated configuration governs new memories ingested after it takes effect, and it tunes retrieval/ranking behavior going forward. Memory data is not retroactively rewritten.

Billing and Usage

How is usage calculated?

Synap usage is measured across three dimensions:

API calls: Each HTTP request to the API counts as one API call. Batch endpoints count as a single call regardless of batch size.
Token usage: LLM tokens consumed during ingestion (extraction, categorization) and retrieval (re-ranking, compaction). Input and output tokens are tracked separately.
Storage: Total memories stored across all instances. Measured as a monthly peak.

Use the Dashboard Analytics to monitor your usage in real time.

What counts as an API call?

Each HTTP request to any Synap API endpoint counts as one API call, including:

Memory ingestion (single and batch)
Context fetch and compaction
Configuration operations
Dashboard queries
Analytics queries
Status checks

Webhook deliveries do not count as API calls.

Troubleshooting

Why am I getting AuthenticationError?

Common causes and solutions:

Missing or malformed API key: Ensure the header is Authorization: Bearer synap_... with the Bearer prefix.
Revoked key: Check the Dashboard to verify the key is still active.
Wrong instance: The API key may not have access to the instance you are targeting.

See Error Codes for the full list of auth-related errors.

Why are my memories not being retrieved?

If context fetch returns empty results when you expect matches:

Check ingestion status: Verify the ingestion completed successfully via GET /v1/memories/{ingestion_id}/status. Memories are not retrievable until ingestion completes.
Check scope: Memories are scoped to the user/customer that was specified during ingestion. Context fetch only returns memories within the conversation’s scope chain.
Check confidence threshold: Memories with confidence below the MACA threshold (default 0.7) are discarded during ingestion.
Check memory types: If you are filtering by types in the fetch request, ensure the desired types are included.
Check context budget: If the budget is very small, only the highest-ranked memories may fit.

Use the Dashboard monitoring tools to inspect the ingestion pipeline and stored memories for debugging.

How do I debug context retrieval issues?

Steps for diagnosing retrieval problems:

Get the correlation ID: Note the X-Correlation-Id from the fetch response.
Check analytics: Use GET /v1/analytics/latency?operation=context_fetch to see if latency is abnormal.
Try different modes: Switch from fast to accurate mode to see if graph traversal finds additional results.
Broaden the query: Try more general search queries or remove type filters.
Check compaction: If the context was recently compacted, some memories may have been summarized away. Use format: "full" to see both the narrative and structured extractions.

If the issue persists, contact support with the correlation ID and instance ID.

Documentation Index

​General

​SDK

​Memory

​Configuration

​Billing and Usage

​Troubleshooting

General

SDK

Memory

Configuration

Billing and Usage

Troubleshooting