Give a LiveKit voice agent long-term memory across calls. Synap preloads the user’s history into the ChatContext before the first turn, records every committed turn during the session, and exposes search/store function tools the LLM can call mid-conversation.

Overview

This guide shows how to add Synap to a LiveKit Agents application to build voice agents that:
  • Start every call already aware of the user’s history — no warm-up turn needed
  • Record every committed turn back to Synap so memory grows with each call
  • Search and store memories mid-conversation when the model decides it’s relevant
The Synap LiveKit Agents integration ships four exports — two lifecycle helpers and two function tools.
Export                  Role                 Purpose
preload_synap_context   Lifecycle (start)    Injects long-term memory into a ChatContext before the session begins
attach_synap_recording  Lifecycle (during)   Records every committed turn back to Synap
synap_search_tool       Function tool        LLM-callable memory search
synap_store_tool        Function tool        LLM-callable memory storage

Setup

Install the package alongside LiveKit Agents:
pip install maximem-synap-livekit-agents livekit-agents
Configure your API keys in a .env file. Generate the Synap key from the Synap Dashboard.
.env
SYNAP_API_KEY=synap_your_key_here
OPENAI_API_KEY=your-openai-api-key
LIVEKIT_API_KEY=your-livekit-key
LIVEKIT_API_SECRET=your-livekit-secret
Initialize the SDK once at the agent worker’s startup:
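from maximem_synap import MaximemSynapSDK

sdk = MaximemSynapSDK()

# `await` needs an async context: run this once from async startup code
# (for example at the top of your entrypoint) before any Synap helper is called.
await sdk.initialize()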
See SDK Initialization for the full lifecycle and configuration options.

Basic integration

The smallest useful integration preloads memory before the session starts and attaches recording during it. The agent now opens with knowledge of the user’s history and grows its memory with every committed turn:
from livekit.agents import Agent, AgentSession, JobContext
from livekit.agents.llm import ChatContext
from synap_livekit_agents import preload_synap_context, attach_synap_recording


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    chat_ctx = ChatContext()
    await preload_synap_context(
        chat_ctx=chat_ctx,
        sdk=sdk,
        user_id="alice",
        customer_id="acme",   # required on B2B instances, optional otherwise
        max_results=8,
    )

    agent = Agent(
        instructions="You are a voice assistant with long-term memory.",
        chat_ctx=chat_ctx,
    )

    session = AgentSession(...)

    conversation_id = attach_synap_recording(
        session=session,
        sdk=sdk,
        user_id="alice",
        customer_id="acme",
    )

    await session.start(agent=agent, room=ctx.room)
Both helpers fail soft. If preload fails, the session starts with an empty context and an error is logged. If a turn write fails, it is retried, and persistent failures are logged rather than ending the call. To let the model search or store memories mid-conversation, add the function tools described under Function tools.

Core concepts

preload_synap_context

Loads the user’s long-term memories as system messages in the ChatContext before the session starts. This gives the LLM awareness of the user’s history from the very first turn, with no tool call needed.
await preload_synap_context(
    chat_ctx=chat_ctx,
    sdk=sdk,
    user_id="alice",
    customer_id="acme",
    max_results=8,
    mode="fast",           # "fast" or "accurate"
)
Voice latency is tight, so mode="fast" is the default. Failures degrade gracefully — the session starts with empty context rather than raising.
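Because the preload runs before the first turn, you may also want to cap how long it is allowed to take. The following is a minimal sketch under stated assumptions: `preload_with_budget` is a hypothetical helper (not a Synap export), the 0.5 second default is an arbitrary example value, and it assumes that cancelling a preload part-way leaves the ChatContext usable.

```python
import asyncio
import logging

logger = logging.getLogger("synap.preload")


async def preload_with_budget(preload, timeout_s: float = 0.5) -> bool:
    """Run `preload()` but give up after `timeout_s` seconds.

    `preload` is any zero-argument coroutine function.
    Returns True if it finished in time, False if it was cancelled.
    """
    try:
        await asyncio.wait_for(preload(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        logger.warning("memory preload exceeded %.2fs; starting without it", timeout_s)
        return False
```

A call site could look like `await preload_with_budget(lambda: preload_synap_context(chat_ctx=chat_ctx, sdk=sdk, user_id="alice"))`, so the greeting is never held up longer than the budget.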

attach_synap_recording

Subscribes to the AgentSession’s turn-commit events and ingests each committed turn asynchronously. Returns the conversation_id for the session (auto-generated if you don’t pass one):
conversation_id = attach_synap_recording(
    session=session,
    sdk=sdk,
    user_id="alice",
    customer_id="acme",
    conversation_id="call-001",   # optional; auto-generated if omitted
)
Recording happens out-of-band — it never blocks the audio path. Individual turn writes that fail are retried internally; persistent failures surface in logs.
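To make "out-of-band with retries" concrete, here is a generic sketch of the pattern. It does not reproduce the integration's internals, whose retry policy is not documented here: `write_with_retry` and `record_in_background` are hypothetical names, and the attempt count and backoff delays are example values.

```python
import asyncio


async def write_with_retry(write, turn, attempts: int = 3, base_delay_s: float = 0.1) -> bool:
    """Call `await write(turn)` up to `attempts` times with exponential backoff.

    Returns True on success, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            await write(turn)
            return True
        except Exception:
            if attempt == attempts - 1:
                return False  # persistent failure: log it, keep the call alive
            await asyncio.sleep(base_delay_s * (2 ** attempt))
    return False


def record_in_background(write, turn, **kwargs) -> asyncio.Task:
    """Schedule the write as a task so the audio path never awaits it."""
    return asyncio.create_task(write_with_retry(write, turn, **kwargs))
```

The caller of `record_in_background` returns immediately; only the background task waits on the network and on the backoff delays.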

Function tools

For mid-conversation lookups or saves, expose the search and store tools to the LLM. They are @llm.ai_callable-decorated functions, so the LiveKit LLM bridge picks them up as function calls automatically:
from synap_livekit_agents import synap_search_tool, synap_store_tool

agent = Agent(
    instructions="You are a voice assistant with long-term memory.",
    chat_ctx=chat_ctx,
    tools=[
        synap_search_tool(sdk=sdk, user_id="alice", max_results=5),
        synap_store_tool(sdk=sdk, user_id="alice"),
    ],
)
The scope (user_id, plus customer_id and conversation_id when you pass them) is bound when the tool is constructed. The model only ever sees the query (search) or content (store) parameter, never the scope identifiers.
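Binding scope at construction time is an ordinary closure pattern. The sketch below illustrates the idea only and is not the integration's source: `make_search_tool`, `visible_parameters`, and the `search` callable are hypothetical.

```python
import inspect


def make_search_tool(search, user_id, customer_id=None, max_results=5):
    """Return an async callable whose only parameter is `query`."""

    async def synap_search(query: str) -> list:
        # Scope values come from the enclosing call, not from the model.
        return await search(
            query=query,
            user_id=user_id,
            customer_id=customer_id,
            max_results=max_results,
        )

    return synap_search


def visible_parameters(tool) -> list:
    """List the parameter names a tool schema could be derived from."""
    return list(inspect.signature(tool).parameters)
```

Since the returned function's signature contains only `query`, a schema generated from it cannot expose user_id, and the model cannot redirect a search to another user's memory.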

Complete example: voice agent with full memory loop

The pattern below assembles all four exports into a single entrypoint. The agent preloads context, records turns, and can search or store memories mid-call:
import json

from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.agents.llm import ChatContext
from synap_livekit_agents import (
    preload_synap_context,
    attach_synap_recording,
    synap_search_tool,
    synap_store_tool,
)


async def entrypoint(ctx: JobContext):
    await ctx.connect()

    # Identify the caller. Room metadata is a plain string, so this assumes
    # your backend wrote it as JSON, e.g. {"user_id": "...", "customer_id": "..."}.
    metadata = json.loads(ctx.room.metadata or "{}")
    user_id = metadata.get("user_id", "anonymous")
    customer_id = metadata.get("customer_id")

    # 1. Preload long-term memory before the session begins
    chat_ctx = ChatContext()
    await preload_synap_context(
        chat_ctx=chat_ctx,
        sdk=sdk,
        user_id=user_id,
        customer_id=customer_id,
        max_results=8,
    )

    # 2. Build the agent with on-demand search/store tools
    agent = Agent(
        instructions=(
            "You are a voice assistant with long-term memory. "
            "Use synap_search for any question about the caller's history. "
            "Use synap_store when the caller shares a new fact or decision."
        ),
        chat_ctx=chat_ctx,
        tools=[
            synap_search_tool(sdk=sdk, user_id=user_id, customer_id=customer_id, max_results=5),
            synap_store_tool(sdk=sdk, user_id=user_id, customer_id=customer_id),
        ],
    )

    # 3. Attach turn recording so every committed turn enters Synap
    session = AgentSession(...)
    attach_synap_recording(
        session=session,
        sdk=sdk,
        user_id=user_id,
        customer_id=customer_id,
    )

    await session.start(agent=agent, room=ctx.room)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
Three things to notice in this pattern:
  1. Preload + record + tools form the loop. Preload reads memory in, record writes turns out, and tools let the model do explicit lookups in between.
  2. Scope is per-call. user_id and customer_id are pulled from room metadata, so each call has its own memory scope without any global state.
  3. Failure modes are voice-friendly. Memory operations never block the audio path; they degrade or retry silently so the caller hears no glitch.

Advanced patterns

Multi-tenant scoping

All four exports accept the standard scoping triple — user_id (required), optional customer_id, optional conversation_id. customer_id is required on B2B Synap instances and ignored on single-tenant ones. See Memory Scopes.
await preload_synap_context(
    chat_ctx=chat_ctx,
    sdk=sdk,
    user_id="alice",
    customer_id="acme",
)
For multi-tenant call centers, pull user_id / customer_id from room metadata or JWT claims at entrypoint time — never hardcode them.
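A small helper can keep that extraction in one place and make it tolerant of missing or malformed metadata. This is a sketch under assumptions: `scope_from_metadata` is a hypothetical name, and it assumes your backend writes a JSON object with `user_id` and `customer_id` keys into the room metadata string.

```python
import json
from typing import Optional, Tuple


def scope_from_metadata(raw: Optional[str]) -> Tuple[str, Optional[str]]:
    """Extract (user_id, customer_id) from a JSON metadata string.

    Falls back to ("anonymous", None) when the metadata is empty,
    is not valid JSON, or is not a JSON object.
    """
    try:
        data = json.loads(raw) if raw else {}
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    user_id = data.get("user_id") or "anonymous"
    return str(user_id), data.get("customer_id")
```

In an entrypoint this would be called as `user_id, customer_id = scope_from_metadata(ctx.room.metadata)`. Note that every caller who falls back to "anonymous" shares one memory scope, so you may prefer to skip memory entirely in that case.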

Choosing preload vs. tools

  • Preload only — the LLM sees the most relevant memories from turn one. Best when memory is small and you want zero per-turn latency overhead.
  • Tools only — the LLM searches on demand. Best when memory is large and only a few queries need recall.
  • Both — production setups where the opening greeting can reference long-term context AND the model can dig deeper mid-call.

Failure semantics

The integration follows the Synap-wide contract, adapted for voice latency:
  • preload_synap_context degrades gracefully — empty context on failure, error logged.
  • attach_synap_recording retries internally — individual turn write failures are retried; persistent failures log but don’t crash the call.
  • synap_search_tool degrades gracefully — returns [] and logs on failure.
  • synap_store_tool surfaces failures — raises SynapIntegrationError so the model (and you) know if persistence failed.
This is by design: a voice call should never break because of a transient memory glitch, but explicit write failures must be visible.
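The asymmetry between reads and writes can be sketched as two thin wrappers. The code is illustrative only: `safe_search`, `strict_store`, and `StoreFailed` are hypothetical names, with `StoreFailed` standing in for SynapIntegrationError so the snippet runs without the package installed.

```python
import logging

logger = logging.getLogger("synap.tools")


class StoreFailed(Exception):
    """Hypothetical stand-in for SynapIntegrationError."""


async def safe_search(search, query: str) -> list:
    """Read path: swallow the error, log it, and return no results."""
    try:
        return await search(query)
    except Exception:
        logger.exception("memory search failed; returning no results")
        return []


async def strict_store(store, content: str) -> None:
    """Write path: re-raise so the failure is visible to the model and to you."""
    try:
        await store(content)
    except Exception as exc:
        raise StoreFailed(f"could not persist memory: {exc}") from exc
```

A failed search costs the model some context for one turn, while a silently dropped write would lose a fact the caller believes was saved, which is why only the write path raises.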

Next steps

  • Pipecat: Frame processors for Pipecat voice pipelines.
  • Claude Agent SDK: Hooks and MCP server for the Claude Agent SDK.
  • Context Fetch: The retrieval API behind preload_synap_context and synap_search_tool.
  • Memory Scopes: How user_id, customer_id, and conversation_id interact across reads.