Pipecat - Maximem Synap

Add persistent memory to a Pipecat voice pipeline as two frame processors: one that injects the user’s relevant memories before the LLM sees a frame, and one that records each completed turn after the response.

Overview

This guide shows how to add Synap to a Pipecat application to build voice pipelines that:

Inject the most relevant memories as a system message before the LLM responds
Record each completed user + assistant turn back to Synap
Compose with any other Pipecat frame processor without changing the pipeline shape

The Synap Pipecat integration ships two frame processors — both follow Pipecat’s processor contract, so they slot into any pipeline alongside STT, LLM, and TTS.

Class	Pipeline position	Purpose
`SynapMemoryProcessor`	Before LLM	Prepends relevant memories to the system message
`SynapRecorder`	After response	Records the completed turn back to Synap

Setup

Install the package alongside Pipecat:

pip install maximem-synap-pipecat pipecat-ai

Configure your API key. Generate one from the Synap Dashboard.

.env

SYNAP_API_KEY=synap_your_key_here
OPENAI_API_KEY=your-openai-api-key

Initialize the SDK once at the worker’s startup:

from maximem_synap import MaximemSynapSDK

sdk = MaximemSynapSDK()
await sdk.initialize()

See SDK Initialization for the full lifecycle and configuration options.

Basic integration

The smallest useful integration adds both processors to a standard voice pipeline — memory injection before the LLM, recording after the response. No other pipeline changes are needed:

from pipecat.pipeline.pipeline import Pipeline
from synap_pipecat import SynapMemoryProcessor, SynapRecorder

memory = SynapMemoryProcessor(
    sdk=sdk,
    user_id="alice",
    customer_id="acme",   # optional — required for B2B instances
    max_results=6,
)

recorder = SynapRecorder(
    sdk=sdk,
    user_id="alice",
    customer_id="acme",
    conversation_id="call-001",   # optional; auto-generated if omitted
)

pipeline = Pipeline([
    transport.input(),
    stt,
    memory,           # inject memory before LLM
    user_aggregator,
    llm,
    tts,
    transport.output(),
    assistant_aggregator,
    recorder,         # record turn after response
])

Memory injection failures degrade gracefully — the frame passes through unmodified if context retrieval fails. Recording failures surface explicitly as SynapIntegrationError, which Pipecat’s frame-error handling catches and logs.

Core concepts

Memory processor

SynapMemoryProcessor intercepts LLMMessagesFrame events and prepends a system message containing the user’s relevant memories before the frame reaches the LLM service:

from synap_pipecat import SynapMemoryProcessor

memory = SynapMemoryProcessor(
    sdk=sdk,
    user_id="alice",
    customer_id="acme",
    max_results=6,
    mode="fast",          # "fast" or "accurate"
)

Voice latency is tight, so mode="fast" is the default. The two retrieval modes trade latency against comprehensiveness:

	`fast`	`accurate`
Latency	50-100ms	200-500ms
Search	Vector similarity	Vector + graph + re-ranking
Best for	Real-time chat	Multi-entity queries

Failures degrade gracefully — if context retrieval fails, the LLMMessagesFrame passes through unmodified rather than blocking the call.

Recorder

SynapRecorder intercepts TranscriptionFrame (user side) and LLMFullResponseEndFrame (assistant side) and ingests the completed turn into Synap asynchronously:

from synap_pipecat import SynapRecorder

recorder = SynapRecorder(
    sdk=sdk,
    user_id="alice",
    customer_id="acme",
    conversation_id="call-001",
)

Recording happens out-of-band — it never blocks the audio path. Write failures surface as SynapIntegrationError, which propagates through Pipecat’s frame-error handling so the failure is visible rather than silent.

Positioning in the pipeline

The two processors expect specific positions:

transport.input()
    │
    ▼
   STT
    │
    ▼
SynapMemoryProcessor  ← fetches context, prepends to system prompt
    │
    ▼
 UserAggregator
    │
    ▼
   LLM
    │
    ▼
   TTS
    │
    ▼
transport.output()
    │
    ▼
AssistantAggregator
    │
    ▼
 SynapRecorder        ← records completed user + assistant turn

SynapMemoryProcessor must be between STT and the user aggregator; SynapRecorder after the assistant aggregator. Any other placement will not see the right frame types.

Complete example: full voice pipeline with memory

The pattern below sets up an end-to-end voice pipeline with Synap-backed memory. Scope is pulled per-call from the transport’s connection metadata, so the same worker can serve multiple users:

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.network.daily import DailyTransport
from synap_pipecat import SynapMemoryProcessor, SynapRecorder


async def build_pipeline(sdk, transport, user_id: str, customer_id: str | None = None) -> Pipeline:
    stt = ...   # your STT service
    llm = OpenAILLMService(model="gpt-4o")
    tts = ...   # your TTS service
    user_aggregator = ...
    assistant_aggregator = ...

    memory = SynapMemoryProcessor(
        sdk=sdk,
        user_id=user_id,
        customer_id=customer_id,
        max_results=6,
        mode="fast",
    )

    recorder = SynapRecorder(
        sdk=sdk,
        user_id=user_id,
        customer_id=customer_id,
    )

    return Pipeline([
        transport.input(),
        stt,
        memory,
        user_aggregator,
        llm,
        tts,
        transport.output(),
        assistant_aggregator,
        recorder,
    ])


# Usage — invoked per call
async def run_call(sdk, room_url: str, user_id: str, customer_id: str | None = None):
    transport = DailyTransport(room_url, ...)
    pipeline = await build_pipeline(sdk, transport, user_id=user_id, customer_id=customer_id)
    task = PipelineTask(pipeline)
    await task.run()

Three things to notice in this pattern:

Memory injection happens once per LLM turn, not once per audio frame — the processor only acts on LLMMessagesFrame events.
Recording is async and non-blocking. The audio path never waits on a Synap write.
Scope is per-call. Each run_call invocation gets its own processor instances with the right user_id / customer_id.

Advanced patterns

Multi-tenant scoping

Both processors accept the standard scoping triple — user_id (required), optional customer_id, optional conversation_id. customer_id is required on B2B Synap instances and ignored on single-tenant ones. See Memory Scopes.

memory = SynapMemoryProcessor(sdk=sdk, user_id="alice", customer_id="acme")

For multi-tenant deployments, build processors per call rather than caching them globally — each call should have its scope baked in.

Tuning retrieval mode

mode="fast" is the default and the right choice for most voice flows. Switch to "accurate" only for use cases where missing relevant memory is worse than adding ~150ms of pre-LLM latency.

Failure semantics

The integration follows the Synap-wide contract, adapted for voice latency:

SynapMemoryProcessor degrades gracefully — frame passes through unmodified if context retrieval fails.
SynapRecorder surfaces failures — raises SynapIntegrationError which Pipecat’s frame-error path catches and logs.

This is by design: a voice call should never break because of a transient memory glitch, but write failures must be visible to monitoring.

Next steps

LiveKit Agents

Context preloading and recording for LiveKit voice agents.

Claude Agent SDK

Hooks and MCP server for the Claude Agent SDK.

Context Fetch

The retrieval API behind SynapMemoryProcessor — modes, scopes, and response shapes.

Memory Scopes

How user_id, customer_id, and conversation_id interact across reads.

Documentation Index

​Overview

​Setup

​Basic integration

​Core concepts

​Memory processor

​Recorder

​Positioning in the pipeline

​Complete example: full voice pipeline with memory

​Advanced patterns

​Multi-tenant scoping

​Tuning retrieval mode

​Failure semantics

​Next steps