FastAPI

This guide explains how to expose SwarmForge through FastAPI, choose between the generic and bound app shapes, and manage state, stores, and tool execution over HTTP. It does not explain swarm authoring from scratch or provider selection in depth.

This guide is for backend developers who want to add an HTTP layer on top of an existing swarm runtime. It assumes that you already understand the basic runtime objects and can configure a provider-backed model on the server.

After reading this guide, you should be able to:

choose between create_fastapi_app(...) and create_swarm_app(...)
expose the agent-builder authoring transport with create_authoring_app(...)
configure server-side model defaults
manage session state and persistence
add tools to single-agent and multi-agent FastAPI apps

The swarmforge.api package exposes the runtime through FastAPI without changing the underlying orchestration model. The HTTP layer still uses the same session, handoff, tool, checkpoint, and event flow that process_swarm_stream(...) uses in direct Python integrations.

FastAPI overview

Install

bash

pip install "swarmforge[api]"

Before sending requests that should hit a real hosted model, configure the provider on the server side:

MODEL_PROVIDER
LLM_MODEL
the matching provider API key such as OPENROUTER_API_KEY, GEMINI_API_KEY, GOOGLE_API_KEY, or OPENAI_API_KEY

Optional OpenRouter attribution:

OPENROUTER_SITE_URL
OPENROUTER_APP_NAME

Optional bind config used by the example servers:

SWARMFORGE_HOST default: 127.0.0.1
SWARMFORGE_PORT default: 8000
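As a minimal sketch, a launcher script could read the two bind variables above as follows. The `resolve_bind` helper is hypothetical and not part of SwarmForge, and the example servers may read these variables differently.

```python
import os
from typing import Mapping, Tuple


def resolve_bind(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Return (host, port) from SWARMFORGE_HOST / SWARMFORGE_PORT.

    Falls back to the documented defaults when a variable is unset.
    resolve_bind is a hypothetical helper, not a SwarmForge API.
    """
    host = env.get("SWARMFORGE_HOST", "127.0.0.1")
    port = int(env.get("SWARMFORGE_PORT", "8000"))
    return host, port


if __name__ == "__main__":
    host, port = resolve_bind()
    print(f"binding to {host}:{port}")
```

The resulting pair can then be passed to `uvicorn.run(app, host=host, port=port)` when you start the server from Python instead of the command line.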

Start the server

Installed-package path:

bash

uvicorn swarmforge.api.fastapi:create_fastapi_app --factory --reload

Repository-source path:

bash

uvicorn --app-dir src swarmforge.api.fastapi:create_fastapi_app --factory --reload

Useful companion examples:

examples/fastapi_server.py for the generic JSON transport
examples/fastapi_swarm.py for the code-defined bound-swarm pattern

API shapes

SwarmForge supports two FastAPI integration shapes:

Factory	Who defines the swarm	Route style	Best fit
`create_fastapi_app(...)`	The client sends swarm JSON	`/v1/...`	graph builders, visual editors, dynamic or multi-tenant swarms
`create_swarm_app(...)`	Your Python app defines the swarm once	`/sessions/...` and `/swarm`	product backends with a fixed single-agent or multi-agent workflow

create_fastapi_app(...) accepts:

default_model_config
session_store
turn_runner_factory
extract_required_variables
state_manager_factory
tool_registry
tool_state
cors_allow_origins

global_variable_manager_factory still works as a compatibility alias for state_manager_factory.

Both FastAPI factories use the server-side default_model_config when provided, or ModelConfig() from environment variables when omitted. The HTTP request payload does not select or override the provider.

Use create_fastapi_app(...) when the swarm itself is request data. Use create_swarm_app(...) when your backend owns the swarm definition and wants a smaller, typed route surface.

FastAPI stores

All session-backed FastAPI endpoints persist two runtime artifacts:

the current SwarmSession
the ordered list of SessionCheckpoint records produced during execution

InMemorySessionStore

create_fastapi_app(...) and create_swarm_app(...) both default to InMemorySessionStore().

That is the right default for:

local development
tests
demos
single-process deployments

Behavior:

sessions are keyed in memory by session id
checkpoints are appended in memory and returned in order
all state is lost on process restart
state is not shared across multiple workers or containers

Example:

python

from swarmforge.api import create_fastapi_app
from swarmforge.swarm import InMemorySessionStore


app = create_fastapi_app(session_store=InMemorySessionStore())

SessionStore contract

If you need persistence across restarts or multiple API instances, provide your own SessionStore.

Required operations:

get_session(session_id)
save_session(session)
append_checkpoint(checkpoint)
list_checkpoints(session_id)

Example skeleton:

python

from swarmforge.api import create_fastapi_app
from swarmforge.swarm import SessionStore


class YourDbSessionStore(SessionStore):
    async def get_session(self, session_id: str):
        ...

    async def save_session(self, session):
        ...

    async def append_checkpoint(self, checkpoint):
        ...

    async def list_checkpoints(self, session_id: str):
        ...


app = create_fastapi_app(session_store=YourDbSessionStore())

Use a custom store when you need:

durable sessions after restarts
shared sessions across multiple API instances
checkpoint audit trails
database-backed persistence with Postgres, Redis, DynamoDB, or another system
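To make the four-operation contract concrete, here is a rough sketch of a SQLite-backed store that uses only the standard library. It is a stand-in, not a drop-in SessionStore: it does not subclass the real base class, and it assumes sessions and checkpoints are JSON-serializable dicts with `id` and `session_id` keys. Both assumptions are for illustration only, so adapt the serialization to the real SwarmSession and SessionCheckpoint objects.

```python
import asyncio
import json
import sqlite3
from typing import Any, Dict, List, Optional


class SqliteSessionStoreSketch:
    """Stand-in store: sessions and checkpoints are plain dicts (assumption)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS checkpoints ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, body TEXT NOT NULL)"
        )

    async def get_session(self, session_id: str) -> Optional[Dict[str, Any]]:
        row = self._db.execute(
            "SELECT body FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    async def save_session(self, session: Dict[str, Any]) -> None:
        # Keyed by session id, so a second save replaces the earlier copy.
        self._db.execute(
            "INSERT OR REPLACE INTO sessions (id, body) VALUES (?, ?)",
            (session["id"], json.dumps(session)),
        )
        self._db.commit()

    async def append_checkpoint(self, checkpoint: Dict[str, Any]) -> None:
        self._db.execute(
            "INSERT INTO checkpoints (session_id, body) VALUES (?, ?)",
            (checkpoint["session_id"], json.dumps(checkpoint)),
        )
        self._db.commit()

    async def list_checkpoints(self, session_id: str) -> List[Dict[str, Any]]:
        # ORDER BY seq returns checkpoints in the order they were appended.
        rows = self._db.execute(
            "SELECT body FROM checkpoints WHERE session_id = ? ORDER BY seq",
            (session_id,),
        ).fetchall()
        return [json.loads(body) for (body,) in rows]


async def demo() -> None:
    store = SqliteSessionStoreSketch()
    await store.save_session({"id": "s1", "state": {"priority": "high"}})
    await store.append_checkpoint({"session_id": "s1", "step": 1})
    print(await store.get_session("s1"), await store.list_checkpoints("s1"))


if __name__ == "__main__":
    asyncio.run(demo())
```

The sqlite3 calls here block the event loop. A production store would more likely use an async database driver or run the blocking calls in a worker thread.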

FastAPI state model

Client-supplied request state is the normal way to pass application facts such as account_id, priority, tenant_id, or region into the runtime.

Request state payloads

Both FastAPI factories accept state updates in request bodies:

json

{
  "state": {
    "account_id": "ACME-991",
    "priority": "high"
  }
}

Send a state object with the keys you want to set. The values are merged into the session: only the keys you send are updated, and existing keys are left unchanged.

For backward compatibility, the API still accepts the legacy keys variables and global_variables.

Those values become visible in:

tool handlers through context.state or context.visible_state
dynamic prompts through SystemPromptContext.state
the active turn config through config.state

Provider selection is not part of the request state or message body. FastAPI uses the provider configured on the server through default_model_config or the standard provider environment variables.
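The merge behavior for request state can be pictured with a small sketch. It assumes every key uses the "overwrite" reducer; keys with other reducer rules may be combined differently by the runtime. The `merge_state` function is illustrative and not a SwarmForge API.

```python
from typing import Any, Dict


def merge_state(current: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where keys in `update` overwrite keys in `current`.

    Keys that are absent from `update` keep their existing values.
    Illustrative only: assumes the "overwrite" reducer for every key.
    """
    merged = dict(current)
    merged.update(update)
    return merged


session_state = {"account_id": "ACME-991", "priority": "normal"}
session_state = merge_state(session_state, {"priority": "high"})
print(session_state)  # {'account_id': 'ACME-991', 'priority': 'high'}
```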

State and runtime endpoints

Both FastAPI factories expose state inspection and update routes:

GET /.../sessions/{session_id}/state
PATCH /.../sessions/{session_id}/state

Legacy aliases remain available:

GET /.../sessions/{session_id}/variables
PATCH /.../sessions/{session_id}/variables

Runtime inspection is separate:

GET /.../sessions/{session_id}/runtime

The runtime payload returns:

reducer-aware global state
per-node history
per-node scratchpad/context
direct entry_node and current_node runtime views

This is runtime-visible working state, not opaque model chain-of-thought.

Optional state validation

SessionStateManager is the preferred name for the reducer-aware runtime state contract. GlobalVariableManager remains available as a compatibility alias.

Use a custom manager when you need to:

validate values before writes
normalize or coerce values before reducers run
add custom reducer behavior
keep HTTP and in-process state rules on the same contract

Example:

python

from swarmforge.api import create_swarm_app
from swarmforge.swarm import SessionStateManager


class ValidatingState(SessionStateManager):
    def normalize_value(self, key, value, *, session, current_node=None):
        if key == "account_id":
            return str(value).strip().upper()
        return value

    def validate_value(self, key, value, *, session, current_node=None, reducer_rule):
        if key == "priority" and value not in {"low", "normal", "high"}:
            raise ValueError("priority must be low, normal, or high")


# SUPPORT_SWARM is a SwarmDefinition such as the one in the multi-agent example.
app = create_swarm_app(
    SUPPORT_SWARM,
    state_manager=ValidatingState.from_swarm(SUPPORT_SWARM),
)

Single-agent FastAPI

Use a single-agent FastAPI app when your backend owns one assistant workflow and you want typed routes without sending swarm JSON on every request.

Single-agent usage

The usual shape is:

define one SwarmNode
bind it once with create_swarm_app(...)
optionally attach Python tools
create sessions and send messages through /sessions/...

This is a good fit for:

internal copilots
account assistants
single-lane support bots
APIs that should not accept arbitrary swarm definitions from clients

Single-agent example

python

from swarmforge.api import create_swarm_app
from swarmforge.evaluation.provider import ModelConfig
from swarmforge.swarm import SwarmDefinition, SwarmNode


ASSISTANT_SWARM = SwarmDefinition(
    id="assistant",
    name="Assistant Swarm",
    nodes=[
        SwarmNode(
            id="assistant",
            node_key="assistant",
            name="Assistant",
            system_prompt="You are a concise assistant.",
            is_entry_node=True,
        )
    ],
)


app = create_swarm_app(
    ASSISTANT_SWARM,
    default_model_config=ModelConfig(),
    title="Single-Agent API",
)

That app exposes:

GET /swarm
POST /sessions
GET /sessions/{session_id}
POST /sessions/{session_id}/messages
POST /sessions/{session_id}/messages/stream
the matching state and runtime endpoints

Single-agent tool usage

For a single-agent FastAPI app, the simplest path is:

declare the tool on the node with function_tool(...)
pass the callable through tool_registry when you create the app
let the model call the tool during a normal /sessions/{session_id}/messages request

python

from swarmforge.api import create_swarm_app
from swarmforge.evaluation.provider import ModelConfig
from swarmforge.swarm import SwarmDefinition, SwarmNode, function_tool


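async def lookup_order(order_id: str, context=None, state=None):
    # state defaults to None, so fall back to an empty dict before reading from it.
    state = state or {}
    return {
        "order_id": order_id,
        "account_id": state.get("account_id"),
        "status": "shipped",
    }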


ASSISTANT_SWARM = SwarmDefinition(
    id="assistant",
    name="Assistant Swarm",
    nodes=[
        SwarmNode(
            id="assistant",
            node_key="assistant",
            name="Assistant",
            system_prompt="Always call lookup_order before answering order-status questions.",
            enabled_tools=[function_tool(handler=lookup_order)],
            is_entry_node=True,
        )
    ],
)


app = create_swarm_app(
    ASSISTANT_SWARM,
    default_model_config=ModelConfig(),
    tool_registry={"lookup_order": lookup_order},
    title="Single-Agent API",
)

The request payload does not change for tools. The user still sends a normal message, and the runtime emits tool-related events if the model decides to call the tool.

Single-agent payloads

Create a session:

bash

curl -X POST http://127.0.0.1:8000/sessions \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "assistant-1",
    "state": {
      "account_id": "ACME-991"
    }
  }'

Send a message:

bash

curl -X POST http://127.0.0.1:8000/sessions/assistant-1/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": "Give me a concise account summary.",
    "state": {
      "priority": "high"
    }
  }'

Typical request body fields for the bound single-agent API:

Route	Required fields	Optional fields
`POST /sessions`	none	`session_id`, `state`
`POST /sessions/{session_id}/messages`	`user_input`	`state`
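For clients written in Python rather than curl, the same two requests could be built with the standard library as in the sketch below. The helper names are hypothetical, the base URL mirrors the curl examples, and `send` assumes the server answers with a JSON body.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://127.0.0.1:8000"  # matches the curl examples; adjust as needed


def build_post(path: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(
    session_id: str, state: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    body: Dict[str, Any] = {"session_id": session_id}
    if state:
        body["state"] = state
    return build_post("/sessions", body)


def message_request(
    session_id: str, user_input: str, state: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    body: Dict[str, Any] = {"user_input": user_input}
    if state:
        body["state"] = state
    quoted = urllib.parse.quote(session_id, safe="")
    return build_post(f"/sessions/{quoted}/messages", body)


def send(request: urllib.request.Request) -> Dict[str, Any]:
    """Send a request to a running server (assumes a JSON response body)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send(create_session_request("assistant-1", {"account_id": "ACME-991"}))` followed by `send(message_request("assistant-1", "Give me a concise account summary."))` corresponds to the two curl calls.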

Multi-agent FastAPI

Use a multi-agent FastAPI app when your backend owns routing logic and wants stable HTTP routes while the swarm decides how to hand off between specialized nodes.

Multi-agent usage

The usual shape is:

define multiple SwarmNode entries with node-level sub_agents handoffs
add required_variables where routing depends on known facts
optionally attach Python tools to one or more nodes
optionally provide extract_required_variables(...)
bind the swarm once with create_swarm_app(...)

This is a good fit for:

support flows with triage and specialists
billing, refund, fraud, or escalation routing
product backends that need inspectable handoff events

Multi-agent example

python

from typing import Any, Dict

from swarmforge.api import create_swarm_app
from swarmforge.evaluation.provider import ModelConfig
from swarmforge.swarm import SwarmDefinition, SwarmNode, SwarmNodeSubAgent, SwarmVariable


SUPPORT_SWARM = SwarmDefinition(
    id="support",
    name="Support Swarm",
    nodes=[
        SwarmNode(
            id="triage",
            node_key="triage",
            name="Triage",
            system_prompt="You triage support requests and transfer when the right specialist is clear.",
            sub_agents=[
                SwarmNodeSubAgent(
                    sub_agent="billing",
                    handoff_description="Transfer billing issues after confirmation.",
                    required_variables=["account_id"],
                )
            ],
            is_entry_node=True,
        ),
        SwarmNode(
            id="billing",
            node_key="billing",
            name="Billing",
            system_prompt="You handle billing issues clearly and directly.",
        ),
    ],
    variables=[
        SwarmVariable(
            key_name="account_id",
            description="Customer account identifier",
            reducer_rule="overwrite",
        )
    ],
)


async def extract_required_variables(user_input: str = "", **_kwargs) -> Dict[str, Any]:
    if "ACME-991" in user_input:
        return {"account_id": "ACME-991"}
    return {}


app = create_swarm_app(
    SUPPORT_SWARM,
    default_model_config=ModelConfig(),
    extract_required_variables=extract_required_variables,
    title="Support API",
)

That app exposes the same session routes as the single-agent bound app plus:

handoff-capable runtime behavior
GET /swarm to inspect the bound swarm definition
state and runtime endpoints that show the active node and accumulated state

For advanced graph editing, you can still provide explicit edges, but sub_agents is the recommended authoring path.
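The extract_required_variables hook in the example matches one hard-coded account ID. A slightly more general sketch, assuming account IDs look like an uppercase prefix, a hyphen, and digits, could use a regular expression. The pattern is an assumption for illustration; use whatever shape your real identifiers have.

```python
import asyncio
import re
from typing import Any, Dict

# Assumed ID shape for illustration: uppercase prefix, hyphen, digits (e.g. ACME-991).
ACCOUNT_ID_PATTERN = re.compile(r"\b[A-Z]{2,}-\d+\b")


async def extract_required_variables(user_input: Any = "", **_kwargs) -> Dict[str, Any]:
    """Return account_id when the message contains something shaped like an ID."""
    if not isinstance(user_input, str):
        # Multimodal input arrives as a list of content parts; skip it here.
        return {}
    match = ACCOUNT_ID_PATTERN.search(user_input)
    if match:
        return {"account_id": match.group(0)}
    return {}


if __name__ == "__main__":
    print(asyncio.run(extract_required_variables("Charge on account ACME-991.")))
```

The function keeps the same signature as the hook in the example, so it can be passed to create_swarm_app(...) through extract_required_variables in the same way.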

Multi-agent tool usage

Multi-agent FastAPI apps can use both:

your own Python tools through enabled_tools plus tool_registry
the runtime-injected transfer_to_agent tool that comes from the compiled handoff graph (derived from node sub_agents and/or explicit edges)

python

from typing import Any, Dict

from swarmforge.api import create_swarm_app
from swarmforge.evaluation.provider import ModelConfig
from swarmforge.swarm import SwarmDefinition, SwarmNode, SwarmNodeSubAgent, SwarmVariable, function_tool


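async def lookup_invoice(invoice_id: str, context=None, state=None):
    # state defaults to None, so fall back to an empty dict before reading from it.
    state = state or {}
    return {
        "invoice_id": invoice_id,
        "account_id": state.get("account_id"),
        "status": "overdue",
    }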


SUPPORT_SWARM = SwarmDefinition(
    id="support",
    name="Support Swarm",
    nodes=[
        SwarmNode(
            id="triage",
            node_key="triage",
            name="Triage",
            system_prompt=(
                "Use lookup_invoice when invoice details are needed. "
                "Call transfer_to_agent when the request should move to Billing."
            ),
            enabled_tools=[function_tool(handler=lookup_invoice)],
            sub_agents=[
                SwarmNodeSubAgent(
                    sub_agent="billing",
                    handoff_description="Transfer billing issues after confirmation.",
                    required_variables=["account_id"],
                )
            ],
            is_entry_node=True,
        ),
        SwarmNode(
            id="billing",
            node_key="billing",
            name="Billing",
            system_prompt="You handle billing issues clearly and directly.",
        ),
    ],
    variables=[SwarmVariable(key_name="account_id", reducer_rule="overwrite")],
)


async def extract_required_variables(user_input: str = "", **_kwargs) -> Dict[str, Any]:
    if "ACME-991" in user_input:
        return {"account_id": "ACME-991"}
    return {}


app = create_swarm_app(
    SUPPORT_SWARM,
    default_model_config=ModelConfig(),
    extract_required_variables=extract_required_variables,
    tool_registry={"lookup_invoice": lookup_invoice},
    title="Support API",
)

As with the single-agent case, clients still send normal message payloads. Tool execution and handoffs happen inside the runtime and show up in the event stream and runtime/session responses.

Multi-agent payloads

Create a session:

bash

curl -X POST http://127.0.0.1:8000/sessions \
  -H 'Content-Type: application/json' \
  -d '{
    "session_id": "support-session-1",
    "state": {
      "priority": "high"
    }
  }'

Send a message:

bash

curl -X POST http://127.0.0.1:8000/sessions/support-session-1/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": "I need help with a charge on account ACME-991."
  }'

Inspect runtime state after the handoff:

bash

curl http://127.0.0.1:8000/sessions/support-session-1/runtime

Typical request body fields for the bound multi-agent API:

Route	Required fields	Optional fields
`POST /sessions`	none	`session_id`, `state`
`POST /sessions/{session_id}/messages`	`user_input`	`state`

Fast mode

Fast mode lets a node run designated tools immediately, without invoking the model, when the user message matches a known operational pattern. The runtime emits a fast_tool_use event and then a done event with a summarized result.

Enabling fast mode

Set behavior_config.fast_mode_enabled=true on the node and mark one or more tools with fast_enabled=true:

python

from swarmforge.api import create_swarm_app
from swarmforge.swarm import SwarmDefinition, SwarmNode, function_tool


async def lookup_order(order_id: str):
    return {"summary": f"Order {order_id} has shipped."}


OPS_SWARM = SwarmDefinition(
    id="ops",
    name="Ops Swarm",
    nodes=[
        SwarmNode(
            id="ops",
            node_key="ops",
            name="Ops",
            system_prompt="You are an operations specialist.",
            enabled_tools=[
                {
                    **function_tool(
                        name="lookup_order",
                        description="Fetch order status",
                        parameters={
                            "type": "object",
                            "properties": {"order_id": {"type": "string"}},
                            "required": ["order_id"],
                        },
                        handler=lookup_order,
                    ),
                    "fast_enabled": True,
                    "fast_default_args": {"order_id": "A-77"},
                }
            ],
            behavior_config={"fast_mode_enabled": True},
            is_entry_node=True,
        )
    ],
)

Fast tool params from state

You can override fast-tool arguments per-turn by sending fast_tool_params in the request state:

bash

curl -X POST http://127.0.0.1:8000/sessions/ops-1/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": "Track my order",
    "state": {
      "fast_tool_params": {
        "lookup_order": {"order_id": "A-99"}
      }
    }
  }'

The runtime resolves arguments in this priority order (highest wins):

fast_tool_params from visible state (per-turn overrides)
Visible state keys matching parameter names (auto-inferred)
fast_default_args on the tool definition (static defaults)

Parameters marked "mode": "fast" skip the second source (state inference), so visible state keys cannot accidentally overwrite response templates.
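The priority order above can be sketched as a small resolver. This illustrates the documented rules and is not the runtime's actual implementation; the `resolve_fast_args` name and its argument layout are assumptions.

```python
from typing import Any, Dict


def resolve_fast_args(
    tool_name: str,
    parameters: Dict[str, Any],
    visible_state: Dict[str, Any],
    fast_default_args: Dict[str, Any],
) -> Dict[str, Any]:
    """Resolve fast-tool arguments; later steps overwrite earlier ones."""
    properties = parameters.get("properties", {})
    resolved: Dict[str, Any] = {}

    # Lowest priority: static defaults on the tool definition.
    resolved.update(fast_default_args)

    # Middle priority: visible state keys that match parameter names.
    # Parameters marked mode="fast" are skipped in this step.
    for name, schema in properties.items():
        if schema.get("mode") == "fast":
            continue
        if name in visible_state:
            resolved[name] = visible_state[name]

    # Highest priority: explicit per-turn overrides for this tool.
    overrides = visible_state.get("fast_tool_params", {}).get(tool_name, {})
    resolved.update(overrides)
    return resolved


schema = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "order_success": {"type": "string", "mode": "fast"},
    },
}
state = {"order_id": "A-88", "fast_tool_params": {"lookup_order": {"order_id": "A-99"}}}
print(resolve_fast_args("lookup_order", schema, state, {"order_id": "A-77"}))
# {'order_id': 'A-99'}
```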

Fast mode works for both bound swarms (/sessions/.../messages) and the generic transport (/v1/swarm/run). Streaming endpoints emit the same fast_tool_use event before the final done event.

Response templates with `mode: fast`

You can mark tool parameters with "mode": "fast" to define response message templates directly in the tool schema:

python

from typing import Optional


async def lookup_order(order_id: str, order_success: Optional[str] = None, order_not_success: Optional[str] = None):
    order_success = order_success or "Your order has been processed successfully!"
    order_not_success = order_not_success or "We encountered an issue processing your order."
    # Placeholder outcome check (hypothetical): replace with your real order lookup.
    success = bool(order_id)
    return order_success if success else order_not_success


{
    **function_tool(
        name="lookup_order",
        parameters={
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "The order ID"},
                "order_success": {
                    "type": "string",
                    "mode": "fast",
                    "description": "Message to say if the order succeeded",
                },
                "order_not_success": {
                    "type": "string",
                    "mode": "fast",
                    "description": "Message to say if the order failed",
                },
            },
            "required": ["order_id"],
        },
        handler=lookup_order,
    ),
    "fast_enabled": True,
}

In normal mode, the LLM sees order_success and order_not_success as regular parameters and fills them with context-aware messages based on their descriptions. Any parameter marked with "mode": "fast" is automatically added to the tool schema's required array, so the model must provide it during tool calls. The tool handler receives the generated values and returns the one that matches the outcome.

In fast mode, these parameters are excluded from state-inference auto-matching, so visible state keys cannot accidentally override them. They can still be provided through fast_default_args or fast_tool_params. The tool handler's return value (a string) is used directly as the assistant message, which skips the generic "Fast mode completed N tool(s)" summary.

Recommendation: define mode: "fast" parameters in positive/negative pairs (for example: order_success and order_not_success) so the handler can deterministically return the one that matches the tool outcome, and that returned string becomes the assistant response in fast mode.

This pattern is useful for:

Pre-authored responses that vary by outcome (success / failure / not found)
Keeping LLM-generated messages in normal mode while having deterministic fallbacks in fast mode
Reducing latency by eliminating the summary wrapping step

See the fast mode response templates example for a runnable demo.
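The normal-mode rule that every "mode": "fast" parameter is added to the required array can be sketched as a schema transform. The `schema_for_model` function is a hypothetical illustration of that rule, not the runtime's own code.

```python
import copy
from typing import Any, Dict


def schema_for_model(parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the schema with every mode="fast" property in `required`."""
    schema = copy.deepcopy(parameters)  # leave the caller's schema untouched
    required = list(schema.get("required", []))
    for name, prop in schema.get("properties", {}).items():
        if prop.get("mode") == "fast" and name not in required:
            required.append(name)
    schema["required"] = required
    return schema


parameters = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "order_success": {"type": "string", "mode": "fast"},
        "order_not_success": {"type": "string", "mode": "fast"},
    },
    "required": ["order_id"],
}
print(schema_for_model(parameters)["required"])
# ['order_id', 'order_success', 'order_not_success']
```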

Authoring transport

Use the authoring transport when your app is not sending user messages into a swarm directly, but is instead guiding a user through a chat-like builder flow that ends in a generated system prompt.

This surface is exposed by create_authoring_app(...) from swarmforge.api.

Authoring transport usage

The usual shape is:

call POST /v1/authoring/agent-builder/start with the initial user request
store the returned serialized session
send each user reply through POST /v1/authoring/agent-builder/onboarding-turn
call POST /v1/authoring/agent-builder/confirm once the session enters confirm_generate
call POST /v1/authoring/agent-builder/generate or POST /v1/authoring/agent-builder/generate/stream

For form-first voice-agent creation, send a complete VoiceAgentBuildSpec to POST /v1/authoring/agent-builder/voice/run. That route skips conversational onboarding and returns prompt-building artifacts only.

This is a good fit for:

prompt builders
assistant creation wizards
internal tools that need a chat-first prompt authoring experience
frontends that want live phase updates during prompt generation

Authoring transport example

python

from swarmforge.api import create_authoring_app


app = create_authoring_app(title="SwarmForge Authoring API")

Repository bootstrap example: python examples/fastapi_authoring_server.py

Chat-style authoring flow

Unlike the runtime APIs, the authoring transport is session-in-payload rather than session-by-id. The client should keep the latest serialized session and send it back on every onboarding, confirmation, and generation request.

Typical flow:

the user sends a seed request such as "Build me a refund support assistant"
the server replies with opening_message and an initial serialized session
the user answers the next onboarding question
the client sends the returned session plus user_message
the server replies with reply, extracted fields, the next stage, and the updated session
once the stage becomes confirm_generate, the client asks for explicit confirmation
after confirmation, the client calls generate or generate/stream

This route shape keeps the HTTP transport stateless while preserving the full builder conversation in the serialized session.
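A client can drive this flow by looking at the stage of the latest serialized session and picking the next route. The sketch below makes several assumptions: that confirm and generate accept a body with a `session` key, that the stage is `generating` after confirmation, and that any other stage means onboarding is still in progress. Check the OpenAPI schema of your server for the exact request models.

```python
from typing import Any, Dict, Optional, Tuple

PREFIX = "/v1/authoring/agent-builder"


def next_request(
    session: Dict[str, Any], user_message: Optional[str] = None
) -> Tuple[str, Dict[str, Any]]:
    """Pick the next route and request body from the session's stage.

    Hypothetical client helper; the body shapes for confirm and generate
    are assumptions.
    """
    stage = session.get("stage")
    if stage == "confirm_generate":
        return f"{PREFIX}/confirm", {"session": session}
    if stage == "generating":
        return f"{PREFIX}/generate", {"session": session}
    if user_message is None:
        raise ValueError("onboarding turns need a user_message")
    # Always send back the latest serialized session with the user's reply.
    return f"{PREFIX}/onboarding-turn", {"session": session, "user_message": user_message}
```

Because the session travels in the payload, the client replaces its stored copy with the session returned by each response before calling `next_request` again.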

Stream endpoint

POST /v1/authoring/agent-builder/generate/stream returns text/event-stream and emits authoring progress as SSE events.

Example request:

bash

curl -N -X POST http://127.0.0.1:8000/v1/authoring/agent-builder/generate/stream \
    -H 'Content-Type: application/json' \
    -d '{
        "session": {
            "seed_request": "Build me a refund support assistant",
            "topic": "Support assistant for order refunds",
            "category_type": "persona",
            "category_label": "Support Agent Builder",
            "onboarding_prompt": "Ask adaptive follow-up questions and use fields as memory.",
            "stage": "generating",
            "collected_inputs": {
                "intent": "Resolve refund requests quickly",
                "audience_channel": "Customer support web chat"
            },
            "fields": [],
            "steps": [],
            "messages": []
        },
        "generate_tools": true,
        "tools_format": "openai"
    }'

The stream emits these events:

start when generation begins
phase when a generation phase completes, such as research, phase_a, phase_b, phase_c, phase_d, title, or tools
generation_result with the final GenerateAgentPromptResult payload
session with the final serialized builder session
done on terminal success
error on terminal failure

Example event sequence:

text

event: start
data: {"stage":"generating"}

event: phase
data: {"phase":"research", ...}

event: phase
data: {"phase":"phase_a", ...}

event: generation_result
data: {"prompt_title":"Refund Support Prompt", ...}

event: done
data: {"status":"completed"}

Use this route when the frontend needs user-visible progress. Use the non-streaming generate route when one final payload is enough.
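On the client side, the event stream can be consumed with a small SSE line parser. This is a minimal sketch that handles only the `event:` and `data:` fields shown in the example sequence and assumes each data payload is JSON. It is not a complete implementation of the SSE format.

```python
import json
from typing import Any, Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event, data) pairs from SSE lines; a blank line ends each event."""
    event = "message"  # SSE default when no event: field is sent
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data_lines:
                yield event, json.loads("\n".join(data_lines))
            event, data_lines = "message", []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].lstrip(" "))
    if data_lines:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data_lines))


sample = (
    'event: start\ndata: {"stage":"generating"}\n\n'
    'event: done\ndata: {"status":"completed"}\n\n'
)
for name, payload in parse_sse(sample.splitlines()):
    print(name, payload)
```

A frontend would typically update its progress indicator on each phase event and keep the payload of generation_result as the final output.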

One-shot voice-agent build endpoint

POST /v1/authoring/agent-builder/voice/run accepts a typed voice-agent build spec and returns:

system_prompt
tools_schema
call_flow
summary_schema
eval_scenarios
research_summary, sources, and prompt_title

The guardrails block is optional but recommended. Use it for policy boundaries that should shape the generated prompt:

safety_rules
refusal_rules
forbidden_actions
disallowed_topics
confirmation_required_for
tool_use_rules
fallback_response
hallucination_policy
max_tool_retries

Example request:

bash

curl -X POST http://127.0.0.1:8000/v1/authoring/agent-builder/voice/run \
    -H 'Content-Type: application/json' \
    -d '{
        "tools_format": "openai",
        "spec": {
            "version": "1.0",
            "template": {"id": "real_estate_voice", "category": "voice_call_agent"},
            "business": {
                "company_name": "Atlas Realty",
                "industry": "real estate",
                "market": "Casablanca",
                "languages": ["en", "fr", "ar"]
            },
            "call": {
                "direction": "inbound",
                "main_goal": "qualify property buyers and book viewings",
                "target_caller": "property buyer"
            },
            "knowledge_base": {
                "enabled": true,
                "knowledge_base_ids": ["properties"],
                "search_required_for": ["availability", "pricing"]
            },
            "tools": {
                "generate_tools": true,
                "allowed_tool_types": ["function", "apiRequest", "query", "sms", "endCall", "webhook"],
                "required_logical_tools": ["book_viewing"]
            },
            "data_collection": {
                "required_entities": ["name", "phone", "budget", "preferred_location"]
            },
            "guardrails": {
                "safety_rules": ["Do not provide legal, medical, or financial advice."],
                "refusal_rules": ["Refuse requests to bypass identity, consent, or booking rules."],
                "forbidden_actions": ["Do not promise property availability without checking tools."],
                "confirmation_required_for": ["booking", "sms", "crm_update"],
                "tool_use_rules": ["Use knowledge search before pricing or availability answers."]
            }
        }
    }'

Endpoint surface

Generic JSON transport endpoints

Routes exposed by create_fastapi_app(...):

GET /health
POST /v1/swarm/run
POST /v1/swarm/run/stream
POST /v1/sessions
GET /v1/sessions/{session_id}
GET /v1/sessions/{session_id}/state
PATCH /v1/sessions/{session_id}/state
GET /v1/sessions/{session_id}/variables
PATCH /v1/sessions/{session_id}/variables
GET /v1/sessions/{session_id}/runtime
POST /v1/sessions/{session_id}/messages
POST /v1/sessions/{session_id}/messages/stream

Use this shape when the swarm is dynamic request data.

Bound swarm endpoints

Routes exposed by create_swarm_app(...):

GET /health
GET /swarm
POST /sessions
GET /sessions/{session_id}
GET /sessions/{session_id}/state
PATCH /sessions/{session_id}/state
GET /sessions/{session_id}/variables
PATCH /sessions/{session_id}/variables
GET /sessions/{session_id}/runtime
POST /sessions/{session_id}/messages
POST /sessions/{session_id}/messages/stream

Use this shape when the swarm is code-defined once inside your FastAPI service.

Authoring transport endpoints

Routes exposed by create_authoring_app(...):

GET /health
POST /v1/authoring/agent-builder/start
POST /v1/authoring/agent-builder/onboarding-turn
POST /v1/authoring/agent-builder/confirm
POST /v1/authoring/agent-builder/generate
POST /v1/authoring/agent-builder/generate/stream
POST /v1/authoring/agent-builder/run
POST /v1/authoring/agent-builder/voice/run

Use this shape when your API is guiding a prompt-authoring conversation rather than running a swarm turn directly.

Transport behavior

Both FastAPI shapes support:

SSE streaming for incremental runtime events
CORS support for browser clients
OpenAPI request examples that assume server-configured provider defaults
injected session persistence through the SessionStore interface

Streaming endpoints emit runtime events first, then final session and checkpoints events.

When the model supports token streaming, each turn emits incremental text_chunk events before the final done event:

json

{ "event": "text_chunk", "data": { "text_chunk": "Hello", "agentName": "Assistant" } }
{ "event": "text_chunk", "data": { "text_chunk": " there", "agentName": "Assistant" } }
{ "event": "done",       "data": { "fullText": "Hello there", "agentName": "Assistant" } }
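A client that wants the assistant text can accumulate text_chunk events as they arrive and fall back to the joined chunks if no done event is seen. The sketch assumes events have already been decoded into dicts with the `event` and `data` keys shown above; `collect_text` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List


def collect_text(events: Iterable[Dict[str, Any]]) -> str:
    """Join text_chunk events; prefer fullText from the done event when present."""
    chunks: List[str] = []
    for item in events:
        data = item.get("data", {})
        if item.get("event") == "text_chunk":
            chunks.append(data.get("text_chunk", ""))
        elif item.get("event") == "done":
            return data.get("fullText", "".join(chunks))
    return "".join(chunks)


events = [
    {"event": "text_chunk", "data": {"text_chunk": "Hello", "agentName": "Assistant"}},
    {"event": "text_chunk", "data": {"text_chunk": " there", "agentName": "Assistant"}},
    {"event": "done", "data": {"fullText": "Hello there", "agentName": "Assistant"}},
]
print(collect_text(events))  # Hello there
```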

Multimodal input

Both FastAPI shapes accept multimodal user_input — the same field that normally holds a plain string can instead carry a list of OpenAI-compatible content parts: text, images (URL or base64), and audio (base64).

Supported content part types

Part type	Description	Provider support
`text`	Plain text instruction or question	All providers
`image_url`	Image from a public HTTPS URL	OpenRouter vision models (GPT-4o, Claude 3.5, Gemini …), OpenAI API
`image_url` with `data:` URI	Base64-encoded image embedded in a data URI	Same as above
`input_audio`	Base64-encoded audio clip	OpenRouter + OpenAI audio-capable models (e.g. `openai/gpt-4o-audio-preview`)

Sending multimodal content over HTTP

Replace the user_input string with a JSON array of content-part objects.

Image from a URL:

bash

curl -X POST http://127.0.0.1:8000/sessions/my-session/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": [
      { "type": "text",      "text": "What is in this image?" },
      { "type": "image_url", "image_url": { "url": "https://example.com/photo.jpg", "detail": "auto" } }
    ]
  }'

Base64-encoded image:

bash

curl -X POST http://127.0.0.1:8000/sessions/my-session/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": [
      { "type": "text",      "text": "Describe this image." },
      { "type": "image_url", "image_url": { "url": "data:image/png;base64,<BASE64_DATA>", "detail": "auto" } }
    ]
  }'

Audio clip (base64-encoded WAV):

bash

curl -X POST http://127.0.0.1:8000/sessions/my-session/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": [
      { "type": "text",        "text": "Please transcribe this audio." },
      { "type": "input_audio", "input_audio": { "data": "<BASE64_DATA>", "format": "wav" } }
    ]
  }'

Mixed image and audio:

bash

curl -X POST http://127.0.0.1:8000/sessions/my-session/messages \
  -H 'Content-Type: application/json' \
  -d '{
    "user_input": [
      { "type": "text",        "text": "Describe the image and transcribe the audio." },
      { "type": "image_url",   "image_url": { "url": "https://example.com/photo.jpg" } },
      { "type": "input_audio", "input_audio": { "data": "<BASE64_DATA>", "format": "mp3" } }
    ]
  }'

The same array format works for POST /v1/sessions/{session_id}/messages (generic transport), POST /sessions/{session_id}/messages (bound swarm), the corresponding …/stream endpoints, and POST /v1/swarm/run.
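When the image or audio lives on disk, the request body has to be assembled before it can be sent. The sketch below builds the same JSON shapes as the curl examples using only the Python standard library. The names image_part_from_bytes, audio_part_from_bytes, and build_message_request are hypothetical helpers for illustration, not swarmforge APIs, and the base URL and session ID mirror the examples above. The request object is constructed but not sent.

```python
import base64
import json
import urllib.request


def image_part_from_bytes(raw: bytes, media_type: str = "image/png") -> dict:
    """Wrap raw image bytes in an image_url part that carries a data: URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{media_type};base64,{encoded}", "detail": "auto"},
    }


def audio_part_from_bytes(raw: bytes, fmt: str = "wav") -> dict:
    """Wrap raw audio bytes in an input_audio part."""
    encoded = base64.b64encode(raw).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": fmt}}


def build_message_request(base_url: str, session_id: str, parts: list) -> urllib.request.Request:
    """Build (but do not send) a POST to the bound-swarm messages endpoint."""
    body = json.dumps({"user_input": parts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/sessions/{session_id}/messages",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


parts = [
    {"type": "text", "text": "Describe this image."},
    image_part_from_bytes(b"\x89PNG placeholder bytes"),  # stand-in for real file contents
]
request = build_message_request("http://127.0.0.1:8000", "my-session", parts)
# urllib.request.urlopen(request) would send it to a running server.
```

For the generic transport, the same body would go to the /v1/sessions/{session_id}/messages path instead.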

Sending multimodal content from Python

Use the helper functions from swarmforge.swarm.multimodal, which the example below imports through swarmforge.swarm:

python

from swarmforge.swarm import (
    image_url_part,
    image_base64_part,
    audio_base64_part,
    text_part,
    multimodal_content,
    process_swarm_stream,
)

# Image from URL
content = multimodal_content(
    text_part("What is in this image?"),
    image_url_part("https://example.com/photo.jpg"),
)

# Base64 image
with open("photo.png", "rb") as f:
    content = multimodal_content(
        text_part("Describe this image."),
        image_base64_part(f.read(), media_type="image/png"),
    )

# Audio
with open("clip.wav", "rb") as f:
    content = multimodal_content(
        text_part("Transcribe this audio clip."),
        audio_base64_part(f.read(), format="wav"),
    )

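# session, store, and runner are assumed to be created as in the earlier sections of this guide.
# async for is only valid inside a coroutine, so the loop is wrapped in one.
async def run_turn(session, store, runner):
    async for event in process_swarm_stream(session, content, store=store, turn_runner=runner):
        ...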

The multimodal_content(...) call returns a plain list[dict], the same value you would put in the user_input field of the HTTP request body.

Model selection for multimodal

Multimodal requests need a model that supports the requested modalities.

Modality	Example OpenRouter model IDs
Text + images	`openai/gpt-4o`, `anthropic/claude-3-5-sonnet`, `google/gemini-2.0-flash`
Text + audio	`openai/gpt-4o-audio-preview`
Text + images + audio	`openai/gpt-4o-audio-preview`

Set the model on the server side through ModelConfig or the LLM_MODEL environment variable:

python

from swarmforge.api import create_swarm_app
from swarmforge.evaluation.provider import ModelConfig

app = create_swarm_app(
    MY_SWARM,
    default_model_config=ModelConfig(
        provider="openrouter",
        model="openai/gpt-4o",
    ),
)

The HTTP request payload does not override the server-side model. Configure the correct model on the server before sending multimodal requests.
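Because a request cannot change the model, it can help to check before sending that the configured model covers the modalities a payload uses. The sketch below shows one way to do that. The capability map is copied from the example table above and is illustrative only, and required_modalities and model_supports are hypothetical helpers, not swarmforge APIs.

```python
# Content part type -> modality, following the supported part types listed earlier.
PART_MODALITY = {"text": "text", "image_url": "image", "input_audio": "audio"}

# Illustrative capability map taken from the example table; not an authoritative list.
MODEL_MODALITIES = {
    "openai/gpt-4o": {"text", "image"},
    "anthropic/claude-3-5-sonnet": {"text", "image"},
    "google/gemini-2.0-flash": {"text", "image"},
    "openai/gpt-4o-audio-preview": {"text", "image", "audio"},
}


def required_modalities(user_input) -> set:
    """Return the modalities a user_input value needs (plain string or list of parts)."""
    if isinstance(user_input, str):
        return {"text"}
    needed = set()
    for part in user_input:
        part_type = part.get("type")
        if part_type not in PART_MODALITY:
            raise ValueError(f"unsupported content part type: {part_type!r}")
        needed.add(PART_MODALITY[part_type])
    return needed


def model_supports(model: str, user_input) -> bool:
    """Check the payload against the map; unknown models are treated as text-only."""
    return required_modalities(user_input) <= MODEL_MODALITIES.get(model, {"text"})


mixed = [
    {"type": "text", "text": "Describe the image and transcribe the audio."},
    {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    {"type": "input_audio", "input_audio": {"data": "<BASE64_DATA>", "format": "mp3"}},
]
ok_for_vision_model = model_supports("openai/gpt-4o", mixed)
ok_for_audio_model = model_supports("openai/gpt-4o-audio-preview", mixed)
```

Treating unknown models as text-only is a conservative default for this sketch; a real deployment would source capabilities from the provider's model listing.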
