Building a Server AG-UI Endpoint

For agents that run server-side. If you can use a client adapter, you don't need this.

Most integrations skip this. If you can call your AI provider with one of the SDK's client adapters, do that — Choosing Your Integration shows the no-server path. Build an endpoint only when your agent must run server-side: it holds private tools, data, or keys, or you're wrapping a pre-existing agent. You expose one HTTP endpoint that streams AG-UI events as Server-Sent Events, and point the widget at it with sseStream.

The request your endpoint receives

sseStream sends a POST whose JSON body nests your CustomerAIParams under a params key — nested, not spread. Read body.params:

Request body

// POST to your endpoint, from sseStream("/api/agent", { params })
{
  "params": {
    "system": "…BranderUX instructions…",
    "messages": [{ "role": "user", "content": "…" }],
    "tools": { "anthropic": [], "openai": [], "gemini": [] },
    "max_tokens": 4000
  }
}

A backend that reads body.system / body.messages (assuming the params are spread at the top level) gets undefined and runs with no system prompt or tools — the screen renders empty. Always read body.params.
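One way to catch the spread-versus-nested mistake early is to validate the body before calling the provider. The sketch below is a hedged example, not part of the SDK: readParams and the CustomerAIParams interface are hypothetical names, and the interface lists only the fields shown in the request body above.

```typescript
// Hypothetical shape: only the fields shown in the request body above.
interface CustomerAIParams {
  system?: string;
  messages: { role: string; content: unknown }[];
  tools?: { anthropic?: unknown[]; openai?: unknown[]; gemini?: unknown[] };
  max_tokens?: number;
}

// Returns body.params, or throws a descriptive error when params is missing
// (for example because the caller assumed the fields were spread at the top level).
function readParams(body: unknown): CustomerAIParams {
  if (typeof body !== "object" || body === null) {
    throw new Error("Request body must be a JSON object");
  }
  const params = (body as { params?: unknown }).params;
  if (typeof params !== "object" || params === null) {
    throw new Error("Missing body.params: the params are nested, not spread");
  }
  const messages = (params as { messages?: unknown }).messages;
  if (!Array.isArray(messages)) {
    throw new Error("body.params.messages must be an array");
  }
  return params as CustomerAIParams;
}
```

In a route handler you would call readParams(await req.json()) and return a 400 response when it throws, so a misread body fails loudly instead of rendering an empty screen.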

What you send back

Respond with Content-Type: text/event-stream, emit each AG-UI event as a data: {json} line followed by a blank line, and finish with data: [DONE]. A run starts with RUN_STARTED, emits any mix of tool-call and text events, and ends with RUN_FINISHED (or RUN_ERROR on failure).

Event	Key fields
`RUN_STARTED`	runId?, threadId?
`TEXT_MESSAGE_START`	messageId?, role?
`TEXT_MESSAGE_CONTENT`	delta — text to append
`TEXT_MESSAGE_END`	messageId?
`TOOL_CALL_START`	toolCallId, toolCallName
`TOOL_CALL_ARGS`	toolCallId, delta — JSON fragment
`TOOL_CALL_END`	toolCallId
`RUN_FINISHED`	runId?
`RUN_ERROR`	message
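
To make the wire format concrete, here is a minimal sketch of the framing for a text-only run. It assumes nothing beyond the event names and fields in the table above. AgUiEvent, sseFrame, SSE_DONE, and textRun are hypothetical names for illustration, not SDK exports.

```typescript
// Minimal event shape for this sketch: a type tag plus whatever fields the event carries.
type AgUiEvent = { type: string; [field: string]: unknown };

// One SSE frame per event: a "data:" line terminated by a blank line.
function sseFrame(event: AgUiEvent): string {
  return "data: " + JSON.stringify(event) + "\n\n";
}

// Terminator sent after the last event of the run.
const SSE_DONE = "data: [DONE]\n\n";

// Frames a complete text-only run: RUN_STARTED, one text message, RUN_FINISHED, [DONE].
function textRun(runId: string, chunks: string[]): string {
  const events: AgUiEvent[] = [
    { type: "RUN_STARTED", runId },
    { type: "TEXT_MESSAGE_START", role: "assistant" },
    ...chunks.map((delta) => ({ type: "TEXT_MESSAGE_CONTENT", delta })),
    { type: "TEXT_MESSAGE_END" },
    { type: "RUN_FINISHED", runId },
  ];
  return events.map((e) => sseFrame(e)).join("") + SSE_DONE;
}
```

A real endpoint streams each frame as it is produced rather than joining them into one string. The joined form here only shows the order and framing.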

Node / Next.js

On Node you can reuse the SDK's translator — openaiStream (and anthropicStream / geminiStream) are framework-agnostic async generators that turn a provider stream into AG-UI events. Wrap them in an SSE response:

app/api/agent/route.ts

import { openaiStream } from "@brander/sdk";
import OpenAI from "openai";

const openai = new OpenAI();

// Next.js App Router route handler
export async function POST(req: Request) {
  const { params } = await req.json(); // the body is { params: {...} }

  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      ...(params.system ? [{ role: "system", content: params.system }] : []),
      ...params.messages,
    ],
    tools: params.tools?.openai,
    max_tokens: params.max_tokens ?? 4000,
    stream: true,
  });

  const encoder = new TextEncoder();
  const stream = new ReadableStream({
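    async start(controller) {
      const send = (e: unknown) =>
        controller.enqueue(encoder.encode("data: " + JSON.stringify(e) + "\n\n"));
      try {
        // Reuse the SDK's translator: it yields AG-UI events from a provider stream
        for await (const event of openaiStream(completion)) send(event);
      } catch (err) {
        // If the provider stream fails mid-run, end the run with RUN_ERROR
        send({ type: "RUN_ERROR", message: err instanceof Error ? err.message : String(err) });
      } finally {
        controller.enqueue(encoder.encode("data: [DONE]\n\n"));
        controller.close();
      }
    },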
  });

  return new Response(stream, {
    headers: { "Content-Type": "text/event-stream", "Cache-Control": "no-cache" },
  });
}

Python / FastAPI

There's no Python SDK, so translate the provider stream yourself. This reference mirrors the SDK's OpenAI translation — drop it into a FastAPI route:

agent.py

import json, time
from fastapi import Request
from fastapi.responses import StreamingResponse
from openai import OpenAI

client = OpenAI()

# Mirrors the SDK's OpenAI -> AG-UI translation (openaiStream)
def openai_to_agui(stream):
    run_id = f"run-{int(time.time() * 1000)}"
    yield {"type": "RUN_STARTED", "runId": run_id}
    text_open = False
    tool_ids = {}  # tool_call index -> toolCallId
    for chunk in stream:
        if not chunk.choices:  # some chunks (e.g. usage-only) carry no choices
            continue
        choice = chunk.choices[0]
        delta = choice.delta
        for tc in (delta.tool_calls or []):
            if tc.id:  # first fragment of a tool call carries id + name
                tool_ids[tc.index] = tc.id
                yield {"type": "TOOL_CALL_START", "toolCallId": tc.id,
                       "toolCallName": tc.function.name}
            if tc.function and tc.function.arguments:
                yield {"type": "TOOL_CALL_ARGS", "toolCallId": tool_ids[tc.index],
                       "delta": tc.function.arguments}
        if delta.content:
            if not text_open:
                text_open = True
                yield {"type": "TEXT_MESSAGE_START", "role": "assistant"}
            yield {"type": "TEXT_MESSAGE_CONTENT", "delta": delta.content}
        if choice.finish_reason == "tool_calls":
            for tid in tool_ids.values():
                yield {"type": "TOOL_CALL_END", "toolCallId": tid}
        elif choice.finish_reason == "stop" and text_open:
            yield {"type": "TEXT_MESSAGE_END"}
            text_open = False
    yield {"type": "RUN_FINISHED", "runId": run_id}

async def agent(request: Request):
    params = (await request.json())["params"]  # nested, not spread
    messages = ([{"role": "system", "content": params["system"]}]
                if params.get("system") else []) + params["messages"]
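    def sse():
        # A sync generator: Starlette iterates it in a threadpool, so the
        # blocking OpenAI client does not stall the event loop.
        try:
            stream = client.chat.completions.create(
                model="gpt-4o",
                messages=messages,
                tools=(params.get("tools") or {}).get("openai"),
                max_tokens=params.get("max_tokens", 4000),
                stream=True,
            )
            for event in openai_to_agui(stream):
                yield "data: " + json.dumps(event) + "\n\n"
        except Exception as exc:  # surface provider failures as RUN_ERROR
            error = {"type": "RUN_ERROR", "message": str(exc)}
            yield "data: " + json.dumps(error) + "\n\n"
        yield "data: [DONE]\n\n"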

    return StreamingResponse(sse(), media_type="text/event-stream")

Reasoning models (OpenAI o-series / gpt-5*) reject max_tokens — send max_completion_tokens instead. See “Models & parameters” in the package README.
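
If your endpoint serves more than one model, you can choose the parameter name per request. The following is a rough sketch: isReasoningModel and tokenLimit are hypothetical helpers, and detecting reasoning models by name prefix is an assumption you should adapt to the models you actually use.

```typescript
// ASSUMPTION: reasoning models are recognised by name prefix
// ("o" followed by a digit, or "gpt-5"). Adjust to your own model list.
function isReasoningModel(model: string): boolean {
  return /^o\d/.test(model) || model.startsWith("gpt-5");
}

type TokenLimit = { max_tokens: number } | { max_completion_tokens: number };

// Returns the token-limit field to spread into the chat completion request.
function tokenLimit(model: string, limit: number): TokenLimit {
  return isReasoningModel(model)
    ? { max_completion_tokens: limit }
    : { max_tokens: limit };
}
```

Spread the result into the request in place of the fixed max_tokens field, for example ...tokenLimit(model, params.max_tokens ?? 4000).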

Fetch data before you render

In Fixed Screens mode your model is handed the top screen tools. If it calls one before it has the data, the screen renders empty. Make sure your agent fetches its data first, then renders. If your agent has its own data tools, run them in a first pass, then expose the screen tool in a second pass with tool_choice forcing the render. BranderUX also injects the current date and a “never fabricate” instruction to reduce empty or placeholder screens — see Fixed Screens vs Flexible.
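
The second pass can be sketched as a request builder that exposes only the chosen screen tool and forces it with tool_choice. This assumes the OpenAI Chat Completions tool format. buildRenderPass and the simplified ChatMessage and FunctionTool types are hypothetical, and "fetched" stands for whatever messages your first pass produced (assistant tool calls and their results).

```typescript
// Simplified stand-ins for the OpenAI SDK types, for illustration only.
type ChatMessage = { role: string; [field: string]: unknown };
type FunctionTool = {
  type: "function";
  function: { name: string; description?: string; parameters?: unknown };
};

interface RenderPassRequest {
  messages: ChatMessage[];
  tools: FunctionTool[];
  tool_choice: { type: "function"; function: { name: string } };
}

// Pass 2: conversation history plus the data fetched in pass 1,
// with only the screen tool exposed and forced via tool_choice.
function buildRenderPass(
  history: ChatMessage[],
  fetched: ChatMessage[],
  screenTools: FunctionTool[],
  screenName: string,
): RenderPassRequest {
  const tool = screenTools.find((t) => t.function.name === screenName);
  if (!tool) throw new Error(`Unknown screen tool: ${screenName}`);
  return {
    messages: [...history, ...fetched],
    tools: [tool],
    tool_choice: { type: "function", function: { name: screenName } },
  };
}
```

Because the render pass sees the fetched data in its messages and has no other tool to call, the model cannot render before the data exists.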

Authenticating the endpoint

sseStream uses its own fetch and does not pass through your app's axios/HTTP interceptors, so attach auth explicitly via the headers option:

App.tsx

sseStream("/api/agent", {
  params,
  headers: { Authorization: `Bearer ${token}` },
});
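
On the server, the endpoint still has to read and check that header. A minimal sketch of the parsing half, assuming the Bearer scheme shown above, follows. bearerToken is a hypothetical helper, and validating the token itself depends on your auth system and is left out.

```typescript
// Extracts the token from an "Authorization: Bearer <token>" header value.
// Returns null when the header is missing or uses a different scheme.
function bearerToken(header: string | null | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  return match?.[1] ?? null;
}
```

In the route handler, call bearerToken(req.headers.get("authorization")) before reading the body, and return a 401 response when it yields null or when your own validation rejects the token.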