OpenAI API Server¶

The OpenAI API server provides an OpenAI-compatible API that makes your AgentPool agents accessible through the standard OpenAI API format. This enables integration with any tool or library that supports the OpenAI API.

Overview¶

The server implements the following endpoints of the OpenAI API specification, exposing each agent as a model:

GET  /v1/models              -> List available agents
POST /v1/chat/completions    -> Chat completions (streaming supported)
POST /v1/responses           -> Responses API

Quick Start¶

# Run with default settings
agentpool serve-api config.yml

# Custom host and port
agentpool serve-api config.yml --host 0.0.0.0 --port 8000

See serve-api for all CLI options.

Programmatic Usage¶

import anyio
from agentpool import Agent, AgentPool
from agentpool_server.openai_api_server import OpenAIAPIServer


async def main():
    pool = AgentPool()
    await pool.add_agent(Agent("gpt-4-custom", model="openai:gpt-4"))

    server = OpenAIAPIServer(
        pool,
        host="0.0.0.0",
        port=8000,
        api_key="your-secret-key",  # Optional authentication
        cors=True,
        docs=True,
    )

    async with server, server.run_context():
        await anyio.sleep_forever()


anyio.run(main)

API Endpoints¶

List Models¶

GET /v1/models

Returns available agents as OpenAI-compatible models:

{
  "object": "list",
  "data": [
    {
      "id": "assistant",
      "object": "model",
      "created": 0,
      "owned_by": "agentpool"
    },
    {
      "id": "coder",
      "object": "model",
      "created": 0,
      "owned_by": "agentpool"
    }
  ]
}
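A client can reduce this payload to a plain list of agent names. The sketch below works only on the JSON shown above; `agent_ids` is a hypothetical helper for illustration, not part of AgentPool.

```python
import json

# Sample payload copied from the response shown above.
MODELS_JSON = """
{
  "object": "list",
  "data": [
    {"id": "assistant", "object": "model", "created": 0, "owned_by": "agentpool"},
    {"id": "coder", "object": "model", "created": 0, "owned_by": "agentpool"}
  ]
}
"""


def agent_ids(payload: dict) -> list[str]:
    """Return the IDs of all entries whose object type is "model"."""
    return [
        item["id"]
        for item in payload.get("data", [])
        if item.get("object") == "model"
    ]


names = agent_ids(json.loads(MODELS_JSON))
print(names)
```

The returned names are the values a client would pass as "model" in later requests.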

Chat Completions¶

POST /v1/chat/completions

Requests use the standard OpenAI chat completions format:

{
  "model": "assistant",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": false
}

Response:

{
  "id": "msg_123",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "assistant",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 15,
    "total_tokens": 25
  }
}
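When calling the endpoint without the OpenAI client, the reply text and token count can be read straight from this structure. The following is a minimal sketch that assumes the response has exactly the shape shown above; `extract_reply` is a hypothetical helper name.

```python
# Sample response copied from the example above, as a Python dict.
completion = {
    "id": "msg_123",
    "object": "chat.completion",
    "created": 1234567890,
    "model": "assistant",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! How can I help you today?",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 10, "completion_tokens": 15, "total_tokens": 25},
}


def extract_reply(payload: dict) -> tuple[str, int]:
    """Return (assistant text, total tokens) from a chat.completion payload."""
    first_choice = payload["choices"][0]
    text = first_choice["message"]["content"]
    # Fall back to 0 if the server omits usage information.
    total_tokens = payload.get("usage", {}).get("total_tokens", 0)
    return text, total_tokens


reply, tokens = extract_reply(completion)
print(reply, tokens)
```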

Streaming¶

Set "stream": true to receive the response as server-sent events. Each data: line carries one chunk, and the stream ends with data: [DONE]:

data: {"id":"msg_123","choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"msg_123","choices":[{"delta":{"content":"!"}}]}
data: [DONE]
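For clients that read the raw event stream instead of using an SDK, the chunks can be reassembled by hand. This is a minimal sketch that assumes the chunk shape shown above; `collect_stream_text` is a hypothetical helper, and a real client would feed it lines read from the HTTP response body.

```python
import json
from collections.abc import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate the delta contents of an SSE chat completion stream."""
    parts: list[str] = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# The three events from the example above.
SAMPLE = [
    'data: {"id":"msg_123","choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"id":"msg_123","choices":[{"delta":{"content":"!"}}]}',
    "data: [DONE]",
]
text = collect_stream_text(SAMPLE)
print(text)  # Hello!
```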

Responses API¶

POST /v1/responses

The OpenAI Responses API request format is also accepted:

{
  "model": "assistant",
  "input": "Tell me a story about a robot."
}

Authentication¶

The server supports optional API key authentication:

server = OpenAIAPIServer(
    pool,
    api_key="your-secret-key"
)

Clients must include the key in the Authorization header:

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "assistant", "messages": [{"role": "user", "content": "Hi"}]}'
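The same request can be built with the Python standard library. The sketch below only constructs the request so the header can be inspected; `build_chat_request` is a hypothetical helper, and the URL and key are the placeholder values used in this page.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build an authenticated POST request for /chat/completions."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # the API key travels as a bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000/v1", "your-secret-key", "assistant", "Hi"
)
print(request.get_method(), request.full_url)
# With a server running, send it with: urllib.request.urlopen(request)
```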

Client Integration¶

OpenAI Python Client¶

Use the official OpenAI Python client with your AgentPool server:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-secret-key"
)

response = client.chat.completions.create(
    model="assistant",  # Your agent name
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

Streaming with OpenAI Client¶

stream = client.chat.completions.create(
    model="assistant",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

LangChain Integration¶

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-secret-key",
    model="assistant"
)

response = llm.invoke("What is the meaning of life?")

curl Examples¶

# List models
curl http://localhost:8000/v1/models \
  -H "Authorization: Bearer your-secret-key"

# Chat completion
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assistant",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# Streaming
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer your-secret-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "assistant",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'

Configuration¶

Agent Configuration¶

# config.yml
agents:
  # Agents become models with their names as IDs
  gpt-4-custom:
    type: chat
    model: openai:gpt-4
    system_prompt: "You are a helpful assistant."

  claude-coder:
    type: claude_code
    model: anthropic:claude-sonnet-4-20250514
    tools:
      - type: file_access
      - type: process_management

Agents are accessible as models: gpt-4-custom, claude-coder, etc.

API Documentation¶

When docs=True (default), the server provides interactive API documentation:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Use Cases¶

Drop-in Replacement¶

Replace OpenAI API calls with your own agents by changing only the client's base URL and API key:

# Before: Using OpenAI directly
client = OpenAI()

# After: Using your AgentPool server
client = OpenAI(base_url="http://localhost:8000/v1", api_key="key")

# Same API, your agents
response = client.chat.completions.create(
    model="my-agent",
    messages=[{"role": "user", "content": "Hello"}]
)

Local Development¶

Run agents locally for development without API costs:

agents:
  dev-assistant:
    model: ollama:llama3
    system_prompt: "You are a development assistant."

API Gateway¶

Expose multiple backend models through a unified API:

agents:
  fast:
    model: openai:gpt-4o-mini
  smart:
    model: anthropic:claude-sonnet-4-20250514
  local:
    model: ollama:llama3

Clients choose the appropriate model for their use case.
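One way a client might make that choice is a small routing function. The sketch below is purely illustrative: `pick_model`, the `offline` flag, and the length threshold are assumptions made for this example, and only the agent names `fast`, `smart`, and `local` come from the configuration above.

```python
def pick_model(
    prompt: str, *, offline: bool = False, long_prompt_chars: int = 2000
) -> str:
    """Pick one of the gateway's agent names for a prompt (illustrative heuristic)."""
    if offline:
        return "local"  # no external provider reachable: use the Ollama-backed agent
    if len(prompt) > long_prompt_chars:
        return "smart"  # long inputs go to the larger model
    return "fast"  # default to the cheaper, quicker model


print(pick_model("Summarize this sentence."))
print(pick_model("x" * 5000))
print(pick_model("Hello", offline=True))
```

The returned name is passed as the "model" field of the request, so the routing policy lives entirely on the client side.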