Chat Completions
The core endpoint for conversational AI. Send a list of messages and get a reply. Supports streaming, vision, function calling, and models from OpenAI, Anthropic, Google, DeepSeek, and other providers.
POST https://api.belugapi.com/v1/chat/completions
Requires Authorization: Bearer bapi_… header.
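The header requirement above can be sketched with the Python standard library alone. This is a minimal illustration, not an official client: `build_request` is a hypothetical helper, and the check that keys start with `bapi_` is an assumption inferred from the header format shown above.

```python
import json
import urllib.request

API_URL = "https://api.belugapi.com/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    # Assumption: every key carries the "bapi_" prefix shown in the docs.
    if not api_key.startswith("bapi_"):
        raise ValueError("expected an API key starting with 'bapi_'")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "bapi_your_key_here",
    {"model": "gpt-5.4", "messages": [{"role": "user", "content": "Hello"}]},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without a valid key.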
Request parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | required | Model ID slug — see table below. |
| messages | array | required | Array of {role, content} objects. Roles: system, user, assistant. |
| stream | boolean | optional | If true, tokens stream as SSE events. Default: false. |
| max_tokens | integer | optional | Maximum tokens to generate in the response. |
| temperature | number | optional | Sampling temperature 0–2. Higher = more creative. Default: 1. |
| top_p | number | optional | Nucleus sampling probability. Default: 1. |
| tools | array | optional | Function definitions for function calling (tool use). |
| tool_choice | string or object | optional | "auto", "none", or {"type":"function","function":{"name":"…"}}. |
| response_format | object | optional | {"type":"json_object"} or {"type":"json_schema",…} for structured output. |
| stop | string or array | optional | Up to 4 sequences where generation stops. |
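| frequency_penalty | number | optional | -2.0 to 2.0. Positive values penalise tokens in proportion to how often they have already appeared. |
| presence_penalty | number | optional | -2.0 to 2.0. Positive values penalise any token that has already appeared, regardless of how often. |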
| seed | integer | optional | For deterministic outputs (model dependent). |
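The ranges in the table can be checked on the client before a request is sent. The following is a rough sketch under that assumption; `build_chat_payload` is a hypothetical helper rather than part of any SDK, and it only enforces the limits stated in the table above.

```python
from typing import Any


def build_chat_payload(model: str, messages: list, **options: Any) -> dict:
    """Return a request body, raising ValueError on out-of-range options."""
    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    for name in ("frequency_penalty", "presence_penalty"):
        value = options.get(name)
        if value is not None and not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    stop = options.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    # Drop unset options so the documented defaults apply.
    payload = {"model": model, "messages": messages}
    payload.update({key: value for key, value in options.items() if value is not None})
    return payload


payload = build_chat_payload(
    "gpt-5.4",
    [{"role": "user", "content": "Explain quantum entanglement in 3 sentences."}],
    temperature=0.7,
    max_tokens=256,
)
print(payload)
```

The resulting dictionary can be passed as keyword arguments to an OpenAI-compatible client or serialised as the JSON body of a raw HTTP request.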
Basic example
```python
from openai import OpenAI

client = OpenAI(
    api_key="bapi_your_key_here",
    base_url="https://api.belugapi.com/v1",
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum entanglement in 3 sentences."},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(response.choices[0].message.content)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "bapi_your_key_here",
  baseURL: "https://api.belugapi.com/v1",
});

const res = await client.chat.completions.create({
  model: "gpt-5.4",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Explain quantum entanglement in 3 sentences." },
  ],
  max_tokens: 256,
  temperature: 0.7,
});
console.log(res.choices[0].message.content);
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Explain quantum entanglement in 3 sentences."}
    ],
    "max_tokens": 256,
    "temperature": 0.7
  }'
```
Response object
200 OK
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1716900000,
  "model": "gpt-5.4",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Quantum entanglement is a phenomenon…"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 74,
    "total_tokens": 102
  }
}
```
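As an illustration of reading this object, the sketch below pulls the reply text and token counts out of a parsed response. `summarize_completion` is a hypothetical helper, and treating a `finish_reason` of `"length"` as a `max_tokens` cut-off is an assumption borrowed from the OpenAI convention; only `"stop"` appears in the sample above.

```python
SAMPLE_RESPONSE = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1716900000,
    "model": "gpt-5.4",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "Quantum entanglement is a phenomenon…",
        },
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 28, "completion_tokens": 74, "total_tokens": 102},
}


def summarize_completion(response: dict) -> dict:
    """Extract the first choice's text plus usage figures."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # Assumption: "length" means the reply hit max_tokens (OpenAI convention).
        "truncated": choice["finish_reason"] == "length",
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }


summary = summarize_completion(SAMPLE_RESPONSE)
print(summary["text"])
print(summary["total_tokens"])
```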
Streaming (SSE)
Set "stream": true. Tokens arrive as data: {...} chunks followed by data: [DONE].
```python
stream = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Write a haiku about the sea."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```

```javascript
const stream = await client.chat.completions.create({
  model: "claude-opus-4-7",
  messages: [{ role: "user", content: "Write a haiku about the sea." }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  --no-buffer \
  -d '{
    "model": "claude-opus-4-7",
    "messages": [{"role":"user","content":"Write a haiku about the sea."}],
    "stream": true
  }'
```
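When the stream is consumed without an SDK, the `data:` lines have to be decoded by hand. The sketch below shows one way to do that; `iter_stream_text` is a hypothetical helper, the sample lines are hand-written for illustration, and the chunk shape (`choices[0].delta.content`) is assumed to match what the SDK examples above read.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = (choices[0].get("delta") or {}).get("content")
        if text:
            yield text


# Hand-written sample lines, not captured from the live API.
SAMPLE_LINES = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Salt "}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "wind"}}]}',
    "",
    "data: [DONE]",
]

print("".join(iter_stream_text(SAMPLE_LINES)))
```

In practice the `lines` argument would be the decoded line iterator of an HTTP response opened with streaming enabled.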
Vision (image input)
For models with vision support, pass an array of content items including image_url parts.
```python
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    }],
)
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
      ]
    }]
  }'
```
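For images that are not hosted at a public URL, OpenAI-compatible APIs commonly accept a base64 `data:` URL in the same `image_url` field. Whether this endpoint accepts data URLs is an assumption worth verifying; the helpers below (`image_part_from_bytes`, `vision_message`) are hypothetical and only build the message structure.

```python
import base64


def image_part_from_bytes(data: bytes, mime_type: str = "image/jpeg") -> dict:
    """Wrap raw image bytes as an image_url content part using a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def vision_message(prompt: str, image_part: dict) -> dict:
    """Combine a text prompt and one image part into a user message."""
    return {
        "role": "user",
        "content": [{"type": "text", "text": prompt}, image_part],
    }


# Placeholder bytes; a real call would read them from an image file.
part = image_part_from_bytes(b"\x89PNG", mime_type="image/png")
message = vision_message("What is in this image?", part)
print(message["content"][1]["image_url"]["url"])
```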
Function calling (tools)
```python
response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "What is the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
)
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role":"user","content":"What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": { "city": {"type":"string"} },
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'
```
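The request above only declares the tool; the application still has to run it when the model asks. The sketch below shows one possible dispatch step. It assumes the response follows the OpenAI shape (an assistant message with `tool_calls`, each carrying an `id` and a JSON-encoded `arguments` string) and that results are returned as `tool` role messages; neither is confirmed on this page. `get_weather` here is a stand-in that returns canned data.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local tool; a real one would query a weather service."""
    return {"city": city, "forecast": "unknown"}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested tool and build the follow-up tool messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        handler = TOOL_REGISTRY[function["name"]]
        arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
        output = handler(**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results


# Hand-written sample of an assistant message requesting a tool call.
SAMPLE_ASSISTANT_MESSAGE = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
    }],
}

print(run_tool_calls(SAMPLE_ASSISTANT_MESSAGE))
```

Under the same assumptions, the assistant message and the returned tool messages would be appended to `messages` and sent in a second request so the model can produce its final answer.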
Available LLM models
Use any of these IDs in the model field. All models are OpenAI-compatible.
| Model ID | Name | Provider | Type | Stream | Vision |
|---|---|---|---|---|---|
| gpt-5.2-codex | GPT-5.2 Codex | OpenAI | code | ✓ | — |
| gpt-5.2-pro | GPT-5.2 Pro | OpenAI | chat | ✓ | ✓ |
| gpt-5.2-chat-latest | GPT-5.2 Chat Latest | OpenAI | chat | ✓ | ✓ |
| gpt-5.2 | GPT-5.2 | OpenAI | chat | ✓ | ✓ |
| gpt-5.1-chat-latest | GPT-5.1 Chat Latest | OpenAI | chat | ✓ | ✓ |
| gpt-5.1 | GPT-5.1 | OpenAI | chat | ✓ | ✓ |
| gpt-5.1-2025-11-13 | GPT-5.1 (2025-11-13) | OpenAI | chat | ✓ | ✓ |
| gpt-5.1-codex | GPT-5.1 Codex | OpenAI | code | ✓ | — |
| gpt-5.1-codex-mini | GPT-5.1 Codex Mini | OpenAI | code | ✓ | — |
| gpt-5-pro-2025-10-06 | GPT-5 Pro | OpenAI | chat | ✓ | ✓ |
| gpt-5-2025-08-07 | GPT-5 | OpenAI | chat | ✓ | ✓ |
| gpt-5-mini-2025-08-07 | GPT-5 Mini | OpenAI | chat | ✓ | ✓ |
| gpt-5-nano-2025-08-07 | GPT-5 Nano | OpenAI | chat | ✓ | — |
| gpt-5-search-api | GPT-5 Search API | OpenAI | chat | ✓ | — |
| gpt-5-search-api-2025-10-14 | GPT-5 Search API (2025-10-14) | OpenAI | chat | ✓ | — |
| gpt-4o-transcribe | GPT-4o Transcribe | OpenAI | audio | ✓ | — |
| gpt-4o-mini-transcribe | GPT-4o Mini Transcribe | OpenAI | audio | ✓ | — |
| gpt-4.1-mini-2025-04-14 | GPT-4.1 Mini | OpenAI | chat | ✓ | ✓ |
| gpt-4.1-nano-2025-04-14 | GPT-4.1 Nano | OpenAI | chat | ✓ | — |
| gpt-4-1106-preview | GPT-4 Turbo (1106 Preview) | OpenAI | chat | ✓ | ✓ |
| gpt-3.5-turbo-16k | GPT-3.5 Turbo 16K | OpenAI | chat | ✓ | — |
| o3-2025-04-16 | o3 | OpenAI | reasoning | ✓ | ✓ |
| o4-mini-2025-04-16 | o4 Mini | OpenAI | reasoning | ✓ | ✓ |
| o3-mini-2025-01-31 | o3 Mini | OpenAI | reasoning | ✓ | — |
| o1-2024-12-17 | o1 | OpenAI | reasoning | ✓ | ✓ |
| o1-mini-2024-09-12 | o1 Mini | OpenAI | reasoning | ✓ | — |
| claude-opus-4-7 | Claude Opus 4.7 | Anthropic | chat | ✓ | ✓ |
| claude-opus-4-6 | Claude Opus 4.6 | Anthropic | chat | ✓ | ✓ |
| claude-opus-4-6-thinking | Claude Opus 4.6 Thinking | Anthropic | reasoning | ✓ | ✓ |
| claude-opus-4-5-20251101 | Claude Opus 4.5 | Anthropic | chat | ✓ | ✓ |
| claude-opus-4-5-20251101-thinking | Claude Opus 4.5 Thinking | Anthropic | reasoning | ✓ | ✓ |
| claude-sonnet-4-6 | Claude Sonnet 4.6 | Anthropic | chat | ✓ | ✓ |
| claude-sonnet-4-6-thinking | Claude Sonnet 4.6 Thinking | Anthropic | reasoning | ✓ | ✓ |
| claude-sonnet-4-5-20250929 | Claude Sonnet 4.5 | Anthropic | chat | ✓ | ✓ |
| claude-sonnet-4-5-20250929-thinking | Claude Sonnet 4.5 Thinking | Anthropic | reasoning | ✓ | ✓ |
| claude-haiku-4-5-20251001 | Claude Haiku 4.5 | Anthropic | chat | ✓ | ✓ |
| claude-haiku-4-5-20251001-thinking | Claude Haiku 4.5 Thinking | Anthropic | reasoning | ✓ | ✓ |
| claude-3-7-sonnet-20250219-thinking | Claude 3.7 Sonnet Thinking | Anthropic | reasoning | ✓ | ✓ |
| gemini-3.1-pro-preview | Gemini 3.1 Pro Preview | Google | chat | ✓ | ✓ |
| gemini-3-pro-preview | Gemini 3 Pro Preview | Google | chat | ✓ | ✓ |
| gemini-3-pro-preview-thinking | Gemini 3 Pro Preview Thinking | Google | reasoning | ✓ | ✓ |
| gemini-3-flash-preview | Gemini 3 Flash Preview | Google | chat | ✓ | ✓ |
| gemini-3-flash-preview-nothinking | Gemini 3 Flash Preview (No Thinking) | Google | chat | ✓ | ✓ |
| gemini-2.5-pro | Gemini 2.5 Pro | Google | chat | ✓ | ✓ |
| gemini-2.5-pro-thinking | Gemini 2.5 Pro Thinking | Google | reasoning | ✓ | ✓ |
| gemini-2.5-pro-nothinking | Gemini 2.5 Pro (No Thinking) | Google | chat | ✓ | ✓ |
| gemini-2.5-flash | Gemini 2.5 Flash | Google | chat | ✓ | ✓ |
| gemini-2.5-flash-thinking | Gemini 2.5 Flash Thinking | Google | reasoning | ✓ | ✓ |
| gemini-2.5-flash-nothinking | Gemini 2.5 Flash (No Thinking) | Google | chat | ✓ | ✓ |
| gemini-2.5-flash-lite | Gemini 2.5 Flash Lite | Google | chat | ✓ | ✓ |
| gemini-2.0-flash | Gemini 2.0 Flash | Google | chat | ✓ | ✓ |
| deepseek-v3.2 | DeepSeek V3.2 | DeepSeek | chat | ✓ | — |
| deepseek-v3.2-exp | DeepSeek V3.2 Experimental | DeepSeek | chat | ✓ | — |
| deepseek-v3.1-terminus | DeepSeek V3.1 Terminus | DeepSeek | chat | ✓ | — |
| deepseek-v3-0324 | DeepSeek V3 | DeepSeek | chat | ✓ | — |
| deepseek-r1-250528 | DeepSeek R1 | DeepSeek | reasoning | ✓ | — |
| deepseek-r1-0528 | DeepSeek R1 (0528) | DeepSeek | reasoning | ✓ | — |
| deepseek-ocr | DeepSeek OCR | DeepSeek | vision | ✓ | ✓ |
| glm-5.1 | GLM-5.1 | Zhipu | chat | ✓ | ✓ |
| glm-4.7 | GLM-4.7 | Zhipu | chat | ✓ | ✓ |
| glm-4.6 | GLM-4.6 | Zhipu | chat | ✓ | ✓ |
| minimax-m2.1 | MiniMax M2.1 | MiniMax | chat | ✓ | ✓ |
| kimi-k2-instruct | Kimi K2 Instruct | Moonshot AI | chat | ✓ | ✓ |
| kimi-k2-thinking | Kimi K2 Thinking | Moonshot AI | reasoning | ✓ | ✓ |
| llama3.1-8b | Meta Llama 3.1 8B | BelugAPI | chat | ✓ | — |
| gpt-oss-120b | OpenAI GPT OSS 120B | BelugAPI | chat | ✓ | — |
| qwen-3-235b-a22b-instruct-2507 | Qwen 3 235B Instruct | BelugAPI | chat | ✓ | — |
| zai-glm-4.7 | Z.ai GLM 4.7 | BelugAPI | chat | ✓ | — |
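Because the Type and Vision columns differ per model, an application may want to pick a model ID by capability rather than hard-coding one. The sketch below does this over a small hand-copied subset of the rows above; `ModelInfo`, `CATALOG`, and `find_models` are hypothetical names, and the list would need to be kept in step with the table manually.

```python
from typing import List, NamedTuple, Optional


class ModelInfo(NamedTuple):
    model_id: str
    provider: str
    kind: str      # the "Type" column: chat, code, reasoning, ...
    vision: bool   # True where the Vision column shows a tick


# A few rows copied from the table above, not the full catalogue.
CATALOG = [
    ModelInfo("gpt-5.2", "OpenAI", "chat", True),
    ModelInfo("o3-mini-2025-01-31", "OpenAI", "reasoning", False),
    ModelInfo("claude-opus-4-7", "Anthropic", "chat", True),
    ModelInfo("claude-sonnet-4-6-thinking", "Anthropic", "reasoning", True),
    ModelInfo("deepseek-r1-250528", "DeepSeek", "reasoning", False),
    ModelInfo("llama3.1-8b", "BelugAPI", "chat", False),
]


def find_models(
    kind: Optional[str] = None,
    provider: Optional[str] = None,
    needs_vision: bool = False,
) -> List[str]:
    """Return model IDs matching every filter that was supplied."""
    return [
        model.model_id
        for model in CATALOG
        if (kind is None or model.kind == kind)
        and (provider is None or model.provider == provider)
        and (not needs_vision or model.vision)
    ]


print(find_models(kind="reasoning", needs_vision=True))
```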
Examples by model
Claude (Anthropic)
```python
response = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Write a Python function to reverse a linked list."}],
    max_tokens=1024,
)
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4-7",
    "messages": [{"role":"user","content":"Write a Python function to reverse a linked list."}],
    "max_tokens": 1024
  }'
```
Gemini (Google)
```python
response = client.chat.completions.create(
    model="gemini-3-flash-preview",  # fast & cheap
    messages=[{"role": "user", "content": "Summarise the French Revolution in 5 points."}],
)
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-preview",
    "messages": [{"role":"user","content":"Summarise the French Revolution in 5 points."}]
  }'
```
DeepSeek
```python
# DeepSeek R1: reasoning model
response = client.chat.completions.create(
    model="deepseek-r1-250528",
    messages=[{"role": "user", "content": "Solve: if 2x+3=11, what is x?"}],
)
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-r1-250528",
    "messages": [{"role":"user","content":"Solve: if 2x+3=11, what is x?"}]
  }'
```
BelugAPI Edge (high-throughput)
Edge models (llama3.1-8b, gpt-oss-120b, qwen-3-235b-a22b-instruct-2507) run on BelugAPI's own infrastructure for ultra-low latency and throughput of up to 3,000 tokens/s.

```python
# Llama 3.1 8B: fastest model, ~2200 tok/s
response = client.chat.completions.create(
    model="llama3.1-8b",
    messages=[{"role": "user", "content": "Hello! What can you do?"}],
    stream=True,
)
```

```bash
curl https://api.belugapi.com/v1/chat/completions \
  -H "Authorization: Bearer bapi_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1-8b",
    "messages": [{"role":"user","content":"Hello! What can you do?"}],
    "stream": true
  }'
```
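To put the quoted throughput figures in perspective, the back-of-the-envelope sketch below converts a token rate into generation time. `estimated_generation_seconds` is a hypothetical helper; it covers decoding time only and ignores network latency and time to first token, so real wall-clock times will be higher.

```python
def estimated_generation_seconds(completion_tokens: int, tokens_per_second: float) -> float:
    """Estimate decoding time for a reply at a steady token rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return completion_tokens / tokens_per_second


# A 256-token reply at the ~2200 tok/s quoted for llama3.1-8b.
print(round(estimated_generation_seconds(256, 2200), 3))
```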