Errors & Rate Limits
BelugAPI follows the OpenAI error format exactly. All errors return a JSON object with an error key, even for HTTP 5xx responses when possible.
Error object
Error response structure
{
"error": {
"message": "The model 'foo' does not exist.",
"type": "invalid_request_error",
"code": "model_not_found",
"param": "model"
}
}
| Field | Type | Description |
|---|---|---|
| message | string | Human-readable description of the error. |
| type | string | Error category. See table below. |
| code | string or null | Machine-readable error code. May be null. |
| param | string or null | The request parameter that caused the error, if applicable. |
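The fields above can be read defensively on the client side. The following is a minimal sketch, not part of any official SDK: the `ApiError` dataclass and `parse_error` helper are hypothetical names, and the code assumes only the four fields listed in the table.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Client-side view of the documented error object (hypothetical helper type)."""
    message: str
    type: str
    code: Optional[str] = None   # documented as nullable
    param: Optional[str] = None  # documented as nullable


def parse_error(body: str) -> Optional[ApiError]:
    """Return an ApiError if the body has a top-level "error" object, else None."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return None  # e.g. a non-JSON 5xx page from an intermediary
    err = data.get("error") if isinstance(data, dict) else None
    if not isinstance(err, dict):
        return None
    return ApiError(
        message=err.get("message", ""),
        type=err.get("type", ""),
        code=err.get("code"),
        param=err.get("param"),
    )


sample = json.dumps({
    "error": {
        "message": "The model 'foo' does not exist.",
        "type": "invalid_request_error",
        "code": "model_not_found",
        "param": "model",
    }
})
parsed = parse_error(sample)
print(parsed.code, parsed.param)  # prints: model_not_found model
```

Returning None for bodies without an error key keeps the caller's fallback path explicit, which matters because 5xx responses carry the JSON shape only "when possible".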
HTTP status codes
| Status | Meaning | Common cause |
|---|---|---|
| 200 | OK | Request succeeded. |
| 400 | Bad Request | Missing or invalid parameter (e.g. no model, bad messages). |
| 401 | Unauthorized | Missing or invalid API key. |
| 402 | Payment Required | Insufficient balance. Top up in the dashboard. |
| 403 | Forbidden | API key does not have access to this endpoint or model. |
| 404 | Not Found | Model does not exist, or task ID not found. |
| 429 | Too Many Requests | Rate limit exceeded. Back off and retry. |
| 500 | Internal Server Error | Unexpected error on the BelugAPI side. Retry with exponential backoff. |
| 502 | Bad Gateway | Upstream provider returned an error or was unreachable. |
| 503 | Service Unavailable | Model catalog unavailable, or model temporarily down. |
| 504 | Gateway Timeout | Upstream provider timed out (e.g. image / video task polling timeout). |
Error types
| type | Description |
|---|---|
| invalid_request_error | Request is malformed — missing params, wrong model, wrong endpoint. |
| authentication_error | API key is missing, revoked, or malformed. |
| insufficient_quota | Account balance is too low to process the request. |
| rate_limit_error | Too many requests in a short period. |
| upstream_error | The underlying AI provider returned an error. |
| server_error | Internal BelugAPI error (catalog, database, etc.). |
| timeout | Operation timed out (image polling 5 min, upstream 10 min). |
Error codes
| code | HTTP | Description |
|---|---|---|
| missing_parameter | 400 | A required field was not provided. |
| invalid_parameter | 400 | A field value is not valid. |
| model_not_found | 404 | The model ID does not exist in the catalog. |
| wrong_endpoint | 400 | Model used on the wrong endpoint (e.g. video model on /chat/completions). |
| insufficient_balance | 402 | Account balance is below minimum threshold. |
| endpoint_not_allowed | 403 | API key restricted from this endpoint. |
| model_not_allowed | 403 | API key has an allow-list that doesn't include this model. |
| upstream_not_configured | 503 | Upstream provider credentials are missing (edge models only). |
| image_task_timeout | 504 | Image task polling exceeded 5-minute deadline. |
| catalog_missing | 503 | Model catalog JSON file is missing or unreadable. |
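One way to act on these codes is a small lookup that turns a code into a coarse next step. This is a sketch only: the status numbers are copied from the table above, while the `next_step` function and its category names ("retry", "top_up", "check_key", "fix_request", "inspect") are illustrative assumptions, not part of the API.

```python
from typing import Optional

# HTTP status for each documented error code, copied from the table above.
ERROR_CODE_STATUS = {
    "missing_parameter": 400,
    "invalid_parameter": 400,
    "model_not_found": 404,
    "wrong_endpoint": 400,
    "insufficient_balance": 402,
    "endpoint_not_allowed": 403,
    "model_not_allowed": 403,
    "upstream_not_configured": 503,
    "image_task_timeout": 504,
    "catalog_missing": 503,
}


def next_step(code: Optional[str]) -> str:
    """Suggest a coarse next step for an error code (illustrative categories)."""
    status = ERROR_CODE_STATUS.get(code)
    if status is None:
        return "inspect"      # null or unknown code: fall back to type and message
    if status in (503, 504):
        return "retry"        # transient on the server or upstream side
    if status == 402:
        return "top_up"
    if status == 403:
        return "check_key"    # key restrictions or model allow-list
    return "fix_request"      # 400 and 404: the request itself must change


print(next_step("model_not_allowed"))  # prints: check_key
```

Because `code` may be null, the lookup treats a missing code as a reason to inspect `type` and `message` rather than as an error in itself.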
Handling errors in code
import openai

# Assumes `client` is an openai.OpenAI instance already configured for BelugAPI.
try:
    response = client.chat.completions.create(
        model="gpt-5.4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.AuthenticationError as e:
    print("Invalid API key:", e.message)
except openai.RateLimitError as e:
    print("Rate limited, back off:", e.message)
except openai.BadRequestError as e:
    print("Bad request:", e.message, "param:", e.param)
except openai.APIStatusError as e:
    # The openai package has no dedicated exception class for HTTP 402,
    # so low-balance errors are detected by status code here.
    if e.status_code == 402:
        print("Low balance, top up at belugapi.com/dashboard")
    else:
        print("Unexpected error:", e.status_code, e.message)
import OpenAI from "openai";

// Assumes `client` is an OpenAI instance already configured for BelugAPI.
try {
  const response = await client.chat.completions.create({...});
} catch (err) {
  if (err instanceof OpenAI.AuthenticationError) {
    console.error("Invalid API key");
  } else if (err instanceof OpenAI.RateLimitError) {
    console.error("Rate limited", err.message);
  } else if (err instanceof OpenAI.BadRequestError) {
    // The SDK copies code, param, and type from the error object onto the exception.
    console.error("Bad request:", err.code);
  } else {
    console.error("API error:", err.status, err.message);
  }
}
Retry strategy
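Implement exponential backoff for transient errors (429, 500, 502, 503, 504), and re-raise everything else immediately: retrying a 400 or 401 only repeats the same failure.
import time
import openai

# Status codes worth retrying; all other API errors are raised to the caller.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}

def call_with_retry(fn, max_retries=4):
    delay = 1
    for attempt in range(max_retries):
        try:
            return fn()
        except openai.APIStatusError as e:
            # Give up on non-transient errors and on the final attempt.
            if e.status_code not in RETRYABLE_STATUS or attempt == max_retries - 1:
                raise
            print(f"Retrying in {delay}s… ({e})")
            time.sleep(delay)
            delay *= 2  # 1s, 2s, 4s, ...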
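For HTTP 429 this document also recommends waiting for the Retry-After header when it is present. The sketch below shows one way to compute the wait time as a pure function. `retry_delay`, its parameters, and the 30-second cap are assumptions made for illustration, and 500 is treated as retryable because the status table recommends retrying it with backoff.

```python
from typing import Mapping, Optional

# Assumption: 500 is included because the status table recommends retrying it.
RETRYABLE_STATUS = {429, 500, 502, 503, 504}


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 30.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None to give up."""
    if status not in RETRYABLE_STATUS:
        return None
    if status == 429:
        # Header names are case-insensitive, so normalise before the lookup.
        lowered = {k.lower(): v for k, v in headers.items()}
        try:
            seconds = float(lowered.get("retry-after"))
            if seconds >= 0:
                return seconds  # honour the server's hint as given
        except (TypeError, ValueError):
            pass  # header missing or not a number: fall back to backoff
    # Exponential backoff starting at `base` seconds, capped at `cap`.
    return min(cap, base * 2 ** attempt)


print(retry_delay(429, {"Retry-After": "7"}, attempt=0))  # prints: 7.0
print(retry_delay(503, {}, attempt=2))                    # prints: 4.0
```

With the openai Python client, `e.status_code` and `e.response.headers` on an `openai.APIStatusError` could supply the first two arguments.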
Errors in streaming responses
With stream: true, the HTTP status is fixed at 200 once the response headers have been sent, so an error that occurs mid-stream cannot change it. Such errors arrive instead as an SSE event carrying an error object, followed by [DONE]:
Mid-stream error (SSE)
data: {"error":{"message":"Upstream error.","type":"upstream_error","code":"upstream_error"}}
data: [DONE]
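A client that reads the raw stream could detect such an event as sketched below. This is a minimal illustration that assumes each event is a single `data:` line, as in the example above. `StreamError` and `iter_chunks` are hypothetical names, not part of any SDK.

```python
import json
from typing import Iterable, Iterator


class StreamError(Exception):
    """Raised when an SSE chunk carries an "error" object (hypothetical helper)."""

    def __init__(self, error: dict):
        super().__init__(error.get("message", "stream error"))
        self.type = error.get("type")
        self.code = error.get("code")


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed data chunks; raise StreamError on an error chunk; stop at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        chunk = json.loads(payload)
        if isinstance(chunk, dict) and "error" in chunk:
            err = chunk["error"]
            raise StreamError(err if isinstance(err, dict) else {"message": str(err)})
        yield chunk


stream = [
    'data: {"error":{"message":"Upstream error.","type":"upstream_error","code":"upstream_error"}}',
    "data: [DONE]",
]
try:
    for chunk in iter_chunks(stream):
        print(chunk)
except StreamError as e:
    print("stream failed:", e.code, e)  # prints: stream failed: upstream_error Upstream error.
```

Raising an exception keeps mid-stream failures on the same code path as ordinary HTTP errors, so callers do not have to inspect every chunk themselves.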
Always check for error fields in SSE chunks, not just HTTP status codes, when streaming is enabled.
Rate limits
BelugAPI enforces rate limits per API key and per workspace to ensure fair usage. Limits vary by plan.
| Limit type | Default | Notes |
|---|---|---|
| Requests per minute (RPM) | 60 | Higher on Pro and Enterprise plans. |
| Tokens per minute (TPM) | 200,000 | LLM models only. |
| Concurrent video tasks | 5 | Maximum parallel in-flight video generation tasks. |
| Audio file size | 25 MB | Hard limit for transcription uploads. |
| TTS input length | 4096 chars | Per request. Split longer texts into chunks. |
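The TTS input limit means longer text has to be split on the client before it is sent. A minimal sketch of such a splitter follows. `split_for_tts` is a hypothetical helper, it cuts at spaces where possible, and it assumes the limit counts Python characters (code points), which the table does not specify.

```python
from typing import List

TTS_MAX_CHARS = 4096  # per-request limit from the table above


def split_for_tts(text: str, limit: int = TTS_MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring spaces."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    rest = text.strip()
    while len(rest) > limit:
        # Look for the last space at or before the limit.
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no space available: hard cut inside the word
        chunks.append(rest[:cut].rstrip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks


chunks = split_for_tts("word " * 2000)
print(len(chunks), max(len(c) for c in chunks))  # prints: 3 4094
```

Each chunk can then be sent as its own TTS request and the resulting audio concatenated in order.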
When you hit a rate limit (HTTP 429), wait for the Retry-After header value (in seconds) before retrying, or use exponential backoff starting at 1 second.
Health check
Use GET /v1/health to verify API availability. No authentication required.
curl https://api.belugapi.com/v1/health
import requests

r = requests.get("https://api.belugapi.com/v1/health")
print(r.json())
Health response
{
"status": "ok",
"service": "belugapi-gateway",
"version": "1.0.0",
"time": "2026-05-05T15:00:00+00:00",
"checks": {
"catalog": { "ok": true, "total_models": 240 },
"database": { "ok": true }
}
}
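A monitoring script might reduce this payload to a single boolean. The sketch below assumes that a healthy gateway reports "status": "ok" and that every entry under "checks" carries "ok": true, as in the sample response. Other status values are not documented here, and `is_healthy` is a hypothetical helper.

```python
def is_healthy(payload: dict) -> bool:
    """True if the overall status is "ok" and every sub-check reports ok."""
    if payload.get("status") != "ok":
        return False
    checks = payload.get("checks", {})
    if not isinstance(checks, dict):
        return False
    return all(isinstance(c, dict) and c.get("ok") is True for c in checks.values())


# The sample response shown above, as a Python dict.
sample = {
    "status": "ok",
    "service": "belugapi-gateway",
    "version": "1.0.0",
    "time": "2026-05-05T15:00:00+00:00",
    "checks": {
        "catalog": {"ok": True, "total_models": 240},
        "database": {"ok": True},
    },
}
print(is_healthy(sample))  # prints: True
```

Checking the individual entries as well as the top-level status guards against a response whose summary and details disagree.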