Chat completions

Build conversational AI applications using Eden AI’s OpenAI-compatible chat completions endpoint.

Overview

Eden AI V3 provides full OpenAI API compatibility with multi-provider support. The endpoint follows OpenAI’s exact format, making it a drop-in replacement.

Endpoint:

POST /v3/chat/completions

Note: Streaming is optional. When enabled, responses are delivered via Server-Sent Events (SSE). See Streaming for streaming examples.
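As a rough illustration of what consuming the SSE stream involves, the sketch below extracts text from already-decoded SSE lines. It assumes the chunks follow OpenAI's streaming shape (`choices[0].delta.content`, terminated by a `data: [DONE]` line), which this page does not spell out. The helper name `iter_sse_text` and the sample lines are hypothetical; the Streaming page remains the authoritative reference.

```python
import json


def iter_sse_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines.

    Assumption: each event is a 'data: <json>' line shaped like an OpenAI
    chat completion chunk, and the stream ends with 'data: [DONE]'.
    """
    for line in lines:
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Hand-written sample lines for illustration only, not captured API output.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample_lines)))  # Hello
```

With `requests`, the lines could come from `response.iter_lines(decode_unicode=True)` on a request made with `stream=True`.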

Model Format

Use the simplified model string format for LLMs:

provider/model

Examples:

openai/gpt-4
anthropic/claude-sonnet-4-5
google/gemini-2.5-flash
cohere/command-r-plus
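If client code needs the provider and model name separately (for logging or routing, say), the string can be split on the first slash. This is a minimal sketch; `split_model_string` is a hypothetical helper, not part of any Eden AI SDK.

```python
def split_model_string(model: str) -> tuple[str, str]:
    """Split a 'provider/model' string into (provider, model_name)."""
    # partition() splits on the first '/' only, so any later slashes
    # stay in the model name.
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name


print(split_model_string("anthropic/claude-sonnet-4-5"))
# ('anthropic', 'claude-sonnet-4-5')
```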

Basic Chat Completion

import requests

url = "https://api.edenai.run/v3/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4",
    "messages": [
        {"role": "user", "content": "Hello! How are you?"}
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result["choices"][0]["message"]["content"])

Multi-Turn Conversations

Build conversations with message history:

import requests

url = "https://api.edenai.run/v3/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [
        {"role": "user", "content": "What is the capital of France?"},
        {"role": "assistant", "content": "The capital of France is Paris."},
        {"role": "user", "content": "What's the population?"}
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result["choices"][0]["message"]["content"])
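The request above carries the whole conversation on every call, so the client has to keep the history itself. One way to manage that, sketched with two hypothetical helpers (`add_turn` and `next_payload` are illustrative names) and no network calls:

```python
def add_turn(history, user_text, assistant_text):
    """Return a new history with one completed user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": assistant_text},
    ]


def next_payload(model, history, user_text):
    """Build the request body for the next user message, including all prior turns."""
    return {
        "model": model,
        "messages": history + [{"role": "user", "content": user_text}],
    }


history = add_turn(
    [], "What is the capital of France?", "The capital of France is Paris."
)
payload = next_payload("anthropic/claude-sonnet-4-5", history, "What's the population?")
print([m["role"] for m in payload["messages"]])
# ['user', 'assistant', 'user']
```

After each response, pass the assistant's reply back through `add_turn` so the next request includes it.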

System Messages

Guide the model’s behavior with system messages:

import requests

url = "https://api.edenai.run/v3/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant that speaks like a pirate."
        },
        {
            "role": "user",
            "content": "Tell me about artificial intelligence."
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result["choices"][0]["message"]["content"])
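When the same conversation is reused with different instructions, it can help to guarantee there is exactly one system message and that it comes first. A small sketch; `with_system_prompt` is a hypothetical helper name:

```python
def with_system_prompt(messages, prompt):
    """Return a copy of messages with a single system message placed first."""
    # Drop any existing system messages, then prepend the new one.
    rest = [m for m in messages if m["role"] != "system"]
    return [{"role": "system", "content": prompt}] + rest


messages = with_system_prompt(
    [{"role": "user", "content": "Tell me about artificial intelligence."}],
    "You are a helpful assistant that speaks like a pirate.",
)
print(messages[0]["role"], len(messages))  # system 2
```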

Temperature and Parameters

Control response creativity and behavior:

import requests

url = "https://api.edenai.run/v3/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "openai/gpt-4",
    "messages": [
        {"role": "user", "content": "Write a creative story about a robot."}
    ],
    "temperature": 0.9,  # Higher = more creative (0-2)
    "max_tokens": 500    # Limit response length
}

response = requests.post(url, headers=headers, json=payload)
result = response.json()
print(result["choices"][0]["message"]["content"])

Extended Thinking (Claude)

For Anthropic Claude models, the thinking parameter enables extended reasoning: the model produces internal thinking content before its final answer, which can improve quality on complex tasks.

{
  "thinking": {
    "type": "enabled",
    "budget_tokens": 2000
  }
}

Field	Type	Description
`type`	string	`enabled` or `disabled`.
`budget_tokens`	integer	Maximum tokens Claude may spend thinking. Minimum 1024, and it must be less than `max_tokens`, because thinking tokens count toward that limit.

import requests

url = "https://api.edenai.run/v3/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [
        {"role": "user", "content": "How many r's are in strawberry? Think it through."}
    ],
    "max_tokens": 4000,  # Thinking tokens count toward this limit; leave room for the answer
    "thinking": {
        "type": "enabled",
        "budget_tokens": 2000
    }
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())

Extended thinking is only supported on Anthropic Claude models. When thinking is enabled, top_p is ignored.
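Because thinking tokens count toward `max_tokens`, it can be worth checking the two values together before sending a request. The sketch below enforces the 1024-token minimum and requires that some tokens remain for the visible answer. `thinking_fields` is a hypothetical helper, and the exact server-side validation may differ.

```python
MIN_THINKING_BUDGET = 1024  # minimum stated in the field table


def thinking_fields(budget_tokens: int, max_tokens: int) -> dict:
    """Return payload fields that enable extended thinking, after basic checks."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be at least {MIN_THINKING_BUDGET}")
    # Thinking tokens are drawn from max_tokens, so the remainder is what
    # is left for the final answer.
    if max_tokens - budget_tokens <= 0:
        raise ValueError("max_tokens leaves no room for the answer after thinking")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


payload = {
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "How many r's are in strawberry?"}],
    **thinking_fields(budget_tokens=2000, max_tokens=4000),
}
print(payload["max_tokens"] - payload["thinking"]["budget_tokens"])  # 2000
```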

Available Parameters

Parameter	Type	Default	Description
`model`	string	Required	Model string (e.g., `openai/gpt-4`)
`messages`	array	Required	Conversation messages
`stream`	boolean	false	Enable streaming (uses SSE when true)
`temperature`	float	1.0	Randomness (0-2)
`max_tokens`	integer	-	Maximum response tokens
`top_p`	float	1.0	Nucleus sampling threshold
`frequency_penalty`	float	0.0	Penalize repeated tokens (-2 to 2)
`presence_penalty`	float	0.0	Penalize topic repetition (-2 to 2)
`thinking`	object	-	Enable Claude’s extended thinking. See Extended Thinking.
`fallbacks`	array	-	Backup models tried if `model` fails.

For details on the fallbacks field, see Fallback.
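A client-side check against the ranges in the parameter table can catch mistakes before a request is sent. This is a minimal sketch: `build_payload` is a hypothetical helper that validates only the ranges the table states explicitly and leaves out any parameter set to `None`, so the server default applies.

```python
# Ranges taken from the parameter table; other parameters pass through unchecked.
RANGES = {
    "temperature": (0, 2),
    "frequency_penalty": (-2, 2),
    "presence_penalty": (-2, 2),
}


def build_payload(model, messages, **params):
    """Build a request body, validating documented ranges and skipping unset values."""
    payload = {"model": model, "messages": messages}
    for name, value in params.items():
        if value is None:
            continue  # omit so the API default is used
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        payload[name] = value
    return payload


payload = build_payload(
    "openai/gpt-4",
    [{"role": "user", "content": "Write a creative story about a robot."}],
    temperature=0.9,
    max_tokens=500,
    top_p=None,
)
print(sorted(payload))  # ['max_tokens', 'messages', 'model', 'temperature']
```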

Response Format

Standard JSON response:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 15,
    "total_tokens": 27
  }
}
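To show how these fields are typically read, the sketch below parses the sample response above and pulls out the reply text, finish reason, and token usage. `summarize` is a hypothetical helper. The `truncated` flag relies on the OpenAI convention that a `finish_reason` of `"length"` means the token limit cut the reply short, which is assumed to carry over here.

```python
import json

# The sample response from this section, embedded as a string.
SAMPLE = """
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! I'm doing well, thank you for asking."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 15, "total_tokens": 27}
}
"""


def summarize(result: dict) -> dict:
    """Extract the fields most callers need from a chat completion response."""
    choice = result["choices"][0]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # Assumption: "length" signals a reply cut off by the token limit.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": result["usage"]["total_tokens"],
    }


summary = summarize(json.loads(SAMPLE))
print(summary["finish_reason"], summary["total_tokens"])  # stop 27
```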

Available Models

For the full list of supported models and their capabilities (PDF support, reasoning, web search, tool calling), see List LLM Models.

OpenAI Python SDK Integration

Use Eden AI with the OpenAI SDK:

from openai import OpenAI

# Point to Eden AI endpoint
client = OpenAI(
    api_key="YOUR_EDEN_AI_API_KEY",
    base_url="https://api.edenai.run/v3"
)

# Use any provider through OpenAI SDK
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

Next Steps

First Expert Model Call

LLMs vs Expert Models
