# Chat Completions
The Chat Completions API is the standard endpoint for conversational AI interactions. It supports multi-turn conversations, system prompts, and all the features you'd expect from the OpenAI-compatible interface.
## Endpoint

```
POST https://mume.ai/api/v1/chat/completions
```

## Basic Request
### cURL

```bash
curl https://mume.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MUME_API_KEY" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role": "user", "content": "2+2="}
    ]
  }'
```

### Python
```python
import openai

client = openai.OpenAI(
    api_key="your-api-key",
    base_url="https://mume.ai/api/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "2+2="}],
)
print(response.choices[0].message.content)
```

### JavaScript
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://mume.ai/api/v1",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4.1-mini",
  messages: [{ role: "user", content: "2+2=" }],
});
console.log(response.choices[0].message.content);
```
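As noted in the introduction, the `messages` array also accepts `system` and `assistant` roles, so you can set behavior with a system prompt and send prior turns back for multi-turn conversations. Here is a minimal Python sketch reusing the `client` configured above; the system prompt and conversation content are illustrative:

```python
# Sketch: a multi-turn conversation with a system prompt.
# The API is stateless, so earlier assistant replies are resent as context.
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[
        {"role": "system", "content": "You are a concise math tutor."},
        {"role": "user", "content": "2+2="},
        {"role": "assistant", "content": "4"},
        {"role": "user", "content": "And multiplied by 10?"},
    ],
)
print(response.choices[0].message.content)
```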
## Streaming

Get responses token by token for a real-time experience. Set `stream: true` in your request.

### Python
```python
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a poem about the moon."}],
    stream=True,
)
for chunk in response:
    content = getattr(chunk.choices[0].delta, "content", None)
    if content is not None:
        print(content, end="", flush=True)
```

### JavaScript
```javascript
const stream = await client.chat.completions.create({
  model: "openai/gpt-4.1-mini",
  messages: [{ role: "user", content: "Write a poem about the moon." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}
```
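If you also need the complete text once the stream finishes, you can accumulate the chunks as they arrive. A minimal Python sketch, under the same assumptions as the streaming example above:

```python
# Sketch: print the reply as it streams and keep the assembled text.
full_reply = ""
stream = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a poem about the moon."}],
    stream=True,
)
for chunk in stream:
    content = getattr(chunk.choices[0].delta, "content", None)
    if content is not None:
        full_reply += content
        print(content, end="", flush=True)
print()

# full_reply now holds the complete assistant message.
```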
## Request Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID in provider/model format. Required. |
| messages | array | Array of message objects with role and content. Required. |
| stream | boolean | Enable streaming responses. Default: false. |
| temperature | number | Sampling temperature (0–2). Higher values produce more varied output. |
| max_tokens | number | Maximum tokens to generate. |
| tools | array | Function definitions for function calling (see the sketch below). |
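Since this page does not document the tool schema in detail, the following Python sketch assumes the standard OpenAI-style `function` tool format; the `get_weather` tool, its schema, and the parameter values are illustrative only:

```python
# Sketch: optional parameters plus a function-calling tool definition.
# Assumes OpenAI-compatible tool calling; get_weather is a hypothetical tool.
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    temperature=0.2,
    max_tokens=200,
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
)

message = response.choices[0].message
if message.tool_calls:
    # The model asked to call a tool; arguments arrive as a JSON string.
    print(message.tool_calls[0].function.name, message.tool_calls[0].function.arguments)
else:
    print(message.content)
```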