# Chat Completions
The Chat Completions API is the standard endpoint for conversational AI interactions. It supports multi-turn conversations, system prompts, and all the features you'd expect from the OpenAI-compatible interface.
## Endpoint

```
POST https://mume.ai/api/v1/chat/completions
```

## Basic Request
### cURL

```bash
curl https://mume.ai/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $MUME_API_KEY" \
  -d '{
    "model": "openai/gpt-4.1-mini",
    "messages": [
      {"role": "user", "content": "2+2="}
    ]
  }'
```

### Python
```python
import openai

client = openai.OpenAI(
    api_key="your-api-key",
    base_url="https://mume.ai/api/v1",
)

response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "2+2="}],
)
print(response.choices[0].message.content)
```

### JavaScript
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://mume.ai/api/v1",
});

const response = await client.chat.completions.create({
  model: "openai/gpt-4.1-mini",
  messages: [{ role: "user", content: "2+2=" }],
});
console.log(response.choices[0].message.content);
```
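As noted in the introduction, the `messages` array also accepts `system` and `assistant` roles, so you can set behavior with a system prompt and send prior turns back for multi-turn conversations. Here is a minimal Python sketch reusing the `client` configured above; the system prompt and conversation content are illustrative:

```python
# Sketch: a multi-turn conversation with a system prompt.
# The API is stateless, so earlier assistant replies are resent as context.
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[
        {"role": "system", "content": "You are a concise math tutor."},
        {"role": "user", "content": "2+2="},
        {"role": "assistant", "content": "4"},
        {"role": "user", "content": "And multiplied by 10?"},
    ],
)
print(response.choices[0].message.content)
```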
## Streaming

Get responses token by token for a real-time experience. Set `stream: true` in your request.

### Python
```python
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a poem about the moon."}],
    stream=True,
)
for chunk in response:
    content = getattr(chunk.choices[0].delta, "content", None)
    if content is not None:
        print(content, end="", flush=True)
```

### JavaScript
```javascript
const stream = await client.chat.completions.create({
  model: "openai/gpt-4.1-mini",
  messages: [{ role: "user", content: "Write a poem about the moon." }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) {
    process.stdout.write(content);
  }
}
```
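If you also need the complete text once the stream finishes, you can accumulate the chunks as they arrive. A minimal Python sketch, under the same assumptions as the streaming example above:

```python
# Sketch: print the reply as it streams and keep the assembled text.
full_reply = ""
stream = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "Write a poem about the moon."}],
    stream=True,
)
for chunk in stream:
    content = getattr(chunk.choices[0].delta, "content", None)
    if content is not None:
        full_reply += content
        print(content, end="", flush=True)
print()

# full_reply now holds the complete assistant message.
```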
## Request Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID in provider/model format. Required. |
| messages | array | Array of message objects with role and content. Required. |
| stream | boolean | Enable streaming responses. Default: false. |
| temperature | number | Sampling temperature (0–2). Higher values produce more varied output. |
| max_tokens | number | Maximum tokens to generate. |
| tools | array | Function definitions for function calling (see the sketch below). |
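Since this page does not document the tool schema in detail, the following Python sketch assumes the standard OpenAI-style `function` tool format; the `get_weather` tool, its schema, and the parameter values are illustrative only:

```python
# Sketch: optional parameters plus a function-calling tool definition.
# Assumes OpenAI-compatible tool calling; get_weather is a hypothetical tool.
response = client.chat.completions.create(
    model="openai/gpt-4.1-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    temperature=0.2,
    max_tokens=200,
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
)

message = response.choices[0].message
if message.tool_calls:
    # The model asked to call a tool; arguments arrive as a JSON string.
    print(message.tool_calls[0].function.name, message.tool_calls[0].function.arguments)
else:
    print(message.content)
```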