Mercury is the first commercial-scale diffusion large language model (dLLM). Using a discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models like GPT-4.1 Nano and Claude 3.5 Haiku while matching their performance. Mercury's speed enables developers to build responsive user experiences, including voice agents, search interfaces, and chatbots. Read more in the launch blog post.
by Inception | 128K context | $0.25/M input tokens | $1.00/M output tokens
Endpoints
Available providers for this model, with details on pricing, context limits, and real-time health metrics.
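As a rough sketch of how a request to a hosted Mercury endpoint might be assembled, the snippet below builds a chat-completions-style payload. The base URL and model slug are assumptions for illustration, not values from this page; check the provider's endpoint documentation for the real ones.

```python
import json

# Hypothetical endpoint URL -- replace with the provider's documented base URL.
BASE_URL = "https://api.example.com/v1/chat/completions"

# Chat-completions-style request body; the model slug here is an assumption.
payload = {
    "model": "inception/mercury",
    "messages": [
        {"role": "user", "content": "Summarize diffusion LLMs in one sentence."}
    ],
    "max_tokens": 128,
}

# Serialize to the JSON body that would be POSTed with an Authorization header.
body = json.dumps(payload)
print(body)
```

The payload shape mirrors the widely used chat-completions convention, so existing client libraries that accept a custom base URL can typically be pointed at such an endpoint unchanged.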