GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven tasks. It natively handles image, video, and text inputs, is designed for long-horizon planning, complex coding, and task execution, and works with agents to complete the full “perceive → plan → execute” loop.
by Z-ai | 203K context | $1.20/M input tokens | $4.00/M output tokens
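As a rough illustration of how the listed per-token prices translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost` and the example token counts are hypothetical, and the sketch assumes that image and video inputs are billed as ordinary input tokens, which the listing does not confirm.

```python
# Prices copied from the listing, in USD per one million tokens.
INPUT_PRICE_PER_M = 1.20
OUTPUT_PRICE_PER_M = 4.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes flat per-token pricing with no separate image/video surcharge.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 50,000-token prompt that produces a 2,000-token reply.
    cost = estimate_cost(50_000, 2_000)
    print(f"${cost:.4f}")  # prints "$0.0680"
```

Actual charges can differ by provider, so treat the result as an estimate rather than a quote.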
Endpoints
Available providers for this model, with details on pricing, context limits, and real-time health metrics.