Z.AI GLM-5-Turbo API pricing

Last updated 2026-06-27 · ByteCosts

Direct answer

As of June 27, 2026, Z.AI's GLM-5-Turbo lists at $1.20 per 1M input tokens and $4.00 per 1M output tokens in the ByteCosts pricing index (A confidence, source last checked 2026-06-27). No context window is stated in this record. Treat these figures as a planning estimate and verify them against the provider source before billing: output volume, retries, and prompt-cache hit rate move the real GLM-5-Turbo bill far more than the headline input rate.

GLM-5-Turbo price card

List prices per million tokens from the ByteCosts pricing index, graded A confidence. Output tokens are the expensive side of almost every bill.

Z.AI GLM-5-Turbo per-million-token prices
Price field	Value
Input	$1.20 / 1M tokens
Output	$4.00 / 1M tokens
Cached input (read)	$0.240 / 1M tokens
Cache write	not published
Cache write (1h)	not published
Context window	not in this record

Context window and output limits

What GLM-5-Turbo can hold in a single request, and where ByteCosts stops short of a source-backed claim.

Context window: not part of this committed pricing record; check the provider page for the current limit.
Maximum output tokens: not a field in this committed pricing record. ByteCosts only states source-backed numbers, so treat the provider documentation as authoritative for the per-response output cap.

What it costs at real workloads

Each row is computed from this record's committed input and output token prices with the ByteCosts cost engine. Assumptions are shown inline.

Z.AI GLM-5-Turbo monthly cost examples
Workload	Assumed monthly tokens	Estimated token cost
Light reference	1M input + 250K output tokens/month	$2.20 / month
Production reference	10M input + 2.5M output tokens/month	$22.00 / month
Heavy reference	100M input + 25M output tokens/month	$220 / month
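The rows above follow directly from the two list prices. The sketch below reproduces that arithmetic; it is not the ByteCosts cost engine, and the function name `monthly_token_cost` is made up for illustration. It assumes no caching, no retries, and no non-token charges.

```python
# List prices from the price card, in USD per 1M tokens.
INPUT_PER_M = 1.20
OUTPUT_PER_M = 4.00


def monthly_token_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly token cost in USD at list price.

    Hypothetical helper: ignores caching, retries, taxes, and discounts.
    """
    input_cost = (input_tokens / 1_000_000) * INPUT_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PER_M
    return round(input_cost + output_cost, 2)


# The three reference workloads from the table (input tokens, output tokens).
workloads = {
    "Light reference": (1_000_000, 250_000),
    "Production reference": (10_000_000, 2_500_000),
    "Heavy reference": (100_000_000, 25_000_000),
}

for name, (inp, out) in workloads.items():
    print(f"{name}: ${monthly_token_cost(inp, out):,.2f} / month")
```

Swap in your own monthly token counts to get a first-pass figure, then check it against the provider source before budgeting.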

Best-fit workloads for GLM-5-Turbo

Where this price profile is cost-effective, and where it is not, derived from the committed input, output, cache, and context fields.

Good fit: Repeated-prefix agents and chat with a stable system prompt: cached input reads list at $0.240 per 1M, well below the $1.20 fresh-input rate.
Good fit: Balanced chat and tool-calling where prompts and completions are similar in size, since the $1.20 input and $4.00 output rates are within the same order of magnitude (output is about 3.3x input).
Not ideal: High-volume long-form generation: output is about 3.3x the input price ($4.00 vs $1.20 per 1M), so verbose or reasoning-heavy responses dominate the invoice.

GLM-5-Turbo pricing across providers

GLM-5-Turbo is listed by 2 providers in the ByteCosts index. Per-million-token list prices, lowest input first.

GLM-5-Turbo input and output prices by provider
Provider	Input / 1M	Output / 1M
Z.AI (this page)	$1.20	$4.00
Zhipu AI	$1.20	$4.00

When GLM-5-Turbo is cost-effective

GLM-5-Turbo ranks 15th of 19 Z.AI records by combined list price, lowest first ($5.20 for 1M input plus 1M output tokens). That puts it in the higher-priced third of this provider's committed records.

Cached input reads are listed at $0.240 per 1M tokens, 20% of the fresh input rate, so input tokens served from cache cost 80% less at list price than fresh input tokens.
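To see how cache hit rate changes the effective input price, the sketch below blends the two committed input rates. The function name `blended_input_rate` is hypothetical. Because cache-write pricing is not published in this record, the sketch assumes cache misses are billed at the plain fresh-input rate; real billing may differ.

```python
# Committed input rates from this record, in USD per 1M tokens.
FRESH_INPUT_PER_M = 1.20
CACHED_READ_PER_M = 0.24


def blended_input_rate(cache_hit_rate: float) -> float:
    """Effective USD per 1M input tokens for a given cache hit rate (0.0 to 1.0).

    Assumption: misses cost the fresh-input rate; no separate cache-write
    charge is modeled because none is published in this record.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0.0 and 1.0")
    cached_part = cache_hit_rate * CACHED_READ_PER_M
    fresh_part = (1.0 - cache_hit_rate) * FRESH_INPUT_PER_M
    return cached_part + fresh_part


for rate in (0.0, 0.5, 0.9):
    print(f"hit rate {rate:.0%}: ${blended_input_rate(rate):.3f} per 1M input tokens")
```

The blend is linear, so each additional 10 points of hit rate lowers the effective input rate by the same amount under these assumptions.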

No context window is committed for this record, so there is no long-context cost signal to apply.

Cheaper alternatives

Records with a lower combined input plus output list price than this one. Same-provider and same-model cross-provider records are preferred where available.

Z.AI GLM-5-Turbo lower-priced alternatives
Record	Input / 1M	Output / 1M	Combined / 1M + 1M	Detail page
Z.AI GLM-4.7-Flash	$0	$0	$0	/tools/ai-provider-pricing/zai--glm-4.7-flash
Z.AI GLM-4.5-Flash	$0	$0	$0	/tools/ai-provider-pricing/zai--glm-4.5-flash
Z.AI GLM-4.6V-Flash	$0	$0	$0	/tools/ai-provider-pricing/zai--glm-4.6v-flash

Evidence, freshness, and limitations

ByteCosts grades this Z.AI GLM-5-Turbo row A (from official provider docs, price stated indirectly). The record was first tracked on 2026-05-30 and last checked on 2026-06-27, and the source URL stays attached so you can verify the provider source before a purchasing or architecture decision.

Every number here is a planning estimate, not a quote. Real invoices can differ because taxes, negotiated and volume discounts, minimum commitments, batch and priority tiers, gateway and observability fees, and non-token charges are excluded unless they appear in the committed record. Verify the provider source and quote the last-checked date with any copied figure, because model prices change often.

Hidden costs to watch

Hidden-cost flags attached to this source-backed record. Flags do not change the list price, but they do change the effective bill.

No hidden-cost flags are attached to this source-backed record, but always budget for output-token blow-up, retries, and cache misses.
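One way to budget for retries and longer-than-planned completions is to scale the list-price estimate by two factors. The sketch below is illustrative only: `retry_rate` and `output_growth` are hypothetical parameters, not fields in the ByteCosts record, and it assumes every retry re-bills the full input and output of the original request.

```python
# List prices from this record, in USD per 1M tokens.
INPUT_PER_M = 1.20
OUTPUT_PER_M = 4.00


def effective_monthly_cost(
    input_tokens: int,
    output_tokens: int,
    retry_rate: float = 0.0,     # assumed fraction of requests sent a second time
    output_growth: float = 1.0,  # assumed multiplier on planned output tokens
) -> float:
    """Planning estimate in USD that pads list-price cost for retries and verbosity.

    Assumption: a retried request is billed again in full (input and output).
    """
    attempts = 1.0 + retry_rate
    input_cost = (input_tokens / 1_000_000) * INPUT_PER_M * attempts
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PER_M * output_growth * attempts
    return round(input_cost + output_cost, 2)


# Production reference workload with a 5% retry rate and 50% longer outputs.
baseline = effective_monthly_cost(10_000_000, 2_500_000)
padded = effective_monthly_cost(10_000_000, 2_500_000, retry_rate=0.05, output_growth=1.5)
print(f"baseline ${baseline:.2f}, padded ${padded:.2f}")
```

Output is the more expensive side of the bill, so in this sketch `output_growth` moves the total more than an equal percentage change in input volume would.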

Frequently asked questions

How much does Z.AI GLM-5-Turbo cost per 1M tokens?

GLM-5-Turbo lists at $1.20 per 1M input tokens and $4.00 per 1M output tokens in the ByteCosts index, with cached input reads at $0.240 per 1M. Output is about 3.3x the input price, so completion length drives the bill. See the price card and cost examples above for current per-workload figures.

Is GLM-5-Turbo cheaper than the alternatives?

On combined list price, 3 records in the index come in lower than GLM-5-Turbo, listed in the cheaper-alternatives table. Whether one is actually cheaper for you depends on your token mix, cache hit rate, and quality bar, so compare on your real workload in the calculator.

How fresh is this GLM-5-Turbo price and where is it from?

The price comes from Z.AI's official source, normalized into the ByteCosts pricing index, graded A confidence, and last checked 2026-06-27. Prices are list prices and exclude negotiated discounts; verify the provider source before production billing.

Cite this page

Z.AI GLM-5-Turbo API pricing. ByteCosts. Updated 2026-06-27. https://bytecosts.com/pricing/zai/glm-5-turbo/