AI Economics

MiniMax Prices M3 Cache Reads at $0.06 per 1M Tokens, Half Its Standard Rate

Last updated 2026-06-09 · ByteCosts

MiniMax-M3 reads cached input at $0.06 per 1M tokens, which its pricing page lists as a permanent half-off discount on the standard $0.12 cache-read rate. Here is why the cache-read line is the one to check for cache-heavy workloads. This ByteCosts research article explains the cost mechanics behind the headline, turns the pattern into budgeting questions, and points readers toward calculators that can model the same issue with their own workload. Read it when you need a finance-readable explanation of AI economics before choosing a model, cloud platform, subscription, or optimization path.

Summary

MiniMax prices cached input reads on its M3 model at $0.06 per 1M tokens, which its pricing page lists as a permanent half-off discount on a standard cache-read rate of $0.12. For a workload that reuses a long context across many calls, that line is worth checking against the input rate.

MiniMax reads cached input tokens for M3 at $0.06 per 1M tokens, half its standard cache-read rate of $0.12 per 1M tokens. The model's input price is $0.30 per 1M tokens and its output price is $1.20 per 1M tokens. For a loop that reuses context, the cache-read rate is often the line that matters most, so the discounted rate is worth modeling against the input price rather than reading it off the headline number.

Cached reads bill at $0.06 per 1M tokens, against the $0.12 standard rate, the $0.30 input rate, and the $1.20 output rate. For a loop that resends the same instructions, schema, and history on every step, cached reads can make up a large share of the token bill, so the cache-read rate can matter more than the headline input line. Drop the rates into the ByteCosts AI cost calculator with your own context-reuse ratio to see which one your workload actually hits.
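As a rough illustration of that comparison, the sketch below splits one call's bill into cache-read, uncached input, and output lines using the rates quoted above. The function name, the `reuse_ratio` parameter, and the example token counts are assumptions made for illustration, and the sketch ignores any cache-write or storage charge because this article does not quote one.

```python
# Rates in USD per 1M tokens, as quoted in this article.
CACHE_READ_RATE = 0.06  # discounted cache-read rate
INPUT_RATE = 0.30       # uncached input
OUTPUT_RATE = 1.20      # output


def call_cost_breakdown(input_tokens, output_tokens, reuse_ratio):
    """Estimate the USD cost of one call, split by billing line.

    reuse_ratio is a hypothetical workload parameter: the fraction of
    input tokens assumed to be served from cache (0.0 to 1.0).
    """
    if not 0.0 <= reuse_ratio <= 1.0:
        raise ValueError("reuse_ratio must be between 0 and 1")

    cached_tokens = input_tokens * reuse_ratio
    fresh_tokens = input_tokens - cached_tokens

    costs = {
        "cache_read": cached_tokens * CACHE_READ_RATE / 1_000_000,
        "input": fresh_tokens * INPUT_RATE / 1_000_000,
        "output": output_tokens * OUTPUT_RATE / 1_000_000,
    }
    costs["total"] = sum(costs.values())
    return costs


if __name__ == "__main__":
    # Hypothetical call: 100k input tokens, 2k output tokens, 90% cached.
    example = call_cost_breakdown(100_000, 2_000, 0.9)
    for line, usd in example.items():
        print(f"{line:>10}: ${usd:.4f}")
```

In this hypothetical call the cache-read line comes to about $0.0054 of a roughly $0.0108 total, so cached reads are about half the bill even at the discounted rate. Change the reuse ratio and token counts to match your own loop before drawing conclusions.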

MiniMax-M3 reads cached input at $0.06 per 1M tokens, half its standard $0.12 rate, a discount its pricing page lists as permanent. The model's input price is $0.30 and its output price is $1.20 per 1M tokens. For cache-heavy loops, compare models on the cache-read rate, not just the headline input line.

Use it with ByteCosts calculators

After reading the research note, open the related calculator and replace the example assumptions with your own users, requests, tokens, seats, or platform usage.

The goal is to convert the article's cost pattern into a concrete monthly run-rate, per-user margin, or break-even point your team can discuss.
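One way to sketch that conversion is shown below. It takes a per-call cost estimate and turns it into a monthly run-rate, a per-user margin, and a break-even call volume. The function name, its parameters, and every example number (cost per call, calls per user, user count, subscription price) are hypothetical placeholders, not figures from MiniMax or ByteCosts.

```python
def monthly_economics(cost_per_call, calls_per_user, users, price_per_user):
    """Turn a per-call cost estimate into monthly budgeting figures.

    All inputs are assumptions supplied by the reader:
      cost_per_call   - estimated USD cost of one model call
      calls_per_user  - calls each user makes per month
      users           - number of paying users
      price_per_user  - monthly USD revenue per user
    """
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be positive")

    cost_per_user = cost_per_call * calls_per_user
    return {
        "monthly_run_rate": cost_per_user * users,
        "cost_per_user": cost_per_user,
        "margin_per_user": price_per_user - cost_per_user,
        # Calls per user per month at which model cost equals revenue.
        "break_even_calls": price_per_user / cost_per_call,
    }


if __name__ == "__main__":
    # Hypothetical inputs: replace with your own workload and pricing.
    result = monthly_economics(
        cost_per_call=0.0108,
        calls_per_user=500,
        users=1_000,
        price_per_user=20.0,
    )
    for name, value in result.items():
        print(f"{name:>18}: {value:,.2f}")
```

The break-even figure only covers model spend. Hosting, support, and other costs would lower the real margin, so treat the output as a starting point for discussion.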

Frequently asked questions

Can I model the article's scenario with my own assumptions?

Yes. Use the related ByteCosts calculators to replace the article's example numbers with your own workload, usage, and pricing assumptions.

MiniMax Prices M3 Cache Reads at $0.06 per 1M Tokens, Half Its Standard Rate. ByteCosts. Updated 2026-06-09. https://bytecosts.com/blog/minimax-m3-cache-read-cut/

Sources

MiniMax pay-as-you-go pricing
