Alibaba Qwen qwen3-vl-plus API pricing
Direct answer
As of June 27, 2026, Alibaba Qwen's qwen3-vl-plus lists at $2.46 per 1M input tokens and $7.37 per 1M output tokens in the ByteCosts pricing index (A confidence, source last checked 2026-06-27), with a 262K-token context window. Treat these figures as a planning estimate and verify them against the provider source before billing. Output volume, retries, and prompt-cache hit rate move the real qwen3-vl-plus bill far more than the headline input rate.
qwen3-vl-plus price card
List prices per million tokens from the ByteCosts pricing index, graded A confidence. Output tokens are the expensive side of almost every bill.
| Price field | Value |
|---|---|
| Input | $2.46 / 1M tokens |
| Output | $7.37 / 1M tokens |
| Cached input (read) | not published |
| Cache write | not published |
| Cache write (1h) | not published |
| Context window | 262K tokens |
Context window and output limits
What qwen3-vl-plus can hold in a single request, and where ByteCosts stops short of a source-backed claim.
- Context window: 262K tokens of combined input plus output per request.
- Maximum output tokens: not a field in this committed pricing record. ByteCosts only states source-backed numbers, so treat the provider documentation as authoritative for the per-response output cap.
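- The 262K window suits long-document, whole-repository, and multi-turn agent workloads, but every prompt token is billed at the input rate, so long prompts scale the input line directly. This record has no committed cache pricing, so trim the shared prefix where you can, and rely on caching only if the provider source confirms it.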
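Because input and output share one window, it can help to check a request budget before sending it. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and "262K" is treated as 262,000 tokens, which may differ from the exact limit in the provider documentation.

```python
# Assumption: "262K" is taken as 262,000 tokens. The provider may define
# the exact limit differently, so verify it before relying on this value.
CONTEXT_WINDOW_TOKENS = 262_000


def fits_context(prompt_tokens: int, reserved_output_tokens: int) -> bool:
    """Return True if prompt plus reserved output fits the shared window."""
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOW_TOKENS


def max_prompt_tokens(reserved_output_tokens: int) -> int:
    """Return the largest prompt that still leaves room for the output."""
    return max(0, CONTEXT_WINDOW_TOKENS - reserved_output_tokens)


if __name__ == "__main__":
    # Example: reserve 8,000 tokens for the response.
    print(fits_context(250_000, 8_000))
    print(max_prompt_tokens(8_000))
```

The reserved output figure is your own choice here, since the per-response output cap is not part of this record.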
What it costs at real workloads
Each row is computed from this record's committed input and output token prices with the ByteCosts cost engine. Assumptions are shown inline.
| Workload | Assumed monthly tokens | Estimated token cost |
|---|---|---|
| Light reference | 1M input + 250K output tokens/month | $4.30 / month |
| Production reference | 10M input + 2.5M output tokens/month | $43.03 / month |
| Heavy reference | 100M input + 25M output tokens/month | $430 / month |
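The rows above can be reproduced with a short sketch. The function `monthly_token_cost` is a hypothetical helper, not the ByteCosts cost engine; it only multiplies the list prices from the price card by the assumed monthly token volumes.

```python
# List prices from the price card, in USD per 1M tokens.
INPUT_USD_PER_M = 2.46
OUTPUT_USD_PER_M = 7.37


def monthly_token_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly token cost in USD (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_M
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    return input_cost + output_cost


# Token volumes copied from the workload table.
WORKLOADS = {
    "Light reference": (1_000_000, 250_000),
    "Production reference": (10_000_000, 2_500_000),
    "Heavy reference": (100_000_000, 25_000_000),
}

if __name__ == "__main__":
    for name, (inp, out) in WORKLOADS.items():
        print(f"{name}: ${monthly_token_cost(inp, out):.2f} / month")
```

Swapping in your own token counts gives a first-order estimate. Retries and non-token charges are not modeled.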
Best-fit workloads for qwen3-vl-plus
Where this price profile is cost-effective, and where it is not, derived from the committed input, output, cache, and context fields.
- Good fit: Balanced chat and tool-calling where prompts and completions are similar in size, since the $7.37 output rate is only about 3x the $2.46 input rate and neither side dominates the bill.
- Good fit: Long-context tasks that need the 262K window in one request.
- Not ideal: Workloads that resend a large shared prefix on every call, because this record has no committed prompt-cache discount to amortize that prefix.
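The shared-prefix concern can be quantified with a minimal sketch. The helper name and the example traffic figures are assumptions for illustration only. Since no cache discount is committed for this record, every resent prefix token is billed at the full input list price.

```python
# Input list price from the price card, in USD per 1M tokens.
INPUT_USD_PER_M = 2.46


def prefix_resend_cost(prefix_tokens: int, calls_per_month: int) -> float:
    """Monthly USD cost of resending the same prefix on every call,
    assuming no cache discount applies (hypothetical helper)."""
    total_prefix_tokens = prefix_tokens * calls_per_month
    return total_prefix_tokens / 1_000_000 * INPUT_USD_PER_M


if __name__ == "__main__":
    # Illustrative traffic: a 20,000-token prefix sent on 10,000 calls.
    print(f"${prefix_resend_cost(20_000, 10_000):.2f} / month")
```

In this illustrative case the prefix alone accounts for 200M input tokens per month, which is why trimming it matters more here than on a record with a committed cache-read price.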
qwen3-vl-plus pricing across providers
qwen3-vl-plus is listed by 2 providers in the ByteCosts index. Per-million-token list prices, lowest input first.
| Provider | Input / 1M | Output / 1M |
|---|---|---|
| Alibaba (this page) | $2.46 | $7.37 |
| Alibaba (China) | $2.46 | $7.37 |
When qwen3-vl-plus is cost-effective
qwen3-vl-plus ranks 7 of 7 Alibaba records by combined list price ($9.83 for 1M input plus 1M output tokens), which makes it the highest-priced of this provider's committed records.
No source-backed cache-read discount is committed for this record, so the page does not assume cache savings.
The committed context window is 262K; use that as the only long-context signal on this page, not as a quality or latency claim.
Cheaper alternatives
Records with a lower combined input plus output list price than this one. Same-provider and same-model cross-provider records are preferred where available.
| Record | Input / 1M | Output / 1M | Combined / 1M + 1M | Detail page |
|---|---|---|---|---|
| Alibaba Qwen3.5-Flash | $0.100 | $0.210 | $0.310 | /tools/ai-provider-pricing/alibaba--qwen3.5-flash |
| Alibaba qwen-mt-lite | $0.120 | $0.360 | $0.480 | /tools/ai-provider-pricing/alibaba--qwen-mt-lite |
| Alibaba qwen-mt-flash | $0.160 | $0.490 | $0.650 | /tools/ai-provider-pricing/alibaba--qwen-mt-flash |
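The combined column can be recomputed and ranked with a short sketch. The record list below is copied from the tables on this page, and `combined_price` is a hypothetical helper. It compares list prices only and says nothing about quality or capability.

```python
# (record name, input USD per 1M, output USD per 1M), copied from this page.
RECORDS = [
    ("Alibaba qwen3-vl-plus", 2.46, 7.37),
    ("Alibaba Qwen3.5-Flash", 0.100, 0.210),
    ("Alibaba qwen-mt-lite", 0.120, 0.360),
    ("Alibaba qwen-mt-flash", 0.160, 0.490),
]


def combined_price(input_usd: float, output_usd: float) -> float:
    """List price for 1M input plus 1M output tokens, rounded to 3 decimals."""
    return round(input_usd + output_usd, 3)


# Sort cheapest first by combined list price.
ranked = sorted(RECORDS, key=lambda r: combined_price(r[1], r[2]))

if __name__ == "__main__":
    for name, inp, out in ranked:
        print(f"{name}: ${combined_price(inp, out):.3f}")
```

Combined price weights input and output equally, so rerun the comparison with your own token mix before switching records.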
Evidence, freshness, and limitations
ByteCosts grades this Alibaba qwen3-vl-plus row A, meaning the price comes from official provider docs, where it is stated indirectly. The record was first tracked on 2026-05-30 and last checked on 2026-06-27. The source URL stays attached so you can verify the provider source before a purchasing or architecture decision.
Every number here is a planning estimate, not a quote. Real invoices can differ because taxes, negotiated and volume discounts, minimum commitments, batch and priority tiers, gateway and observability fees, and non-token charges are excluded unless they appear in the committed record. Verify the provider source and quote the last-checked date with any copied figure, because model prices change often.
Hidden costs to watch
Flags attached to this source-backed record. They do not change the list price but they change the effective bill.
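- No prompt caching: no cache-read or cache-write price is committed for this record, so plan for repeated prefixes to bill at the full input rate.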
Frequently asked questions
How much does Alibaba Qwen qwen3-vl-plus cost per 1M tokens?
qwen3-vl-plus lists at $2.46 per 1M input tokens and $7.37 per 1M output tokens in the ByteCosts index. Output is about 3.0x the input price, so completion length drives the bill. See the price card and cost examples above for current per-workload figures.
Is qwen3-vl-plus cheaper than the alternatives?
The cheaper-alternatives table lists 3 records with a lower combined list price than qwen3-vl-plus. Whether one is actually cheaper for you depends on your token mix, cache hit rate, and quality bar, so compare on your real workload in the calculator.
How fresh is this qwen3-vl-plus price and where is it from?
The price comes from Alibaba Qwen's official source, normalized into the ByteCosts pricing index, graded A confidence, and last checked 2026-06-27. Prices are list prices and exclude negotiated discounts; verify the provider source before production billing.
Cite this page
Alibaba Qwen qwen3-vl-plus API pricing. ByteCosts. Updated 2026-06-27. https://bytecosts.com/pricing/alibaba/qwen3-vl-plus/