chore(pricing): Update vertex-ai pricing #550

Open
siddharthsambharia-portkey wants to merge 36 commits into main from pricing-update/vertex-ai

siddharthsambharia-portkey commented Mar 17, 2026

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 3 |
| 🔄 Models updated (merged) | 11 |

➕ New Models

  • gemini-2.5-pro-tts
  • gemini-2.5-flash-tts
  • claude-sonnet-4-6@default

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-3-flash-preview
  • gemini-3.1-pro-preview
  • gemini-3.1-flash-lite-preview
  • gemini-3.1-flash-image-preview
  • gemini-3-pro-image-preview
  • veo-3.0-fast-generate-preview
  • veo-3.1-fast-generate-001
  • text-embedding-005
  • text-multilingual-embedding-002
  • multimodalembedding

Model-to-Pricing-Page Mapping

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 Pro | API | Standard (≤200K) input price used; tiered pricing >200K exists |
| gemini-2.5-flash | Google – Gemini 2.5 Flash | API | Standard input price |
| gemini-2.5-flash-lite | Google – Gemini 2.5 Flash-Lite | API | |
| gemini-2.5-computer-use-preview-10-2025 | Google – Gemini 2.5 Pro Computer Use | API | Maps to Gemini 2.5 Pro Computer Use pricing row |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 Flash | API | Preview alias; matched to Gemini 2.5 Flash row |
| gemini-2.5-flash-lite-preview-09-2025 | Google – Gemini 2.5 Flash-Lite | API | Preview alias; matched to Flash-Lite row |
| gemini-2.5-flash-image | Google – Gemini 2.5 Flash Image | API | Token pricing + image_token additional_unit |
| gemini-2.5-pro-tts | Google – Gemini 2.5 Pro | API – price not found | TTS model; no dedicated pricing row found |
| gemini-2.5-flash-tts | Google – Gemini 2.5 Flash | API – price not found | TTS model; no dedicated pricing row found |
| gemini-2.0-flash-001 | Google – Gemini 2.0 Flash | API | |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 Flash-Lite | API | |
| gemini-3-pro-preview | Google – Gemini 3 Pro | API | Preview; matched to Gemini 3 Pro row |
| gemini-3-flash-preview | Google – Gemini 3 Flash | API | Preview; matched to Gemini 3 Flash row |
| gemini-3.1-pro-preview | Google – Gemini 3.1 Pro | API | Preview; matched to Gemini 3.1 Pro Preview row |
| gemini-3.1-flash-lite-preview | Google – Gemini 3.1 Flash-Lite | API | Preview; matched to Gemini 3.1 Flash-Lite row |
| gemini-3.1-flash-image-preview | Google – Gemini 3.1 Flash Image | API | Preview; matched to Gemini 3.1 Flash Image row; image_token additional |
| gemini-3-pro-image-preview | Google – Gemini 3 Pro Image | API | Preview; matched to Gemini 3 Pro Image row; image_token additional |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | Per-image pricing $0.04/image |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | Per-image pricing $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | Per-image pricing $0.02/image |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | Per-image pricing $0.06/image |
| imagen-3.0-capability-001 | Google – Imagen 3 (capability) | API | Capability model; uses equivalent Imagen 3 generate price $0.04/image |
| imagen-3.0-capability-002 | Google – Imagen 3 (capability) | API | Capability model; uses equivalent Imagen 3 generate price $0.04/image |
| veo-2.0-generate-001 | Google – Veo 2 | API | Video-only $0.50/s; duration 8s, sample 1 |
| veo-3.0-generate-001 | Google – Veo 3 | API | Video-only 720p/1080p $0.20/s; duration 8s, sample 1 |
| veo-3.0-fast-generate-001 | Google – Veo 3 Fast | API | Video-only $0.10/s; duration 8s, sample 1 |
| veo-3.0-generate-preview | Google – Veo 3 | API | Preview; same pricing as Veo 3 |
| veo-3.0-fast-generate-preview | Google – Veo 3 Fast | API | Preview; same pricing as Veo 3 Fast |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | Video-only 720p/1080p $0.20/s; duration 8s, sample 1 |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | Video-only $0.10/s; duration 8s, sample 1 |
| veo-3.1-generate-preview | Google – Veo 3.1 | API | Preview; same pricing as Veo 3.1 |
| veo-3.1-fast-generate-preview | Google – Veo 3.1 Fast | API | Preview; same pricing as Veo 3.1 Fast |
| textembedding-gecko | Google – Text Embedding (legacy) | API – price not found | Legacy embedding; no pricing row found |
| text-embedding-005 | Google – Text Embedding | API | $/1K chars ($0.000025/1K chars) |
| text-multilingual-embedding-002 | Google – Text Multilingual Embedding | API | $/1K chars ($0.000025/1K chars) |
| text-embedding-large-exp-03-07 | Google – Text Embedding Large (Exp) | API – price not found | Experimental; no dedicated pricing row |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens online |
| gemini-embedding-2-preview | Google – Gemini Embedding 2 | API – price not found | Preview; no dedicated pricing row |
| multimodalembedding | Google – Multimodal Embedding | API | text $0.0002/1K chars; image $0.0001/img; video plus/standard/essential per-second |
| claude-opus-4-6 | Anthropic – Claude | API | API returns claude-opus-4-6@default; @default stripped |
| claude-sonnet-4-6 | Anthropic – Claude | API | API returns claude-sonnet-4-6@default; @default stripped |
| claude-opus-4-5@20251101 | Anthropic – Claude | API | Pinned @date kept; matches Claude Opus 4.5 pricing row |
| claude-haiku-4-5@20251001 | Anthropic – Claude | API | Pinned @date kept; matches Claude Haiku 4.5 pricing row |
| claude-sonnet-4-5@20250929 | Anthropic – Claude | API | Pinned @date kept; matches Claude Sonnet 4.5 pricing row |
| claude-opus-4-1@20250805 | Anthropic – Claude | API | Pinned @date kept; matches Claude Opus 4.1 pricing row |
| claude-sonnet-4@20250514 | Anthropic – Claude | API | Pinned @date kept; matches Claude Sonnet 4 pricing row |
| claude-opus-4@20250514 | Anthropic – Claude | API | Pinned @date kept; matches Claude Opus 4 pricing row (no cache listed) |
| gpt-oss-120b-maas | OpenAI – GPT OSS | API | Matches gpt-oss-120b row; $0.09/$0.36 with batch |
| llama-3.3-70b-instruct-maas | Meta – Llama | API | Matches Llama 3.3 70B row; $0.72/$0.72 with batch |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 Maverick | API | Matches Llama 4 Maverick row; $0.35/$1.15 with batch |
| mistral-small-2503 | Mistral AI – Small | API | Matches Mistral Small 3.1 (25.03) row; $0.10/$0.30 |
| mistral-medium-3 | Mistral AI – Medium | API | Matches Mistral Medium 3 row; $0.40/$2.00 |
| codestral-2 | Mistral AI – Codestral | API | Matches Codestral 2 row; $0.30/$0.90 |
| deepseek-r1-0528-maas | DeepSeek – R1 | API | Matches DeepSeek-R1 (0528) row; $1.35/$5.40 with batch |
| deepseek-v3.1-maas | DeepSeek – V3.1 | API | Matches DeepSeek-V3.1 row; $0.60/$1.70 with cache and batch |
| deepseek-v3.2-maas | DeepSeek – V3.2 | API | Matches DeepSeek-V3.2 row; $0.56/$1.68 with cache and batch |
| qwen3-235b-a22b-instruct-2507-maas | Qwen – Qwen3 235B | API | Matches Qwen3-235B-A22B-Instruct-2507 row; $0.22/$0.88 with batch |
| qwen3-coder-480b-a35b-instruct-maas | Qwen – Qwen3 Coder | API | Matches Qwen3-Coder-480B row; $0.22/$1.80 with cache and batch |
| qwen3-next-80b-a3b-instruct-maas | Qwen – Qwen3 Next 80B Instruct | API | Matches Qwen3-Next-80B-Instruct row; $0.15/$1.20 |
| qwen3-next-80b-a3b-thinking-maas | Qwen – Qwen3 Next 80B Thinking | API | Matches Qwen3-Next-80B-Thinking row; $0.15/$1.20 |
| minimax-m2-maas | MiniMax – M2 | API | Matches MiniMax-M2 row; $0.30/$1.20 with cache |
| kimi-k2-thinking-maas | MoonshotAI – Kimi K2 | API | Matches Kimi-K2-Thinking row; $0.60/$2.50 with cache |
| glm-4.7-maas | ZAI.org – GLM | API | Matches GLM-4.7 row; $0.60/$2.20 |
| glm-5-maas | ZAI.org – GLM | API | Matches GLM-5 row; $1.00/$3.20 with cache |
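
As a rough illustration of how the per-second and per-1K-character rates in the Notes column translate into a request cost, here is a minimal Python sketch. The rates are copied from the table; the function names, the dictionary layout, and the reading of "duration 8s, sample 1" as one 8-second video per request are assumptions made for illustration, not the pricing agent's actual code.

```python
from decimal import Decimal

# Per-second video rates copied from the Notes column (USD).
VEO_RATE_PER_SECOND = {
    "veo-2.0-generate-001": Decimal("0.50"),
    "veo-3.0-generate-001": Decimal("0.20"),
    "veo-3.0-fast-generate-001": Decimal("0.10"),
}

# Rate listed for text-embedding-005 (USD per 1K characters).
TEXT_EMBEDDING_RATE_PER_1K_CHARS = Decimal("0.000025")


def video_cost(model: str, duration_s: int = 8, samples: int = 1) -> Decimal:
    """Cost of one request, assuming rate x duration x number of samples."""
    return VEO_RATE_PER_SECOND[model] * duration_s * samples


def embedding_cost(num_chars: int) -> Decimal:
    """Cost of embedding num_chars characters at the per-1K-character rate."""
    return TEXT_EMBEDDING_RATE_PER_1K_CHARS * Decimal(num_chars) / Decimal(1000)


print(video_cost("veo-2.0-generate-001"))  # 4.00
print(embedding_cost(1_000_000))
```

`Decimal` is used instead of `float` so that very small rates such as $0.000025 are not distorted by binary rounding.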

Excluded Models (not added)

| Model | Publisher | Reason |
| --- | --- | --- |
| imagegeneration | Google | Legacy; superseded by Imagen 3+ |
| virtual-try-on-001 | Google | Product-specific retail model |
| gemini-live-2.5-flash-native-audio | Google | `*-live-*` streaming model |
| lyria-002, lyria-3-pro-preview, lyria-3-clip-preview | Google | Music generation; no inference endpoint |
| All `gemma*`, paligemma, codegemma, etc. | Google | Self-deploy only (no -maas suffix) |
| All non-generative CV/NLP models (efficientnet, BERT, ResNet, etc.) | Google | Non-generative ML |
| pretrained-ocr | Google | OCR model |
| chirp-2, chirp-3 | Google | Audio models; not generative text inference |
| translate-llm, text-translation | Google | Non-generative translation |
| video-text-detection, video-speech-transcription | Google | Non-generative video/audio analysis |
| weathernext, weather-next-v2 | Google | Forecasting model; not generative AI inference |
| shieldgemma2 | Google | Safety/guard model |
| `earth-ai-imagery-*` | Google | Specialized Earth observation models |
| clip-vit-base-patch32, openclip | OpenAI | Non-generative (CLIP embedding) |
| whisper-large | OpenAI | Audio transcription (not generative) |
| gpt-oss | OpenAI | Self-deploy only (no -maas) |
| segment-anything, sam3, faster-r-cnn, retinanet, mask-r-cnn | Meta | Non-generative CV (segmentation/detection) |
| xlm-roberta-large, roberta-large | Meta | Non-generative NLP |
| imagebind | Meta | Multimodal embedding (self-deploy) |
| nllb | Meta | Non-generative translation |
| llama-guard, prompt-guard | Meta | Guard/safety models |
| llama2, llama3, llama3_1, llama3-2, llama3-3, llama4, llama-2-quantized, codellama-7b-hf | Meta | Self-deploy only (no -maas) |
| mistral, mixtral | Mistral-AI (self-deploy publisher) | Self-deploy only |
| codestral-2501-self-deploy | Mistralai | Self-deploy only |
| mistral-ocr-2505 | Mistralai | OCR model |
| ministral-3, mistral-large-3 | Mistralai | Self-deploy only |
| deepseek-r1, deepseek-v3, deepseek-v3-1, deepseek-v3-2 | DeepSeek-AI | Self-deploy only |
| deepseek-ocr, deepseek-ocr-2, deepseek-ocr-maas | DeepSeek-AI | OCR models |
| qwq, qwen3, qwen3-5, qwen2, qwen3-coder, qwen3-coder-next, qwen3-next, qwen3-embedding, qwen3-vl | Qwen | Self-deploy only |
| qwen-image | Qwen | Excluded by policy (image generation) |
| jamba-large-1.6 | AI21 | Self-deploy only |
| kimi-k2, kimi-k2-5 | MoonshotAI | Self-deploy only |
| minimax-m2 | MiniMax | Self-deploy only |
| glm-4.7, glm-5, glm-4.5 | ZAI.org | Self-deploy only |
| glm-ocr | ZAI.org | OCR model |
| glm-image | ZAI.org | Excluded by policy (image generation) |
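
Some of the reasons in the table above follow naming patterns (`-live-`, `ocr`, a missing `-maas` suffix). The sketch below shows one way such rules could be expressed in Python. It is a hypothetical paraphrase of three rows of the Reason column, not the Pricing Agent's actual logic; the function name, the `open_weight` flag, and the rule order are all assumptions.

```python
import re
from typing import Optional

# Matches "ocr" as a whole hyphen-delimited token, e.g. "glm-ocr" or "mistral-ocr-2505".
OCR_TOKEN = re.compile(r"(^|-)ocr(-|$)")


def exclusion_reason(model_id: str, open_weight: bool = False) -> Optional[str]:
    """Return a reason to skip model_id, or None if no rule here applies.

    open_weight is a hypothetical flag marking publishers whose models are
    served either self-deployed or through a managed "-maas" endpoint.
    """
    if "-live-" in model_id:
        return "*-live-* streaming model"
    if OCR_TOKEN.search(model_id):
        # Checked before the suffix rule: deepseek-ocr-maas is still excluded.
        return "OCR model"
    if open_weight and not model_id.endswith("-maas"):
        return "Self-deploy only (no -maas)"
    return None


for mid, ow in [("gpt-oss", True), ("gpt-oss-120b-maas", True), ("deepseek-ocr-maas", True)]:
    print(mid, "->", exclusion_reason(mid, open_weight=ow))
```

The suffix rule is gated on `open_weight` because included models such as `codestral-2` and `gemini-2.5-pro` have no `-maas` suffix either, so suffix absence alone cannot be the criterion.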

Generated by Pricing Agent on 2026-03-31
