Google (GOOGL.US) updates Gemini API pricing, with billing based on usage tiers

14:13 03/04/2026
GMT Eight
Google recently updated the pricing tiers of the Gemini API, offering optimization options and pricing matched to actual inference usage needs.
Google (GOOGL.US) recently updated the billing tiers for the Gemini API, offering optimization options and pricing matched to actual inference usage requirements. The new inference service tiers are Standard, Flexible, Priority, Batch, and Caching.

Google stated: "Gemini API provides multiple optimization mechanisms that can balance running speed, usage costs, and service stability based on specific business load requirements. Whether you are building real-time chatbots or running large-scale offline data processing workflows, choosing the right operating mode can significantly reduce costs or improve operational efficiency."

The Flexible inference tier uses idle off-peak computing resources to offer a 50% discount off the standard price, with a target latency of 1 to 15 minutes but no latency guarantee. The Batch API tier also offers a 50% discount off the standard rate, with latency of up to 24 hours.

Billing for the Caching tier is based on the number of cached tokens and the storage duration. It is recommended for scenarios such as chatbots with complex system instructions, repeated analysis of long video files, and queries over large document collections.

The Priority tier is priced 75% to 100% higher than the standard rate, with latency ranging from milliseconds to seconds. Google recommends this tier for scenarios such as real-time customer service chatbots, real-time fraud detection, and business-critical intelligent assistants.
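To make the reported price differences concrete, the following is a minimal sketch of how the tier multipliers described above could be applied to a workload. The function name `estimate_cost`, the tier keys, and the standard price used in the usage note are illustrative assumptions and do not come from Google's published rates or SDK. The Caching tier is left out because it is billed by cached tokens and storage duration rather than by a multiplier on the standard rate.

```python
# Price multipliers relative to the Standard rate, as reported in this article.
# Priority is given as a range (75% to 100% above Standard), so every tier
# maps to a (low, high) pair. Tier keys are illustrative, not official API values.
TIER_MULTIPLIERS = {
    "standard": (1.0, 1.0),
    "flexible": (0.5, 0.5),   # 50% discount, target latency 1 to 15 minutes
    "batch": (0.5, 0.5),      # 50% discount, latency up to 24 hours
    "priority": (1.75, 2.0),  # 75% to 100% premium
}


def estimate_cost(tokens, standard_price_per_million, tier):
    """Return a (low, high) cost estimate for a token volume on a given tier.

    standard_price_per_million is a hypothetical Standard-tier price per
    one million tokens, supplied by the caller.
    """
    low, high = TIER_MULTIPLIERS[tier]
    base = tokens / 1_000_000 * standard_price_per_million
    return (base * low, base * high)
```

For example, with a hypothetical standard price of 2.00 per million tokens, `estimate_cost(10_000_000, 2.0, "batch")` gives `(10.0, 10.0)`, while the same workload on the Priority tier gives `(35.0, 40.0)`.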