
Gemini models are now available in Batch Mode
Today, we’re excited to introduce batch mode in the Gemini API, a new asynchronous endpoint designed specifically for high-throughput, non-latency-critical workloads. The Gemini API Batch Mode lets you submit large jobs, offload the scheduling and processing, and retrieve your results within 24 hours, all at a 50% discount compared to our synchronous APIs.
Process more for less
Batch Mode is the perfect tool for any task where you have your data ready upfront and don’t need an immediate response. By separating these large jobs from your real-time traffic, you unlock three key benefits:
- Cost savings: Batch jobs are priced at 50% less than the standard rate for a given model
- Higher throughput: Batch Mode has even higher rate limits
- Easy API calls: No need to manage complex client-side queuing or retry logic. Available results are returned within a 24-hour window.
A simple workflow for large jobs
We’ve designed the API to be simple and intuitive. You package all of your requests into a single file, submit it, and retrieve your results once the job is complete. Here are some ways developers are leveraging Batch Mode for tasks today:
- Bulk content generation and processing: Focusing on deep video understanding, Reforged Labs uses Gemini 2.5 Pro to analyze and label massive quantities of video ads monthly. Implementing Batch Mode has transformed their operations by significantly cutting costs, accelerating client deliverables, and enabling the large-scale processing needed for meaningful market insights.
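The workflow starts with that single file of requests: a JSONL file with one request per line, each carrying a key so you can match results back to prompts. A minimal sketch of building such a file (the prompts and filename here are illustrative):

```python
import json

# Each line of the JSONL file is one keyed request in the Gemini API
# request format: {"key": ..., "request": {"contents": [...]}}
prompts = {
    "request_1": "Explain how AI works in a few words",
    "request_2": "Explain how quantum computing works in a few words",
}

with open("batch_requests.json", "w") as f:
    for key, text in prompts.items():
        line = {"key": key, "request": {"contents": [{"parts": [{"text": text}]}]}}
        f.write(json.dumps(line) + "\n")
```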
Get started in just a few lines of code
You can start using Batch Mode today with the Google GenAI Python SDK:
from google import genai

client = genai.Client()

# Create a JSONL that contains these lines:
# {"key": "request_1", "request": {"contents": [{"parts": [{"text": "Explain how AI works in a few words"}]}]}},
# {"key": "request_2", "request": {"contents": [{"parts": [{"text": "Explain how quantum computing works in a few words"}]}]}}

uploaded_batch_requests = client.files.upload(file="batch_requests.json")

batch_job = client.batches.create(
    model="gemini-2.5-flash",
    src=uploaded_batch_requests.name,
    config={
        'display_name': "batch_job-1",
    },
)

print(f"Created batch job: {batch_job.name}")

# Wait for up to 24 hours
if batch_job.state.name == 'JOB_STATE_SUCCEEDED':
    result_file_name = batch_job.dest.file_name
    file_content_bytes = client.files.download(file=result_file_name)
    file_content = file_content_bytes.decode('utf-8')
    for line in file_content.splitlines():
        print(line)
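The success check above assumes the job has already finished; since a job may take up to 24 hours, in practice you would poll its state periodically before reading results. A minimal sketch of that polling loop (the `wait_for_batch` helper and its interval are illustrative, not part of the SDK; with the GenAI client you would pass something like `lambda: client.batches.get(name=batch_job.name)`):

```python
import time

# Illustrative helper: poll until the batch job reaches a terminal state.
# `get_job` is any zero-argument callable returning the latest job object.
def wait_for_batch(get_job, poll_seconds=60):
    terminal = {'JOB_STATE_SUCCEEDED', 'JOB_STATE_FAILED', 'JOB_STATE_CANCELLED'}
    job = get_job()
    while job.state.name not in terminal:
        time.sleep(poll_seconds)
        job = get_job()
    return job
```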
To learn more, check out the official documentation and pricing pages.
We’re rolling out Batch Mode for the Gemini API today and tomorrow to all users. This is just the start for batch processing, and we’re actively working on expanding its capabilities. Stay tuned for more powerful and flexible options!