Gemini Embedding API: Encountering "Model operations request limit per minute for a region" 429 Error - RPM Limit Confusion

yz_xiaolu · September 12, 2025, 10:51am

Hello Developers,

I’m currently getting 429 errors when using the gemini-embedding-001 model. The specific error message is as follows:

{'error': {'code': 429, 'message': "Quota exceeded for quota metric 'Read API requests' and limit 'Model operations request limit per minute for a region' of service 'generativelanguage.googleapis.com' for consumer 'project_number:327036061011'.", 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.ErrorInfo', 'reason': 'RATE_LIMIT_EXCEEDED', 'domain': 'googleapis.com', 'metadata': {'quota_location': 'us-south1', 'quota_limit_value': '200', 'quota_unit': '1/min/{project}/{region}', 'service': 'generativelanguage.googleapis.com', 'consumer': 'projects/327036061011', 'quota_metric': 'generativelanguage.googleapis.com/model_requests', 'quota_limit': 'ModelRequestsPerMinutePerProjectPerRegion'}}, {'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Request a higher quota limit.', 'url': 'https://cloud.google.com/docs/quotas/help/request_increase'}]
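For anyone who wants to check which quota is actually being hit, the relevant fields sit in the `ErrorInfo` entry under `details`. Here is a minimal sketch that pulls them out, assuming the error is available as a Python dict shaped like the one above. The helper name `extract_quota_info` is hypothetical, and the payload is a trimmed copy of the error from this post.

```python
def extract_quota_info(error_payload):
    """Return the quota fields from a 429 payload, or None if no ErrorInfo is present."""
    error = error_payload.get("error", {})
    for detail in error.get("details", []):
        # The quota metadata lives in the google.rpc.ErrorInfo detail entry.
        if detail.get("@type", "").endswith("google.rpc.ErrorInfo"):
            meta = detail.get("metadata", {})
            limit = meta.get("quota_limit_value")
            return {
                "reason": detail.get("reason"),
                "location": meta.get("quota_location"),
                "limit_value": int(limit) if limit is not None else None,
                "unit": meta.get("quota_unit"),
                "quota_limit": meta.get("quota_limit"),
            }
    return None


# Trimmed copy of the payload quoted above (message shortened).
payload = {
    "error": {
        "code": 429,
        "message": "Quota exceeded for quota metric 'Read API requests' ...",
        "status": "RESOURCE_EXHAUSTED",
        "details": [
            {
                "@type": "type.googleapis.com/google.rpc.ErrorInfo",
                "reason": "RATE_LIMIT_EXCEEDED",
                "domain": "googleapis.com",
                "metadata": {
                    "quota_location": "us-south1",
                    "quota_limit_value": "200",
                    "quota_unit": "1/min/{project}/{region}",
                    "quota_metric": "generativelanguage.googleapis.com/model_requests",
                    "quota_limit": "ModelRequestsPerMinutePerProjectPerRegion",
                },
            },
        ],
    }
}

info = extract_quota_info(payload)
print(info)
```

In this payload the limit is 200 with unit `1/min/{project}/{region}`, so the quota is counted per project and per region rather than per model.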

My Gemini API account is a Tier 1 paid account, and the documentation suggests an RPM (Requests Per Minute) of 3000 for Embedding models. However, the error shows that I’m actually hitting the “Model operations request limit per minute for a region” quota, which has a value of 200 RPM.

This is quite confusing. If this regional RPM limit of 200 is in place, how can we achieve the true 3000 RPM rate for Gemini Embedding models to support large-scale applications?

I’ve also tried upgrading to a Tier 2 account, but I’m still encountering the same 429 error.

Has anyone else experienced similar issues? Or is there an official explanation on how to overcome this regional RPM limit when using Embedding models at scale, to fully leverage the higher limits of paid accounts?

Any advice or guidance would be greatly appreciated!
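Until there is an official answer, one client-side workaround is to pace requests to the observed 200 per minute and retry 429s with exponential backoff. This does not raise the quota; it only helps a client stay under it. Below is a minimal sketch under stated assumptions: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on HTTP 429, `MinuteThrottle` and `call_with_backoff` are illustrative names, and `flaky_embed` simulates an embedding call instead of hitting the real API. The clock and sleep functions are injectable so the demo runs instantly.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the exception your client raises on HTTP 429."""


class MinuteThrottle:
    """Spaces calls at least 60 / limit_per_minute seconds apart."""

    def __init__(self, limit_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / limit_per_minute
        self.clock = clock
        self.sleep = sleep
        self.next_slot = None  # earliest time the next call may start

    def wait(self):
        now = self.clock()
        if self.next_slot is not None and now < self.next_slot:
            self.sleep(self.next_slot - now)
            now = max(self.clock(), self.next_slot)
        self.next_slot = now + self.interval


def call_with_backoff(fn, throttle, max_attempts=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn() through the throttle, retrying RateLimitError with backoff and jitter."""
    for attempt in range(max_attempts):
        throttle.wait()
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


class FakeClock:
    """Simulated clock so the demo does not actually sleep."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


# Demo: a simulated call that is rate-limited twice, then succeeds.
clock = FakeClock()
throttle = MinuteThrottle(200, clock=clock.time, sleep=clock.sleep)
calls = {"n": 0}


def flaky_embed():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("429 RESOURCE_EXHAUSTED")
    return "ok"


result = call_with_backoff(flaky_embed, throttle, sleep=clock.sleep)
```

In real use you would drop the `FakeClock` arguments so the defaults (`time.monotonic`, `time.sleep`) apply, and replace `flaky_embed` with your actual embedding call. Setting the throttle a little below 200 leaves some margin, since how the server counts its one-minute window is not documented in this thread.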

yz_xiaolu · September 12, 2025, 12:12pm

(screenshot attached)

yz_xiaolu · September 12, 2025, 12:13pm

(screenshot attached)

chunduriv · September 12, 2025, 11:14pm

Hi @yz_xiaolu,

Welcome to the Forum,

Thank you for bringing this to our attention. We appreciate you flagging this issue and will report it to the internal team.

Thank you!

goran-ranksy · October 3, 2025, 3:55pm

Hey @chunduriv, do you have any updates on this issue? I keep hitting the 429 error on the Tier 1 plan, and I’ve sent only 4 batch requests today.

Can you at least publish the rate limits so we can know and plan accordingly?
