System instruction and implicit caching question

komin · November 4, 2025, 10:18am

Hey everyone,

I’m building a product using the Gemini API, and I’m really hoping to leverage implicit caching to reduce the (very) high API costs. However, there’s not much detailed documentation about how it actually works, so I wanted to ask here in case anyone knows.

Specifically — does the system instruction (the part that’s fixed at the beginning of the prompt) count as part of what’s being cached implicitly? Or is it treated separately and excluded from implicit caching?

Any clarification would be super appreciated. Thanks!

Mrinal_Ghosh · November 10, 2025, 6:50am

Hi @komin ,

Implicit caching is enabled by default for all Gemini 2.5 models. The system instruction counts as part of the cached prefix.
Please refer to https://ai.google.dev/gemini-api/docs/caching?lang=node#implicit-caching for details.
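One way to confirm this on your own traffic is to read the usage metadata on each response. Below is a minimal sketch, assuming the `google-genai` Python SDK and its `usage_metadata.cached_content_token_count` and `prompt_token_count` fields. The helper names `cache_hit_ratio` and `ask` are hypothetical, and the model name is only an example.

```python
from types import SimpleNamespace


def cache_hit_ratio(usage) -> float:
    """Return the fraction of prompt tokens reported as served from cache.

    `usage` is expected to look like the SDK's usage metadata object;
    missing or None fields are treated as zero.
    """
    prompt = getattr(usage, "prompt_token_count", None) or 0
    cached = getattr(usage, "cached_content_token_count", None) or 0
    return cached / prompt if prompt else 0.0


def ask(client, question: str, system_instruction: str,
        model: str = "gemini-2.5-pro"):
    """Send one request with a fixed system instruction.

    Returns (response text, cache hit ratio). Hypothetical helper:
    `client` is assumed to be a `google.genai.Client` instance.
    """
    # Imported lazily so cache_hit_ratio() can be used without the SDK.
    from google.genai import types

    response = client.models.generate_content(
        model=model,
        contents=question,
        config=types.GenerateContentConfig(
            system_instruction=system_instruction,
        ),
    )
    return response.text, cache_hit_ratio(response.usage_metadata)


# Offline illustration with a stand-in for the usage metadata object.
example_usage = SimpleNamespace(
    prompt_token_count=5000,
    cached_content_token_count=4000,
)
print(cache_hit_ratio(example_usage))
```

With a real client, calling `ask` twice in quick succession with the same long system instruction and different questions should show a non-zero ratio on the second call if the shared prefix was cached. A ratio of zero on every call suggests the prefix is too short or is changing between requests.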

Let me know if you have any further questions.

komin · November 11, 2025, 8:20am

Hi, thanks for your answer. Does Gemini 2.5 Pro require at least 4,096 tokens or 2,048 tokens for implicit caching to work? I’ve seen some documents mentioning 2,048 and others mentioning 4,096. Also, are there any troubleshooting steps I can take if I’ve already met all the requirements but implicit caching still doesn’t seem to activate? That seems to be my case.
I might have to use explicit caching if there are no troubleshooting steps available.
Thanks a lot.
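For troubleshooting, one thing that can be checked locally is whether consecutive requests really share a long enough prefix, since anything that changes early in the prompt (a timestamp or request ID in the system instruction, for example) shortens the part that can be reused. The sketch below is a rough diagnostic, not an official tool. The function names are hypothetical, the 4-characters-per-token ratio is only a crude assumption, and `min_tokens` is left as a parameter because the applicable threshold (2,048 or 4,096) is exactly what is unclear here.

```python
import os


def shared_prefix_len(a: str, b: str) -> int:
    """Length in characters of the common prefix of two prompts."""
    return len(os.path.commonprefix([a, b]))


def diagnose(prev_prompt: str, next_prompt: str, min_tokens: int,
             chars_per_token: float = 4.0) -> dict:
    """Estimate whether two consecutive prompts share a cacheable prefix.

    chars_per_token is a rough assumption; use a real token count
    for anything close to the threshold.
    """
    shared = shared_prefix_len(prev_prompt, next_prompt)
    estimated_tokens = int(shared / chars_per_token)
    return {
        "shared_chars": shared,
        "estimated_shared_tokens": estimated_tokens,
        "likely_cacheable": estimated_tokens >= min_tokens,
        # First few characters where the second prompt diverges.
        "diverges_at": next_prompt[shared:shared + 40],
    }


# A stable block of instructions (placeholder text for illustration).
stable = "You are a support assistant. " * 400

# Variable data placed first: the prompts diverge almost immediately.
bad_a = "Current time: 2025-11-11T08:20:00\n" + stable + "Question one"
bad_b = "Current time: 2025-11-11T08:21:00\n" + stable + "Question two"

# Variable data placed last: the long stable block is shared.
good_a = stable + "\nCurrent time: 2025-11-11T08:20:00\nQuestion one"
good_b = stable + "\nCurrent time: 2025-11-11T08:21:00\nQuestion two"

print(diagnose(bad_a, bad_b, min_tokens=2048)["likely_cacheable"])
print(diagnose(good_a, good_b, min_tokens=2048)["likely_cacheable"])
```

If the estimate lands near the threshold, the SDK's `client.models.count_tokens` call can give an exact count for the shared part. If the prefix is long and stable but the cached token count in the response stays at zero, explicit caching is the fallback.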
