OpenAI prompt caching (AI Gateway)

Two-step demo: warm a stable prefix, then reuse it via promptCacheKey.

Step 1: Warm cache
Send the stable system prefix (1,024+ tokens) so that it is written to the cache.

Step 2: Reuse cache
Send the same prefix again with a new question; expect the response to report cached prompt tokens.

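This agent runs twice with the same long system prompt and the same promptCacheKey. Step 1 populates OpenAI's prompt cache with that prefix. Step 2 sends an identical prefix followed by a different question, so the cached part of the input does not have to be reprocessed and latency should drop. Only the leading, unchanged portion of the prompt can be reused, which is why the stable text comes first and the question comes last.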
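The two calls can be sketched as a pair of requests that differ only in the user question. This is a minimal illustration, not the demo's actual code: the request shape, the model id, the cache key value, and the placeholder prefix text are all assumptions, and the four-characters-per-token estimate is only a rough heuristic for checking that the prefix is likely above the 1,024-token threshold.

```typescript
// Hypothetical request shape for this sketch; the real SDK or gateway call may differ.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface DemoRequest {
  model: string;
  messages: ChatMessage[];
  promptCacheKey: string;
}

// Placeholder for the long, stable system prompt (assumed content).
const STABLE_PREFIX: string =
  "Policy section placeholder text for the demo. ".repeat(120);

// Rough heuristic only: about 4 characters per token for English text.
function roughTokenCount(text: string): number {
  return Math.ceil(text.length / 4);
}

// Both steps share the same prefix and the same cache key;
// only the trailing user question changes.
function buildRequest(question: string): DemoRequest {
  return {
    model: "openai/gpt-4o-mini", // assumed model id
    messages: [
      { role: "system", content: STABLE_PREFIX }, // stable part goes first
      { role: "user", content: question }, // variable part goes last
    ],
    promptCacheKey: "support-policy-demo-v1", // assumed key value
  };
}

const step1 = buildRequest(
  "Summarize the refund policy for annual plans in two short bullets."
);
const step2 = buildRequest(
  "What is the enterprise support SLA for severity-1 incidents? One sentence."
);
```

Keeping the prefix in a single constant makes it hard to introduce an accidental difference (a timestamp, a reordered sentence) that would cause the second call to miss the cache.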

Step 1 asks

Summarize the refund policy for annual plans in two short bullets.

Step 2 asks

What is the enterprise support SLA for severity-1 incidents? One sentence.
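To check whether Step 2 actually reused the prefix, one option is to compare the cached-token count reported with each response. The sketch below assumes a usage object shaped like the one in OpenAI's Chat Completions responses (prompt_tokens and prompt_tokens_details.cached_tokens); a gateway or SDK may expose the same number under a different name. The sample numbers are illustrative, not measured results.

```typescript
// Assumed usage shape, modeled on OpenAI's Chat Completions response.
interface Usage {
  prompt_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Cached prompt tokens, treating a missing field as zero.
function cachedTokens(usage: Usage): number {
  return usage.prompt_tokens_details?.cached_tokens ?? 0;
}

// Fraction of the prompt that was served from the cache.
function cacheHitRatio(usage: Usage): number {
  if (usage.prompt_tokens === 0) return 0;
  return cachedTokens(usage) / usage.prompt_tokens;
}

// Illustrative values only: a cold first call and a warm second call.
const step1Usage: Usage = {
  prompt_tokens: 1400,
  prompt_tokens_details: { cached_tokens: 0 },
};
const step2Usage: Usage = {
  prompt_tokens: 1404,
  prompt_tokens_details: { cached_tokens: 1280 },
};

console.log("step 1 cached tokens:", cachedTokens(step1Usage));
console.log("step 2 cached tokens:", cachedTokens(step2Usage));
console.log("step 2 hit ratio:", cacheHitRatio(step2Usage).toFixed(2));
```

A zero in Step 1 and a nonzero count in Step 2 is the pattern the demo is looking for. A zero in both steps usually means the prefix changed between calls or was shorter than the caching threshold.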