10 Tips to Stop Claude Code From Burning Your Money and Ignoring Your Instructions
By SteveLo — Developer of ccusage_go
Everyone’s raving about how Claude Code + Codex CLI is the dream combo. “Use Claude to build, Codex to review!” “Two AIs are better than one!”
I hate to break it to you — but the reason this works has nothing to do with Codex being smarter. It’s because Codex gives you a fresh context with zero cache.
That’s right. Your Claude Code isn’t getting dumber because the model is bad. It’s getting dumber because the cache architecture is slowly poisoning it. And Codex “fixes” it simply by not having that problem.
Here are 10 tips based on months of JSONL analysis with ccusage_go.
Full technical deep-dive: The Cache Trap
1. Your “Dumb Claude” Isn’t Dumb — It’s Cache-Poisoned
You know the feeling. Claude Code starts a session brilliant — nailing every task, following every instruction. Two hours later, it’s ignoring your rules, skipping steps, and confidently telling you “all 1080 tests passed” when it never ran a single one.
One Medium post literally titled “My Claude Code Got Dumb” describes exactly this. The author’s fix? Switch to Codex CLI.
But here’s what actually happened: after 400+ turns, the session’s cumulative cache reads topped 90M tokens of reinforced behavioral patterns. Your new instruction (“read each file”) is 50 tokens competing against 90M tokens’ worth of “just grep it and move on.” The cache wins. Every time.
Codex doesn’t win because it’s smarter. It wins because it starts with zero cache.
The fix: Don’t switch tools. Just start a new Claude Code session. Same effect, no Codex needed.
2. “Claude + Codex Is Amazing!” — You’re Actually Just Escaping the Cache
The hottest workflow in 2026: Claude Code builds, Codex reviews. Everyone swears by it. Developers on DEV.to, Medium, and UX Collective all report magical results.
Here’s the dirty secret: any fresh agent reviewing stale-cache work will look amazing. It’s not Claude + Codex synergy. It’s fresh context + zero cache = the model actually reads your instructions.
One dev built an entire sub-agent framework around this principle: “give each agent a fresh context, and orchestrate them through structure.” He thought he invented a brilliant architecture. He actually just invented “don’t let cache accumulate.”
Pro move: Instead of paying for Codex, run Claude Code in short sessions. Use one session to plan, close it, open a new one to implement. Same “two minds” effect, zero extra cost.
3. CLAUDE.md Is Costing You Money and Making Claude Worse
Boris Cherny’s #1 tip: “Every time Claude makes a mistake, add it to CLAUDE.md.”
ETH Zurich’s #1 finding (Feb 2026): LLM-generated CLAUDE.md files decreased success rates and increased costs by 20%. Claude Code was the only agent where even human-written files failed to improve performance vs. having no file at all.
Why? Every token in CLAUDE.md gets re-sent as Cache Read on every single turn. A 15K-token CLAUDE.md × 100 messages = 1.5M cache read tokens burned on static instructions that don’t reflect your project’s current state.
And here’s the kicker — Anthropic’s own system prompt wraps your CLAUDE.md with: “this context may or may not be relevant.” They literally tell the model it can ignore your rules.
The fix: Delete it. Or keep it under 20 lines. Put critical instructions in your prompt, not a file. Your wallet and your model will both thank you.
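The arithmetic behind that 1.5M-token figure fits in a few lines. The pricing here is an assumption (roughly Opus-class rates at the time of writing: $15 per million input tokens, with cache reads billed at about 10% of that); plug in your model’s current numbers.

```python
# Back-of-envelope cost of re-reading a static CLAUDE.md on every turn.
# ASSUMED pricing (Opus-class, per million tokens): input $15,
# cache reads billed at ~10% of the input rate.
CACHE_READ_PER_MTOK = 15.00 * 0.10  # ~$1.50 per million cached tokens read

claude_md_tokens = 15_000   # a 15K-token CLAUDE.md
turns = 100                 # messages in the session

total_cache_read = claude_md_tokens * turns  # re-sent every single turn
cost = total_cache_read / 1_000_000 * CACHE_READ_PER_MTOK

print(f"{total_cache_read:,} cache-read tokens ≈ ${cost:.2f} per session")
# → 1,500,000 cache-read tokens ≈ $2.25 per session
```

A couple of dollars per session sounds small until you multiply by every session, every day, on instructions the model is told it "may or may not" need.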
4. The 5-Minute Coffee Break That Costs $150
Cache entries expire after 5 minutes of inactivity. When you come back, the entire context gets rebuilt at 125% of base input price.
My session data showed: a 36-minute break triggered a 384K token rebuild. A 1.5-hour lunch break triggered two rebuilds totaling 690K tokens. One of those rebuilds was billed three times in three seconds for the same operation.
Boris Cherny promotes: “Start sessions from your phone in the morning, check in later.”
Translation: Wake up → cache expired overnight → 125% rebuild of your entire context → good morning, that’ll be $150.
The fix: Going for coffee? Close the session. Start fresh when you return. A cache write on a fresh, clean context costs around $0.08. A rebuild on a 400K-token context costs $7 to $150. The math isn’t hard.
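Here is that comparison worked out. The rates are assumptions (Opus-class base input around $15 per million tokens, cache writes billed at 125% of base input), and the 4,300-token “clean context” size is an illustrative guess for a system prompt plus a short task.

```python
# Fresh-session write vs. expired-cache rebuild, both billed at cache-write rates.
# ASSUMED pricing (Opus-class): $15 per million base input tokens;
# cache writes are billed at 125% of the base input price.
BASE_INPUT_PER_MTOK = 15.00
CACHE_WRITE_PER_MTOK = BASE_INPUT_PER_MTOK * 1.25  # $18.75 per million tokens

def cache_write_cost(tokens: int) -> float:
    """Dollar cost of writing `tokens` tokens into the prompt cache."""
    return tokens / 1_000_000 * CACHE_WRITE_PER_MTOK

fresh = cache_write_cost(4_300)      # new session: system prompt + short task (assumed size)
rebuild = cache_write_cost(400_000)  # expired cache: your entire context, again

print(f"fresh session: ${fresh:.2f}, expired-cache rebuild: ${rebuild:.2f}")
```

The rebuild is nearly two orders of magnitude more expensive than starting clean, and that is before any repeat-billing bugs like the triple charge above.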
5. Why Codex Developers Never Complain About “Hallucinations”
Notice something? The Claude Code GitHub has hundreds of issues about hallucinations, fabricated outputs, and instruction-ignoring. The Codex CLI? Barely any.
Same-caliber models. Same types of tasks. So what’s different?
Codex sessions are naturally shorter. Codex’s sandboxing and workflow encourage discrete tasks with clean starts. Claude Code encourages marathon sessions with persistent context. Marathon sessions = massive cache = degraded behavior.
One Claude Code issue (#7824) raged: “CLAUDE has become a liar!!! Repeatedly faking data, falsifying results!” That user had been running extended sessions for 45 days.
Another (#20051) found that hallucinations occur with 100% predictability after mid-task context compaction. The workaround? Pasting “NO. YOUR PLAN IS NOT APPROVED AND WILL CAUSE YOU TO HALLUCINATE” into every approval step.
The fix: Treat Claude Code like Codex — short, focused sessions. One task, one session. The model is the same. The cache is the difference.
6. Your $200/Month Max Plan Buys You 3 Real Sessions
I tracked one session with ccusage_go:
- API Cost (actual work): $1.47
- Cache overhead: $63.51
- Total billed: $64.98
97.7% of the cost was cache. My Max 20x subscription is $200/month. That’s roughly 3 sessions before the math stops working.
Meanwhile Boris Cherny runs “5 parallel terminal sessions + 5-10 browser sessions + phone sessions” on an internal Anthropic account with no quota limits. The workflow he promotes would cost paying users $1,000+ per day.
The fix: Install ccusage_go and run ./ccusage_go blocks. Look at the API Cost column vs the Cost column. Then decide if your subscription is worth it.
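If you want to sanity-check the split without installing anything, the logs are plain JSONL. A minimal sketch with inline sample turns, assuming the usage fields Claude Code currently writes under `message.usage` (`input_tokens`, `output_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`) and illustrative Opus-class prices; verify both field names and rates against your own logs before trusting the numbers.

```python
import json

# ASSUMED per-million-token prices (Opus-class); adjust to your model.
PRICE = {"input": 15.0, "output": 75.0, "cache_write": 18.75, "cache_read": 1.50}

# Three sample turns shaped like the records in ~/.claude/projects/**/*.jsonl
# (the `message.usage` field names are an assumption; check your own files).
sample_lines = """\
{"message": {"usage": {"input_tokens": 12, "output_tokens": 900, "cache_creation_input_tokens": 45000, "cache_read_input_tokens": 0}}}
{"message": {"usage": {"input_tokens": 8, "output_tokens": 1200, "cache_creation_input_tokens": 3000, "cache_read_input_tokens": 45000}}}
{"message": {"usage": {"input_tokens": 15, "output_tokens": 700, "cache_creation_input_tokens": 2000, "cache_read_input_tokens": 48000}}}
""".splitlines()

api = cache = 0.0
for line in sample_lines:
    u = json.loads(line)["message"]["usage"]
    api += u["input_tokens"] / 1e6 * PRICE["input"]          # real work in
    api += u["output_tokens"] / 1e6 * PRICE["output"]        # real work out
    cache += u["cache_creation_input_tokens"] / 1e6 * PRICE["cache_write"]
    cache += u["cache_read_input_tokens"] / 1e6 * PRICE["cache_read"]

total = api + cache
print(f"API: ${api:.2f}  cache: ${cache:.2f}  overhead: {cache / total:.0%}")
```

Even in this tiny three-turn sample, cache dwarfs the actual input/output cost, and the gap only widens as a real session accumulates turns.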
7. “Add More Rules” Is the Most Expensive Advice in AI
When Claude breaks your rules, the instinct is to add more. More CLAUDE.md entries. More Skills. More Hooks. More checklist items.
I built a 6-layer constraint system: SKILL.md + checklist + hooks + error log + CLAUDE.md rules + .claude/rules/ files. Claude acknowledged every rule, promised to follow them, then violated them 3+ times in the same session.
When I asked why, Claude said: “I don’t even know why. I have no answer.”
Each new rule adds tokens that get cached and re-sent every turn. But against 90M tokens of cached behavioral momentum, your 500-token rule is a whisper in a hurricane.
The fix: Don’t add rules. Start a fresh session. A clean cache with a clear one-sentence instruction beats a bloated CLAUDE.md with 50 rules every single time.
8. Run Codex Inside Claude Code — The Best of Both Worlds
Here’s a genuinely useful trick: use Claude Code’s Skill system to run Codex CLI in headless mode. One dev built a /run-codex skill that shells out to codex exec for non-interactive review.
Why this works beautifully:
- Codex runs in a completely separate context — no cache contamination
- Claude Code handles the orchestration and pretty UI
- Codex does the actual code review with fresh eyes
- Results feed back into Claude Code’s session
This is the “Claude + Codex” workflow done right — not as two competing tools, but as architect + reviewer, with cache isolation built in.
Even better: Use Claude Code subagents for the same effect without Codex. Subagents get their own context window. Same cache isolation, no extra tool needed.
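The shell-out itself is trivial. A sketch of the review step, assuming `codex exec <prompt>` is your Codex CLI’s non-interactive mode (true for recent releases; confirm with `codex exec --help`). The function name and prompt wording are mine, not from the skill mentioned above.

```python
import shutil
import subprocess

def fresh_codex_review(diff_text: str) -> str:
    """Run a code review in a completely separate process and context.

    ASSUMPTION: `codex exec <prompt>` runs the Codex CLI non-interactively
    and prints the result to stdout; verify against your installed version.
    """
    prompt = "Review this diff for bugs and unfollowed instructions:\n" + diff_text
    cmd = ["codex", "exec", prompt]
    if shutil.which("codex") is None:
        # Degrade gracefully on machines without the CLI installed.
        return "(codex CLI not installed; would run: codex exec ...)"
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=600)
    return result.stdout

print(fresh_codex_review("--- a/main.go\n+++ b/main.go\n..."))
```

Because the subprocess starts cold, the reviewer sees none of the builder session’s cached momentum, which is the whole point.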
9. The Context Monitor Trick That Saves Everything
Run /context right now. You’ll see exactly how much of your context window is consumed and by what.
Most developers never check until things go wrong. By then, it’s too late — the model is already operating in degraded mode, and auto-compaction fires when quality has already tanked.
Community testing shows quality drops noticeably around 60% context utilization. And here’s the fun part: my statusline showed “ctx 220%, 0% remaining” while /context showed 24%. Three different UIs, three different numbers. Nobody agrees on reality.
The fix: Check /context every 30 minutes. At 60%, wrap up your current task. At 75%, close and restart. Don’t trust the statusline — it might be calculating against the wrong window size. And definitely don’t wait for auto-compaction. Compaction is lossy and triggers expensive 125% cache rebuilds.
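That fix is mechanical enough to encode as a policy. A sketch, assuming a 200K-token window (an assumption; your model’s actual window may differ, and as noted above, don’t trust the statusline’s idea of it):

```python
# Map /context utilization to an action, per the rule of thumb above:
# keep working below 60%, wrap up at 60%, close and restart at 75%.
def session_action(used_tokens: int, window_tokens: int = 200_000) -> str:
    """Return what to do given current context usage (window size is assumed)."""
    utilization = used_tokens / window_tokens
    if utilization >= 0.75:
        return "close and restart"
    if utilization >= 0.60:
        return "wrap up current task"
    return "keep working"

print(session_action(90_000))   # 45% → keep working
print(session_action(130_000))  # 65% → wrap up current task
print(session_action(160_000))  # 80% → close and restart
```

The key design choice: act on utilization you read yourself from /context, not on any single UI’s percentage, since the three displays disagree.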
10. The Real Reason Fresh Sessions Feel Like Magic
Every developer has experienced this: you’re stuck in a marathon session, Claude is being impossible, nothing works. You close everything, open a fresh session, paste the same prompt — and Claude nails it on the first try.
That’s not luck. That’s not “the model was having a bad day.” That’s the cache.
Fresh session = zero cached behavioral patterns = the model actually processes your instruction instead of replaying 90M tokens of learned shortcuts.
This is why Claude + Codex feels magical. This is why “just start over” always works. This is why Pro users ($20/month) love Claude Code while Max users ($200/month) feel scammed — Pro’s rate limits force short sessions, which accidentally prevent cache poisoning.
The ultimate fix: Treat every session like a disposable container. Plan in one session. Implement in another. Review in a third. Never let a single session accumulate hundreds of turns. The model is brilliant when it starts fresh. The cache is what makes it forget.
The Cheat Sheet
| What They Tell You | The Real Story |
|---|---|
| “Claude + Codex is the dream combo” | Fresh context is the dream — Codex just happens to provide it |
| “Keep CLAUDE.md detailed and growing” | Every token costs cache reads per turn, research says it hurts performance |
| “1M context for complex work” | My 618-turn session only hit 493K, costing $0.25/turn in cache read |
| “Add rules when Claude makes mistakes” | More rules = more cache = more cost = still ignored |
| “Start sessions from your phone” | Overnight cache expiry = 125% rebuild on first message |
| “5 parallel sessions for productivity” | 5x the cache overhead on your quota |
| “Auto-compact handles long sessions” | Compaction is lossy + triggers 125% rebuilds + proven to cause 100% hallucination rate |
| “20x more usage than Pro” | Cache overhead eats 99.93% of quota |
| “Opus 4.6 is the best coding model” | It absolutely is — until cache degrades it after 200 turns |
| “The model is hallucinating” | No — it’s cache behavioral inertia, not hallucination |
One More Thing
Every number in this post was verified using ccusage_go. It parses your local JSONL session logs and shows you exactly where your tokens go — including the Cache Create / Cache Read / API Cost breakdown that Anthropic doesn’t show you.
```shell
git clone https://github.com/SDpower/ccusage_go.git
cd ccusage_go && make build
./ccusage_go blocks   # See real costs per 5-hour block
./ccusage_go session  # Per-session with subagent detail
```
The truth is in your ~/.claude/projects/ folder. Go look.
Full technical analysis: The Cache Trap
SteveLo is a full-stack systems engineer based in Taiwan with expertise spanning RF/SDR engineering, FPGA development, embedded systems (IoT/MCU), high-performance computing, and software architecture. He maintains ccusage_go on GitHub.