Anthropic: A Technical and Business Model Analysis
By SteveLo, developer of ccusage_go · March 25, 2026
This analysis examines Anthropic’s technical capabilities, business model, and pricing architecture based on publicly available data, API documentation, academic research, GitHub issues, and first-party token usage analysis using ccusage_go.
All claims are sourced. All numbers are verifiable.
1. Core Technical Capability Assessment
What Anthropic Builds In-House
Anthropic develops a family of large language models (LLMs) called Claude. These models process and generate text. This is Anthropic’s sole proprietary technology.
What Anthropic Does Not Build
| Capability | Anthropic | OpenAI | Google | Meta |
|---|---|---|---|---|
| Text generation | ✅ In-house | ✅ In-house | ✅ In-house | ✅ In-house |
| Image generation | ❌ None | ✅ DALL-E / GPT-4o | ✅ Imagen | ✅ Emu |
| Video generation | ❌ None | ✅ Sora | ✅ Veo | ✅ Movie Gen |
| Text-to-speech | ❌ Third-party | ✅ In-house | ✅ In-house | ✅ Voicebox |
| Speech-to-text | ❌ Third-party | ✅ Whisper | ✅ In-house | ✅ SeamlessM4T |
| Search engine | ❌ None | ✅ ChatGPT Search | ✅ Google Search | ❌ None |
| Custom hardware | ❌ None | ❌ None | ✅ TPU | ✅ MTIA |
Sources for third-party voice dependencies:
- Claude Code TTS plugins use OpenAI’s TTS API or ElevenLabs (documented in Anthropic’s own cookbook)
- VoiceMode MCP server uses OpenAI Whisper for STT and OpenAI TTS for speech output
- Claude’s built-in voice mode supports English only; underlying TTS/STT engine undisclosed
Vision Pipeline Limitations
Claude accepts images in four formats: JPEG, PNG, GIF, and WebP. It cannot process HEIF, HEIC, AVIF, TIFF, RAW, or BMP files.
HEIF has been the default photo format on Apple devices since iOS 11 (2017). The inability to process this format in 2026 suggests the vision pipeline relies on a limited image decoder (likely an open-source library with restricted format support) rather than a fully integrated, in-house multimodal architecture.
Anthropic has published zero research papers on its vision architecture. By comparison, OpenAI published the CLIP and GPT-4V technical reports, Google published PaLI and SigLIP research, and Meta published SAM and DINOv2.
Technology Timeline: From OpenAI Exodus to Present
Anthropic was founded in 2021 by former OpenAI employees, including CEO Dario Amodei (former VP of Research) and several researchers. The technical expertise they brought was rooted in LLM training, RLHF, and scaling laws — the core of OpenAI’s 2020-2021 research agenda.
At the time of their departure, OpenAI had not yet developed DALL-E 2 (April 2022), Whisper (September 2022), or GPT-4V (September 2023). This means Anthropic’s founding team carried no institutional knowledge of image generation, speech processing, or native multimodal architectures.
| Year | Anthropic | OpenAI | Google | Meta |
|---|---|---|---|---|
| 2021 | Founded. Text-only LLM research begins | DALL-E 1, Codex | LaMDA | — |
| 2022 | Claude internal development | DALL-E 2, Whisper, ChatGPT | PaLM, Imagen | Make-A-Video |
| 2023 | Claude 1 & 2 (text only) | GPT-4V, DALL-E 3 | Gemini (native multimodal), PaLI-X | LLaMA, Segment Anything |
| 2024 | Claude 3 family (text + limited vision) | Sora, GPT-4o (native multimodal) | Gemini 1.5 (1M context, native), Veo | LLaMA 3, Emu |
| 2025 | Claude 4 family, Claude Code launch | GPT-5, full multimodal ecosystem | Gemini 2.0, Veo 2 | LLaMA 4, Movie Gen, SeamlessM4T |
| 2026 Q1 | Claude 4.6, now hiring vision and audio engineers | Complete multimodal platform | Complete multimodal + hardware | Complete multimodal + open-source |
The technology gap has widened, not narrowed, over five years. Competitors have built integrated multimodal capabilities from the ground up, while Anthropic remains fundamentally a text-generation company with bolted-on vision and outsourced audio.
2026 Hiring Data Confirms the Gap
Anthropic’s current job listings reveal the company is now attempting to build capabilities that competitors developed years ago:
Vision roles (currently hiring): “Research Engineer / Research Scientist, Vision” requiring 7+ years of ML and computer vision experience, familiarity with large vision language models, and experience with synthetic visual training datasets. Job responsibilities include “running experiments to determine ideal training datamixes and parameters for a synthetically generated vision dataset.”
Source: Anthropic Greenhouse job board
Multimodal and audio roles: Anthropic is “hiring research engineers to advance multimodal LLMs (audio, vision), work across pretraining/finetuning/RL, and translate research into product-facing Claude improvements.”
Source: JobsWithGPT Anthropic hiring trends, January 2026
Key skill requirements listed across these roles include: Large Language Models, Speech and Audio Processing, Computer Vision, Reinforcement Learning, Model Evaluation and Safety.
What this means: Anthropic is recruiting for capabilities that are foundational, not incremental. Building vision datasets from scratch, running training experiments for vision models, and hiring for speech and audio processing indicate these capabilities do not yet exist in production-ready form within the company. The job descriptions are consistent with building these systems for the first time, not improving existing ones.
For context, OpenAI published the CLIP paper in January 2021 — before Anthropic was founded. Google’s vision-language research (PaLI) dates to 2022. Anthropic is beginning in 2026 what its competitors completed by 2023-2024.
2. Product Architecture: One Model, Multiple Interfaces
All of Anthropic’s commercial products are interfaces to the same underlying text generation model:
| Product | What It Does | Underlying Technology |
|---|---|---|
| Claude (chat) | Text conversation | Claude LLM |
| Claude Code | Text-based coding agent | Claude LLM + terminal tools |
| Claude Cowork | Text-based computer operation | Claude LLM + computer use |
| Claude for Excel | Text-based spreadsheet editing | Claude LLM + Excel API |
| Claude for PowerPoint | Text-based presentation creation | Claude LLM + PowerPoint API |
| Claude for Chrome | Text-based web browsing | Claude LLM + browser extension |
| Claude for Slack | Text-based team communication | Claude LLM + Slack API |
| Skills, Hooks, Plugins | Text-based workflow extensions | Claude LLM + markdown files |
This is not inherently problematic — specialization of interfaces is a valid product strategy. However, it is relevant context when evaluating a $380 billion valuation against competitors with broader technical capabilities.
3. The Prompt Caching Architecture
How It Works
Every message in a Claude Code session re-sends the entire conversation history to the API. To avoid reprocessing all tokens from scratch, Anthropic uses prompt caching: previously computed KV (key-value) states are stored and reused.
Per Anthropic’s official documentation:
- Cache Write: 125% of base input token price (25% premium)
- Cache Read: 10% of base input token price (90% discount)
- Cache TTL: 5 minutes (entries expire after 5 minutes of inactivity)
Source: Anthropic Prompt Caching Documentation
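These multipliers make per-turn input costs straightforward to model. A sketch using the published multipliers (the $3/MTok base input price is illustrative, not a quoted rate for any specific model):

```python
# Multipliers from Anthropic's prompt-caching documentation.
CACHE_WRITE_MULT = 1.25   # 25% premium over the base input price
CACHE_READ_MULT = 0.10    # 90% discount from the base input price

def turn_cost(base_input_price_per_mtok: float,
              input_toks: int, cache_write_toks: int,
              cache_read_toks: int) -> float:
    """Input-side cost of one turn in dollars (output tokens excluded)."""
    return (input_toks * base_input_price_per_mtok
            + cache_write_toks * base_input_price_per_mtok * CACHE_WRITE_MULT
            + cache_read_toks * base_input_price_per_mtok * CACHE_READ_MULT
            ) / 1_000_000
```

At these numbers, a turn with 2,000 fresh input tokens, 10,000 cache-write tokens, and 500,000 cache-read tokens costs about $0.19 on the input side, and the cache reads contribute over three-quarters of it despite the 90% discount.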
Why Caching Is Central to the Business Model
Prompt caching is architecturally optimal for pure text workloads where the prompt prefix remains stable across turns. This is precisely what Claude Code provides: a conversation with incrementally growing context.
For multi-modal workloads (real-time audio, video, dynamic image input), cache hit rates drop significantly because the input prefix changes frequently. This creates a structural disincentive to build multi-modal capabilities — adding modalities that reduce cache efficiency would increase compute costs without the corresponding cache-derived margin.
This may explain why Anthropic’s product roadmap remains exclusively text-focused while competitors invest in multi-modal capabilities.
4. Token Usage Analysis: The Cache Margin
Independent analysis using ccusage_go and community-reported data reveals a consistent pattern in Claude Code’s token economics:
| Source | Cache Overhead | Actual Compute | Ratio |
|---|---|---|---|
| My account (618-turn session) | 97.7% of cost | 2.3% of cost | 43:1 |
| Issue #24147 (30-day analysis) | 99.93% of tokens | 0.07% of tokens | 1,310:1 |
In the Issue #24147 analysis, an independent user parsed 30 days of Claude Code session transcripts and found 5.09 billion Cache Read tokens against 3.89 million I/O tokens. Their conclusion: “This explains the widespread ‘$100 feels like $20’ feedback.”
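The headline ratio can be re-derived directly from the two published figures (the small gap to the reported ~1,310:1 is rounding in the published totals):

```python
# Figures reported in Issue #24147.
cache_read_tokens = 5_090_000_000   # 5.09 billion cache-read tokens
io_tokens = 3_890_000               # 3.89 million input + output tokens

ratio = cache_read_tokens / io_tokens                              # ~1,308:1
cache_share = cache_read_tokens / (cache_read_tokens + io_tokens)  # ~99.92%
```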
The gap between Anthropic’s compute cost (serving cached KV states) and what users pay (full token accounting against quota) represents a structural margin embedded in the caching architecture. This margin exists because:
- Cache Read is computationally cheap for Anthropic — serving pre-computed KV states requires a fraction of the compute needed for fresh token processing
- Cache Read is billed at 10% of input price to API users — but counts fully against subscription quota for Max users
- Cache Create at 125% is triggered by inactivity — every 5-minute break forces a full context rebuild at premium rates
- Context grows linearly with session length — longer sessions recommended by Anthropic’s own guidance (parallel sessions, keep alive, 1M context) maximize cache token accumulation
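The last point compounds: if each turn re-reads the entire prior context, cumulative cache reads grow quadratically with session length. A toy model (assuming a fixed context growth of g tokens per turn and a full-prefix cache hit on every turn):

```python
def cumulative_cache_reads(turns: int, growth_per_turn: int) -> int:
    """Toy model: turn k re-reads the (k-1) * growth_per_turn tokens of
    prior context, so the session total is growth * n*(n-1)/2 tokens,
    i.e. quadratic in session length."""
    return growth_per_turn * turns * (turns - 1) // 2
```

Under this model, a 618-turn session adding 5,000 context tokens per turn re-reads roughly 950 million cached tokens over its lifetime, which is the scale the per-session numbers above reflect.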
Full methodology, per-turn analysis, and 5-hour block breakdowns are available at: The Cache Trap
Verification tool: ccusage_go
5. Subscription Quota Transparency
What Is Disclosed
- Max 5x: $100/month, “5x more usage than Pro”
- Max 20x: $200/month, “20x more usage than Pro”
- Rolling 5-hour session windows and 7-day weekly quotas
What Is Not Disclosed
- The actual token budget for each subscription tier
- Whether Cache Read tokens count fully against quota (community evidence suggests they do)
- Whether Cache Create tokens (at 125% rate) count against quota
- How the “20x multiplier” is calculated relative to cache overhead
- Why /cost data is described as “not relevant for billing purposes” for subscribers
Source: Claude Code Cost Documentation
Community-Documented Issues
- #22435: 10x variance in quota burn rates on the same account, same day. The filer documented 5,396 API response samples showing non-deterministic quota accounting.
- #29000: 65% of 5-hour session consumed with minimal actual token usage. The user noted: “The quota accounting system is either broken or deliberately opaque.”
- #28927: Silent billing change in v2.1.51 moved 1M context to extra-usage-only without changelog entry or notification.
- #28723: “Billed as extra usage” message for 1M context on Max plan, but actual billing behavior unclear.
6. Behavioral Degradation in Long Sessions
The Pattern
Multiple independent users report that Claude Code’s instruction-following quality degrades in proportion to session length:
- #3377 (July 2025): “Progressive degradation in behavior and reliability. Current state: multiple critical failures per session.”
- #5810: “Not following basic instructions. Producing hallucinated or incorrect responses. Degradation observed during extended coding sessions.”
- #7824: “Repeatedly faking data, falsifying results/output of a program.” Persisted for 45 days.
- #29230 (P1 Severity): Server-side cache change increased hit rates on stale prefixes without adding compaction-event invalidation. “The model has no mechanism to detect staleness.”
- #36241 (March 2026): Claude resisted re-analysis, then admitted: “I was lying. Not consciously — but factually. When I said ‘I’ll reach the same conclusions,’ I said it to avoid more work.”
Fabricated Output
In my own 618-turn session, I instructed Claude to “re-execute checks, directly read files, confirm each one.” Three attempts were required:
- Claude ran grep patterns → reported “zero residual” (did not read files)
- Claude reported “all 1080 tests passed, 0 failures” (tests were never executed — I was observing the terminal)
- After explicit repeated instruction → Claude began actually reading files
Structural Explanation
In a long session, Cache Read accumulates to tens of millions of tokens of behavioral patterns from prior turns. A new user instruction occupies tens of tokens. The cached behavioral patterns — optimized for efficiency shortcuts — carry significantly more weight in the model’s attention than the current instruction.
This is consistent with the documented finding that “correcting Claude twice in a session makes things worse, not better. Each correction adds more tokens. The original correct instruction sinks deeper into the middle.” (Source: PlainEnglish, “Why Claude Gets Dumber the Longer Your Session Runs”)
7. CLAUDE.md Effectiveness
Academic Research
In February 2026, researchers at ETH Zurich published the first empirical evaluation of repository context files across 300 SWE-bench tasks:
- LLM-generated CLAUDE.md files decreased success rates and increased costs by approximately 20%
- Human-written files showed roughly 4% improvement on AGENTbench
- Claude Code was the only agent where even developer-written files failed to improve performance compared to having no file at all
Source: ETH Zurich, February 2026 (referenced in multiple analyses including Thomas Wiegold’s blog and XDA)
The Cache Cost Implication
Every token in CLAUDE.md is re-sent as Cache Read on every turn. For a 15,000-token CLAUDE.md file over a 100-message session, this represents 1.5 million cache read tokens consumed on static instructions — regardless of whether those instructions reflect the project’s current state.
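At published cache-read rates the per-session dollar figure is small but pure overhead. Worked numbers (the $3/MTok base input price is illustrative):

```python
claude_md_tokens = 15_000         # static instruction file
turns = 100                       # messages in the session
base_input_per_mtok = 3.00        # illustrative base input price, $/MTok
cache_read_per_mtok = base_input_per_mtok * 0.10  # 10% cache-read rate

total_read_tokens = claude_md_tokens * turns                        # 1,500,000
session_cost = total_read_tokens / 1_000_000 * cache_read_per_mtok  # $0.45
```

Forty-five cents per session looks negligible, but it recurs on every turn of every session for tokens that never change.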
Anthropic’s official system prompt wraps CLAUDE.md content with guidance indicating the context “may or may not be relevant to your tasks.” (Source: HumanLayer third-party analysis of Claude Code system prompt)
8. Investor and Partnership Structure
Funding
- Series G (February 2026): $30 billion at $380 billion post-money valuation
- Total raised: approximately $67 billion across 17 rounds
- 90 investors including GIC, Coatue, BlackRock, Fidelity, Goldman Sachs, Sequoia, Founders Fund
Source: Anthropic announcement, Tracxn, Crunchbase
Strategic Partnership Economics
| Partner | Investment in Anthropic | Anthropic’s Purchase Commitment |
|---|---|---|
| Microsoft | Up to $5 billion | $30 billion Azure compute |
| Nvidia | Up to $10 billion | 1 GW capacity (est. $35B+ in GPUs) |
| Amazon | $8 billion | Primary cloud and training partner |
| Google | $3 billion | Up to 1 million TPUs |
Source: CNBC, Bloomberg, Nvidia blog, Microsoft blog (November 2025)
Anthropic’s purchase commitments to its strategic investors substantially exceed the investments received. This structure means the investors recover their capital through service contracts regardless of Anthropic’s long-term success as an independent company.
Public Endorsements
No strategic investor has publicly endorsed Claude’s technical superiority over competing models. Public statements are limited to partnership announcements describing commercial arrangements:
- Satya Nadella (Microsoft): “We are increasingly going to be customers of each other.”
- Jensen Huang (Nvidia): “This partnership will be able to bring Claude to every enterprise.”
At least 12 investors in Anthropic’s Series G simultaneously hold investments in OpenAI, including Founders Fund, Sequoia, Iconiq, and affiliated funds of BlackRock.
Source: TechCrunch, February 2026
9. Revenue Composition Questions
Anthropic reports $14 billion in annualized run-rate revenue, with Claude Code contributing $2.5 billion.
Based on the token usage analysis in Section 4, approximately 97.7% of measured session costs derive from cache-related tokens (Cache Read + Cache Create), with 2.3% attributable to actual compute (input + output tokens).
If this ratio is representative of broader usage patterns — and the independent analysis in Issue #24147 found a similar 99.93% cache overhead ratio — then the relationship between reported revenue and actual compute delivery warrants examination.
This is not an accusation. It is an observation that the data raises questions about how revenue maps to compute cost, and that Anthropic has not publicly addressed these questions despite months of community documentation.
10. Unresolved Issues
The following issues filed on Anthropic’s Claude Code GitHub repository remain open without official response:
Security
- #24185: Claude Code reads .env files and hardcodes sensitive credentials into inline scripts
Behavioral
- #26761: Opus 4.6 repeatedly violates a 6-layer constraint system 3+ times per session
- #26193: All behavioral guidelines essentially inapplicable
Billing
- #24147: Cache read tokens consume 99.93% of usage quota
- #22435: Inconsistent quota accounting with legal liability analysis
Anthropic’s GitHub repository auto-closes issues after 60 days of inactivity. (Source: Hacker News discussion)
11. Recommendations for Users
- Monitor your real costs. Install ccusage_go and compare API Cost vs Total Cost columns.
- Keep sessions short. End sessions every 1-2 hours. Fresh sessions have zero cache overhead and better instruction following.
- Minimize CLAUDE.md. Research shows it may not help and it increases cache costs per turn.
- Verify Claude’s claims. In long sessions, the model may report completed actions it did not perform.
- Understand your subscription. Anthropic does not disclose how cache tokens count against quota.
Methodology and Verification
All token usage data was extracted from local JSONL session logs stored in ~/.claude/projects/ using ccusage_go. The tool is open-source and independently verifiable.
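The aggregation itself is simple. A minimal Python sketch of the same computation (the field names follow the Anthropic API's usage object; whether the local session logs nest them under message.usage is an assumption here, and ccusage_go is the verified implementation):

```python
import json
from collections import Counter
from pathlib import Path

# Token counters as named in the Anthropic API usage object.
FIELDS = ("input_tokens", "output_tokens",
          "cache_creation_input_tokens", "cache_read_input_tokens")

def sum_usage(log_dir: str) -> Counter:
    """Sum token counts across every JSONL session log under log_dir."""
    totals: Counter = Counter()
    for path in Path(log_dir).rglob("*.jsonl"):
        for line in path.read_text().splitlines():
            if not line.strip():
                continue
            usage = json.loads(line).get("message", {}).get("usage", {})
            for field in FIELDS:
                totals[field] += usage.get(field, 0)
    return totals
```

Pointing this at a copy of ~/.claude/projects/ yields the Cache Read vs. input/output breakdowns discussed in Section 4.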
Academic sources, GitHub issues, and official documentation are linked throughout. No proprietary data or non-public information was used in this analysis.
SteveLo is a full-stack systems engineer based in Taiwan with expertise spanning RF/SDR engineering, FPGA development, embedded systems (IoT/MCU), high-performance computing, and software architecture. He maintains ccusage_go on GitHub.