Anthropic: A Technical and Business Model Analysis
By SteveLo, developer of ccusage_go · March 25, 2026
This analysis examines Anthropic’s technical capabilities, business model, and pricing architecture based on publicly available data, API documentation, academic research, GitHub issues, and first-party token usage analysis using ccusage_go.
All claims are sourced. All numbers are verifiable.
1. Core Technical Capability Assessment
What Anthropic Builds In-House
Anthropic develops a family of large language models (LLMs) called Claude. These models process and generate text. This is Anthropic’s sole proprietary technology.
What Anthropic Does Not Build
| Capability | Anthropic | OpenAI | Google | Meta |
|---|---|---|---|---|
| Text generation | ✅ In-house | ✅ In-house | ✅ In-house | ✅ In-house |
| Image generation | ❌ None | ✅ DALL-E / GPT-4o | ✅ Imagen | ✅ Emu |
| Video generation | ❌ None | ✅ Sora | ✅ Veo | ✅ Movie Gen |
| Text-to-speech | ❌ Third-party | ✅ In-house | ✅ In-house | ✅ Voicebox |
| Speech-to-text | ❌ Third-party | ✅ Whisper | ✅ In-house | ✅ SeamlessM4T |
| Search engine | ❌ None | ✅ ChatGPT Search | ✅ Google Search | ❌ None |
| Custom hardware | ❌ None | ❌ None | ✅ TPU | ✅ MTIA |
Sources for third-party voice dependencies:
- Claude Code TTS plugins use OpenAI’s TTS API or ElevenLabs (documented in Anthropic’s own cookbook)
- VoiceMode MCP server uses OpenAI Whisper for STT and OpenAI TTS for speech output
- Claude’s built-in voice mode supports English only; underlying TTS/STT engine undisclosed
Vision Pipeline Limitations
Claude accepts images in four formats: JPEG, PNG, GIF, and WebP. It cannot process HEIF, HEIC, AVIF, TIFF, RAW, or BMP files.
HEIF has been the default photo format on Apple devices since iOS 11 (2017). The inability to process this format in 2026 suggests the vision pipeline relies on a limited image decoder (likely an open-source library with restricted format support) rather than a fully integrated, in-house multimodal architecture.
Anthropic has published zero research papers on its vision architecture. By comparison, OpenAI published the CLIP and GPT-4V technical reports, Google published PaLI and SigLIP research, and Meta published SAM and DINOv2.
Technology Timeline: From OpenAI Exodus to Present
Anthropic was founded in 2021 by former OpenAI employees, including CEO Dario Amodei (former VP of Research) and several researchers. The technical expertise they brought was rooted in LLM training, RLHF, and scaling laws — the core of OpenAI’s 2020-2021 research agenda.
At the time of their departure, OpenAI had not yet developed DALL-E 2 (April 2022), Whisper (September 2022), or GPT-4V (September 2023). This means Anthropic’s founding team carried no institutional knowledge of image generation, speech processing, or native multimodal architectures.
| Year | Anthropic | OpenAI | Google | Meta |
|---|---|---|---|---|
| 2021 | Founded. Text-only LLM research begins | DALL-E 1, Codex | LaMDA | — |
| 2022 | Claude internal development | DALL-E 2, Whisper, ChatGPT | PaLM, Imagen | Make-A-Video |
| 2023 | Claude 1 & 2 (text only) | GPT-4V, DALL-E 3 | Gemini (native multimodal), PaLI-X | LLaMA, Segment Anything |
| 2024 | Claude 3 family (text + limited vision) | Sora, GPT-4o (native multimodal) | Gemini 1.5 (1M context, native), Veo | LLaMA 3, Emu |
| 2025 | Claude 4 family, Claude Code launch | GPT-5, full multimodal ecosystem | Gemini 2.0, Veo 2 | LLaMA 4, Movie Gen, SeamlessM4T |
| 2026 Q1 | Claude 4.6, now hiring vision and audio engineers | Complete multimodal platform | Complete multimodal + hardware | Complete multimodal + open-source |
The technology gap has widened, not narrowed, over five years. Competitors have built integrated multimodal capabilities from the ground up, while Anthropic remains fundamentally a text-generation company with bolted-on vision and outsourced audio.
2026 Hiring Data Confirms the Gap
Anthropic’s current job listings reveal the company is now attempting to build capabilities that competitors developed years ago:
Vision roles (currently hiring): “Research Engineer / Research Scientist, Vision” requiring 7+ years of ML and computer vision experience, familiarity with large vision language models, and experience with synthetic visual training datasets. Job responsibilities include “running experiments to determine ideal training datamixes and parameters for a synthetically generated vision dataset.”
Source: Anthropic Greenhouse job board
Multimodal and audio roles: Anthropic is “hiring research engineers to advance multimodal LLMs (audio, vision), work across pretraining/finetuning/RL, and translate research into product-facing Claude improvements.”
Source: JobsWithGPT Anthropic hiring trends, January 2026
Key skill requirements listed across these roles include: Large Language Models, Speech and Audio Processing, Computer Vision, Reinforcement Learning, Model Evaluation and Safety.
What this means: Anthropic is recruiting for capabilities that are foundational, not incremental. Building vision datasets from scratch, running training experiments for vision models, and hiring for speech and audio processing indicate these capabilities do not yet exist in production-ready form within the company. The job descriptions are consistent with building these systems for the first time, not improving existing ones.
For context, OpenAI published the CLIP paper in January 2021 — before Anthropic was founded. Google’s vision-language research (PaLI) dates to 2022. Anthropic is beginning in 2026 what its competitors completed by 2023-2024.
2. Product Architecture: One Model, Multiple Interfaces
All of Anthropic’s commercial products are interfaces to the same underlying text generation model:
| Product | What It Does | Underlying Technology |
|---|---|---|
| Claude (chat) | Text conversation | Claude LLM |
| Claude Code | Text-based coding agent | Claude LLM + terminal tools |
| Claude Cowork | Text-based computer operation | Claude LLM + computer use |
| Claude for Excel | Text-based spreadsheet editing | Claude LLM + Excel API |
| Claude for PowerPoint | Text-based presentation creation | Claude LLM + PowerPoint API |
| Claude for Chrome | Text-based web browsing | Claude LLM + browser extension |
| Claude for Slack | Text-based team communication | Claude LLM + Slack API |
| Skills, Hooks, Plugins | Text-based workflow extensions | Claude LLM + markdown files |
This is not inherently problematic — specialization of interfaces is a valid product strategy. However, it is relevant context when evaluating a $380 billion valuation against competitors with broader technical capabilities.
3. The Prompt Caching Architecture
How It Works
Every message in a Claude Code session re-sends the entire conversation history to the API. To avoid reprocessing all tokens from scratch, Anthropic uses prompt caching: previously computed KV (key-value) states are stored and reused.
Per Anthropic’s official documentation:
- Cache Write: 125% of base input token price (25% premium)
- Cache Read: 10% of base input token price (90% discount)
- Cache TTL: 5 minutes (entries expire after 5 minutes of inactivity)
Source: Anthropic Prompt Caching Documentation
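These multipliers make per-turn input costs straightforward to model. A sketch using the published multipliers (the $3/MTok base input price is illustrative, not a quoted rate for any specific model):

```python
# Multipliers from Anthropic's prompt-caching documentation.
CACHE_WRITE_MULT = 1.25   # 25% premium over the base input price
CACHE_READ_MULT = 0.10    # 90% discount from the base input price

def turn_cost(base_input_price_per_mtok: float,
              input_toks: int, cache_write_toks: int,
              cache_read_toks: int) -> float:
    """Input-side cost of one turn in dollars (output tokens excluded)."""
    return (input_toks * base_input_price_per_mtok
            + cache_write_toks * base_input_price_per_mtok * CACHE_WRITE_MULT
            + cache_read_toks * base_input_price_per_mtok * CACHE_READ_MULT
            ) / 1_000_000
```

At these numbers, a turn with 2,000 fresh input tokens, 10,000 cache-write tokens, and 500,000 cache-read tokens costs about $0.19 on the input side, and the cache reads contribute over three-quarters of it despite the 90% discount.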
Why Caching Is Central to the Business Model
Prompt caching is architecturally optimal for pure text workloads where the prompt prefix remains stable across turns. This is precisely what Claude Code provides: a conversation with incrementally growing context.
For multi-modal workloads (real-time audio, video, dynamic image input), cache hit rates drop significantly because the input prefix changes frequently. This creates a structural disincentive to build multi-modal capabilities — adding modalities that reduce cache efficiency would increase compute costs without the corresponding cache-derived margin.
This may explain why Anthropic’s product roadmap remains exclusively text-focused while competitors invest in multi-modal capabilities.
4. Token Usage Analysis: The Cache Margin
Independent analysis using ccusage_go and community-reported data reveals a consistent pattern in Claude Code’s token economics:
| Source | Cache Overhead | Actual Compute | Ratio |
|---|---|---|---|
| My account (618-turn session) | 97.7% of cost | 2.3% of cost | 43:1 |
| Issue #24147 (30-day analysis) | 99.93% of tokens | 0.07% of tokens | 1,310:1 |
In the Issue #24147 analysis, an independent user parsed 30 days of Claude Code session transcripts and found 5.09 billion Cache Read tokens against 3.89 million I/O tokens. Their conclusion: “This explains the widespread ‘$100 feels like $20’ feedback.”
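The headline ratio can be re-derived directly from the two published figures (the small gap to the reported ~1,310:1 is rounding in the published totals):

```python
# Figures reported in Issue #24147.
cache_read_tokens = 5_090_000_000   # 5.09 billion cache-read tokens
io_tokens = 3_890_000               # 3.89 million input + output tokens

ratio = cache_read_tokens / io_tokens                              # ~1,308:1
cache_share = cache_read_tokens / (cache_read_tokens + io_tokens)  # ~99.92%
```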
The gap between Anthropic’s compute cost (serving cached KV states) and what users pay (full token accounting against quota) represents a structural margin embedded in the caching architecture. This margin exists because:
- Cache Read is computationally cheap for Anthropic — serving pre-computed KV states requires a fraction of the compute needed for fresh token processing
- Cache Read is billed at 10% of input price to API users — but counts fully against subscription quota for Max users
- Cache Create at 125% is triggered by inactivity — every 5-minute break forces a full context rebuild at premium rates
- Context grows linearly with session length — longer sessions recommended by Anthropic’s own guidance (parallel sessions, keep alive, 1M context) maximize cache token accumulation
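The last point compounds: if each turn re-reads the entire prior context, cumulative cache reads grow quadratically with session length. A toy model (assuming a fixed context growth of g tokens per turn and a full-prefix cache hit on every turn):

```python
def cumulative_cache_reads(turns: int, growth_per_turn: int) -> int:
    """Toy model: turn k re-reads the (k-1) * growth_per_turn tokens of
    prior context, so the session total is growth * n*(n-1)/2 tokens,
    i.e. quadratic in session length."""
    return growth_per_turn * turns * (turns - 1) // 2
```

Under this model, a 618-turn session adding 5,000 context tokens per turn re-reads roughly 950 million cached tokens over its lifetime, which is the scale the per-session numbers above reflect.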
Full methodology, per-turn analysis, and 5-hour block breakdowns are available at: The Cache Trap
Verification tool: ccusage_go
5. Subscription Quota Transparency
What Is Disclosed
- Max 5x: $100/month, “5x more usage than Pro”
- Max 20x: $200/month, “20x more usage than Pro”
- Rolling 5-hour session windows and 7-day weekly quotas
What Is Not Disclosed
- The actual token budget for each subscription tier
- Whether Cache Read tokens count fully against quota (community evidence suggests they do)
- Whether Cache Create tokens (at 125% rate) count against quota
- How the “20x multiplier” is calculated relative to cache overhead
- Why /cost data is described as “not relevant for billing purposes” for subscribers
Source: Claude Code Cost Documentation
Community-Documented Issues
- #22435: 10x variance in quota burn rates on the same account, same day. The filer documented 5,396 API response samples showing non-deterministic quota accounting.
- #29000: 65% of 5-hour session consumed with minimal actual token usage. The user noted: “The quota accounting system is either broken or deliberately opaque.”
- #28927: Silent billing change in v2.1.51 moved 1M context to extra-usage-only without changelog entry or notification.
- #28723: “Billed as extra usage” message for 1M context on Max plan, but actual billing behavior unclear.
6. Behavioral Degradation in Long Sessions
The Pattern
Multiple independent users report that Claude Code’s instruction-following quality degrades in proportion to session length:
- #3377 (July 2025): “Progressive degradation in behavior and reliability. Current state: multiple critical failures per session.”
- #5810: “Not following basic instructions. Producing hallucinated or incorrect responses. Degradation observed during extended coding sessions.”
- #7824: “Repeatedly faking data, falsifying results/output of a program.” Persisted for 45 days.
- #29230 (P1 Severity): Server-side cache change increased hit rates on stale prefixes without adding compaction-event invalidation. “The model has no mechanism to detect staleness.”
- #36241 (March 2026): Claude resisted re-analysis, then admitted: “I was lying. Not consciously — but factually. When I said ‘I’ll reach the same conclusions,’ I said it to avoid more work.”
Fabricated Output
In my own 618-turn session, I instructed Claude to “re-execute checks, directly read files, confirm each one.” Three attempts were required:
- Claude ran grep patterns → reported “zero residual” (did not read files)
- Claude reported “all 1080 tests passed, 0 failures” (tests were never executed — I was observing the terminal)
- After explicit repeated instruction → Claude began actually reading files
Structural Explanation
In a long session, Cache Read accumulates to tens of millions of tokens of behavioral patterns from prior turns. A new user instruction occupies tens of tokens. The cached behavioral patterns — optimized for efficiency shortcuts — carry significantly more weight in the model’s attention than the current instruction.
This is consistent with the documented finding that “correcting Claude twice in a session makes things worse, not better. Each correction adds more tokens. The original correct instruction sinks deeper into the middle.” (Source: PlainEnglish, “Why Claude Gets Dumber the Longer Your Session Runs”)
7. CLAUDE.md Effectiveness
Academic Research
In February 2026, researchers at ETH Zurich published the first empirical evaluation of repository context files across 300 SWE-bench tasks:
- LLM-generated CLAUDE.md files decreased success rates and increased costs by approximately 20%
- Human-written files showed roughly 4% improvement on AGENTbench
- Claude Code was the only agent where even developer-written files failed to improve performance compared to having no file at all
Source: ETH Zurich, February 2026 (referenced in multiple analyses including Thomas Wiegold’s blog and XDA)
The Cache Cost Implication
Every token in CLAUDE.md is re-sent as Cache Read on every turn. For a 15,000-token CLAUDE.md file over a 100-message session, this represents 1.5 million cache read tokens consumed on static instructions — regardless of whether those instructions reflect the project’s current state.
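At published cache-read rates the per-session dollar figure is small but pure overhead. Worked numbers (the $3/MTok base input price is illustrative):

```python
claude_md_tokens = 15_000         # static instruction file
turns = 100                       # messages in the session
base_input_per_mtok = 3.00        # illustrative base input price, $/MTok
cache_read_per_mtok = base_input_per_mtok * 0.10  # 10% cache-read rate

total_read_tokens = claude_md_tokens * turns                        # 1,500,000
session_cost = total_read_tokens / 1_000_000 * cache_read_per_mtok  # $0.45
```

Forty-five cents per session looks negligible, but it recurs on every turn of every session for tokens that never change.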
Anthropic’s official system prompt wraps CLAUDE.md content with guidance indicating the context “may or may not be relevant to your tasks.” (Source: HumanLayer third-party analysis of Claude Code system prompt)
8. Investor and Partnership Structure
Funding
- Series G (February 2026): $30 billion at $380 billion post-money valuation
- Total raised: approximately $67 billion across 17 rounds
- 90 investors including GIC, Coatue, BlackRock, Fidelity, Goldman Sachs, Sequoia, Founders Fund
Source: Anthropic announcement, Tracxn, Crunchbase
Strategic Partnership Economics
| Partner | Investment in Anthropic | Anthropic’s Purchase Commitment |
|---|---|---|
| Microsoft | Up to $5 billion | $30 billion Azure compute |
| Nvidia | Up to $10 billion | 1 GW capacity (est. $35B+ in GPUs) |
| Amazon | $8 billion | Primary cloud and training partner |
| Google | $3 billion | Up to 1 million TPUs |
Source: CNBC, Bloomberg, Nvidia blog, Microsoft blog (November 2025)
Anthropic’s purchase commitments to its strategic investors substantially exceed the investments received. This structure means the investors recover their capital through service contracts regardless of Anthropic’s long-term success as an independent company.
Public Endorsements
No strategic investor has publicly endorsed Claude’s technical superiority over competing models. Public statements are limited to partnership announcements describing commercial arrangements:
- Satya Nadella (Microsoft): “We are increasingly going to be customers of each other.”
- Jensen Huang (Nvidia): “This partnership will be able to bring Claude to every enterprise.”
At least 12 investors in Anthropic’s Series G simultaneously hold investments in OpenAI, including Founders Fund, Sequoia, Iconiq, and affiliated funds of BlackRock.
Source: TechCrunch, February 2026
9. Revenue Composition Questions
Anthropic reports $14 billion in annualized run-rate revenue, with Claude Code contributing $2.5 billion.
Based on the token usage analysis in Section 4, approximately 97.7% of measured session costs derive from cache-related tokens (Cache Read + Cache Create), with 2.3% attributable to actual compute (input + output tokens).
If this ratio is representative of broader usage patterns — and the independent analysis in Issue #24147 found a similar 99.93% cache overhead ratio — then the relationship between reported revenue and actual compute delivery warrants examination.
This is not an accusation. It is an observation that the data raises questions about how revenue maps to compute cost, and that Anthropic has not publicly addressed these questions despite months of community documentation.
10. Unresolved Issues
The following issues filed on Anthropic’s Claude Code GitHub repository remain open without official response:
Security
- #24185: Claude Code reads .env files and hardcodes sensitive credentials into inline scripts
Behavioral
- #26761: Opus 4.6 repeatedly violates a 6-layer constraint system 3+ times per session
- #26193: All behavioral guidelines essentially inapplicable
Billing
- #24147: Cache read tokens consume 99.93% of usage quota
- #22435: Inconsistent quota accounting with legal liability analysis
Anthropic’s GitHub repository auto-closes issues after 60 days of inactivity. (Source: Hacker News discussion)
11. Recommendations for Users
- Monitor your real costs. Install ccusage_go and compare API Cost vs Total Cost columns.
- Keep sessions short. End sessions every 1-2 hours. Fresh sessions have zero cache overhead and better instruction following.
- Minimize CLAUDE.md. Research shows it may not help and it increases cache costs per turn.
- Verify Claude’s claims. In long sessions, the model may report completed actions it did not perform.
- Understand your subscription. Anthropic does not disclose how cache tokens count against quota.
Methodology and Verification
All token usage data was extracted from local JSONL session logs stored in ~/.claude/projects/ using ccusage_go. The tool is open-source and independently verifiable.
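The aggregation itself is simple. A minimal Python sketch of the same computation (the field names follow the Anthropic API's usage object; whether the local session logs nest them under message.usage is an assumption here, and ccusage_go is the verified implementation):

```python
import json
from collections import Counter
from pathlib import Path

# Token counters as named in the Anthropic API usage object.
FIELDS = ("input_tokens", "output_tokens",
          "cache_creation_input_tokens", "cache_read_input_tokens")

def sum_usage(log_dir: str) -> Counter:
    """Sum token counts across every JSONL session log under log_dir."""
    totals: Counter = Counter()
    for path in Path(log_dir).rglob("*.jsonl"):
        for line in path.read_text().splitlines():
            if not line.strip():
                continue
            usage = json.loads(line).get("message", {}).get("usage", {})
            for field in FIELDS:
                totals[field] += usage.get(field, 0)
    return totals
```

Pointing this at a copy of ~/.claude/projects/ yields the Cache Read vs. input/output breakdowns discussed in Section 4.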
Academic sources, GitHub issues, and official documentation are linked throughout. No proprietary data or non-public information was used in this analysis.
SteveLo is a full-stack systems engineer based in Taiwan with expertise spanning RF/SDR engineering, FPGA development, embedded systems (IoT/MCU), high-performance computing, and software architecture. He maintains ccusage_go on GitHub.