OpenAI Codex now bills by token, not by seat: what it means for dev teams

Maroš Bednár

April 3, 2026

$300 a year for something half your team never opens. That was the reality of ChatGPT Business at $25 a month per seat if you wanted to give developers access to Codex. But it also describes a broader pattern: most AI code tools launched with flat per-seat pricing, and the market is now converging on consumption-based billing. OpenAI's April 2 announcement is the latest domino.

The market signal matters more than the pricing details. GitHub Copilot moved to metered billing after years of flat $19/seat. Claude Code launched pay-as-you-go from day one. Cursor still charges $20/month flat and is starting to look like the outlier. When three out of four major players bill by usage, that's the new default.

What OpenAI actually changed

Two access tiers now. Full ChatGPT Business seat at $20/month (cut from $25) -- includes ChatGPT and Codex with plan-based rate limits. And a new Codex-only seat with zero monthly fee, pure token billing, no rate limits.

The second tier is the one that changes team economics. A developer running two code reviews a day through Codex might spend $5 a month. Someone refactoring a large codebase might spend $50. Nobody pays for ChatGPT access they don't want.

The math that CTOs actually care about

Take a 10-person dev team. Old model: 10 seats at $25/month = $3,000/year. New model: 5 full seats for people who use ChatGPT daily ($100/month in total) plus 5 Codex-only seats ($15-40/month per person based on actual usage). Annual cost comes to roughly $2,100-3,600 depending on consumption, so it only drops below the old $3,000 if Codex-only usage averages under $30/month per person. Or go aggressive: everyone on Codex-only, fixed costs near zero, pure usage billing.
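
To sanity-check that arithmetic, here is a minimal Python sketch. It uses the seat prices quoted in this post and treats the $15-40 monthly spend per Codex-only seat as my rough estimate, not a published rate. The function names are mine and purely illustrative.

```python
MONTHS_PER_YEAR = 12


def annual_flat_cost(seats: int, price_per_seat_month: float) -> float:
    """Yearly cost when every person gets a flat-priced seat."""
    return seats * price_per_seat_month * MONTHS_PER_YEAR


def annual_mixed_cost(
    full_seats: int,
    full_price_month: float,
    codex_only_seats: int,
    usage_per_seat_month: float,
) -> float:
    """Yearly cost of full seats plus metered Codex-only seats.

    usage_per_seat_month is an assumed average token spend per
    Codex-only developer, in dollars.
    """
    fixed = full_seats * full_price_month
    metered = codex_only_seats * usage_per_seat_month
    return (fixed + metered) * MONTHS_PER_YEAR


def break_even_usage(
    old_annual_cost: float,
    full_seats: int,
    full_price_month: float,
    codex_only_seats: int,
) -> float:
    """Average monthly usage per Codex-only seat at which the mixed
    model costs exactly as much as the old flat model."""
    monthly_budget = old_annual_cost / MONTHS_PER_YEAR
    return (monthly_budget - full_seats * full_price_month) / codex_only_seats


old = annual_flat_cost(10, 25)            # 10 seats at $25/month
low = annual_mixed_cost(5, 20, 5, 15)     # light Codex-only usage
high = annual_mixed_cost(5, 20, 5, 40)    # heavy Codex-only usage
threshold = break_even_usage(old, 5, 20, 5)

print(f"old: ${old:,.0f}  new: ${low:,.0f}-{high:,.0f}")
print(f"break-even usage per Codex-only seat: ${threshold:.0f}/month")
```

Under these assumptions the mixed model beats the old one only while Codex-only developers average less than $30 a month each. Swap in your own headcount and usage guesses before taking the result to finance.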

The savings aren't massive. What changes is the procurement conversation. You don't need budget approval for a $3,000 annual commitment before knowing whether the tool helps. You turn it on. You look at the bill after a month. That's a fundamentally different conversation with finance.

OpenAI's growth numbers back up the demand: 2 million weekly users, 6x growth since January, 9 million paying Business customers. New users get $100 in credits ($500 per team cap) -- enough for a real two-week test at zero cost.
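
How far those credits stretch depends on team size and how hard people push the tool. The sketch below is a rough estimate only: it assumes the team cap works as a simple ceiling on the sum of per-user credits, which is my reading of the offer and not a confirmed rule, and the monthly spend per developer is whatever guess you plug in.

```python
CREDIT_PER_NEW_USER = 100.0   # dollars, from the announcement
TEAM_CREDIT_CAP = 500.0       # dollars, from the announcement


def team_credits(new_users: int) -> float:
    """Total starting credits, assuming the cap is a plain ceiling."""
    return min(new_users * CREDIT_PER_NEW_USER, TEAM_CREDIT_CAP)


def days_covered(
    new_users: int,
    spend_per_dev_month: float,
    days_per_month: int = 30,
) -> float:
    """Days of usage the credits pay for at an assumed spend rate."""
    daily_burn = new_users * spend_per_dev_month / days_per_month
    if daily_burn == 0:
        return float("inf")
    return team_credits(new_users) / daily_burn


# Three developers stay under the cap; eight hit it.
print(team_credits(3), team_credits(8))

# Five developers at the heavy-use guess of $50/month each.
print(round(days_covered(5, 50.0)))
```

Even at the heavy-use estimate from earlier in this post, five developers would not exhaust the credits inside a two-week trial under these assumptions.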

The competitive landscape is sorting itself out

I've been tracking this space for the past year, and the pattern is clear. The tools that charge flat per-seat fees are losing to the ones that scale with usage. Copilot figured this out and adapted. Claude Code got it right from the start -- in my experience, it handles larger repository context better than anything else in this group, and its pricing model has had the least friction from day one.

Codex's strength is different: it operates as a full agent, not autocomplete. It clones your repo into a sandbox, reads the code, makes changes, runs tests, creates a PR. That agent model works well for code review, pattern refactoring across files, library migrations, test generation, and bug fixes with clear stack traces. It's weaker on architecture decisions, domain-specific logic, and anything security-sensitive.

But here's my honest take: which tool you pick matters less than how your team works with it. A developer who writes a clear task specification and reviews the output carefully will get far more value from any of these tools than someone who writes a vague prompt and hopes. As prompt engineering gives way to agents, the skill shifts from crafting prompts to defining good tasks.

What I'd actually recommend

If your team isn't using any AI code tool yet, Codex-only seats are a low-risk entry. Zero fixed cost, free credits to start, no commitment. Let two or three developers try it on real tasks for two weeks. Look at token spend and time saved on specific task types. At the end of the trial, you have data instead of opinions.
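
One way to get that data is to have each developer log every Codex task with its token cost and an honest estimate of time saved, then roll it up by task type. The sketch below is a hypothetical helper, not part of any Codex tooling. The record fields, task type names, and sample values are placeholders I made up for illustration, not measurements.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TaskRecord:
    task_type: str          # e.g. "code_review", "test_generation"
    token_cost_usd: float   # taken from the usage bill
    minutes_saved: float    # developer's own estimate


def summarize(records: list[TaskRecord]) -> dict[str, dict]:
    """Aggregate cost and time saved per task type."""
    totals = defaultdict(
        lambda: {"tasks": 0, "cost_usd": 0.0, "minutes_saved": 0.0}
    )
    for record in records:
        bucket = totals[record.task_type]
        bucket["tasks"] += 1
        bucket["cost_usd"] += record.token_cost_usd
        bucket["minutes_saved"] += record.minutes_saved

    summary = {}
    for task_type, bucket in totals.items():
        hours_saved = bucket["minutes_saved"] / 60
        summary[task_type] = {
            **bucket,
            # None marks task types where no time was saved at all.
            "usd_per_hour_saved": (
                bucket["cost_usd"] / hours_saved if hours_saved > 0 else None
            ),
        }
    return summary


# Placeholder entries, purely illustrative.
records = [
    TaskRecord("code_review", 0.5, 30),
    TaskRecord("code_review", 1.0, 30),
    TaskRecord("test_generation", 2.0, 0),
]
summary = summarize(records)
for task_type, stats in summary.items():
    print(task_type, stats)
```

Dollars per hour saved, split by task type, is the number that tells you where the tool earns its keep and where it only burns tokens.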

If you're already on Copilot or Cursor, run a parallel comparison on your own codebase. Not feature lists -- actual tasks. Your repo, your review process, your ticket types. I personally lean toward Claude Code for projects with large, complex codebases where deep context matters, but every team's profile is different.

Want to test this on your own codebase? We run a 2-week pilot -- we set up the environment, define the workflow, and measure where AI actually saves your team time.
