A comprehensive study by Nous Research has found that open-source artificial intelligence (AI) models consume significantly more computing resources than their closed-source competitors, potentially undermining the cost advantages often associated with open solutions.
Open Source vs. Closed Source: The Cost Debate
For years, enterprises have viewed open-source AI models as the more affordable option. The assumption has been simple: because these models are free to access and cheaper per token (the basic unit of text that models process and are billed by), they should offer superior cost savings.
But a new study released Wednesday by Nous Research challenges this belief. According to the findings, open-weight models use 1.5 to 4 times more tokens than closed models such as those developed by OpenAI and Anthropic. In some cases, particularly with simple knowledge queries, the gap widens dramatically, with open models consuming up to 10 times more tokens.
As the report states:
“Open weight models use 1.5–4× more tokens than closed ones (up to 10× for simple knowledge questions), making them sometimes more expensive per query despite lower per-token costs.”
The results highlight a reality that may shift how enterprises evaluate AI deployments: efficiency matters just as much as the headline per-token price.
Token Efficiency: The Overlooked Metric
The study assessed 19 different AI models, testing them across basic knowledge queries, mathematical problems, and logic puzzles. The central metric was token efficiency—the number of tokens required to solve a given task relative to its complexity.
While often ignored in cost comparisons, token efficiency proved to be a decisive factor.
- Mathematics & logic problems: Open-source models used roughly twice as many tokens as closed systems.
- Simple knowledge queries: Inefficiencies ballooned, with open models requiring up to 10× more tokens.
- Large Reasoning Models (LRMs): These were especially inefficient, sometimes consuming thousands of tokens for questions that should require only a handful.
Consider a simple example. When asked, “What is the capital of Australia?”, some open models engaged in extended chains of reasoning before answering “Canberra,” spending hundreds of tokens on a question that needs only a one-word reply.
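To make the metric concrete, here is a minimal sketch of how a token-efficiency ratio could be computed. The baseline and token counts are illustrative assumptions, not figures from the study.

```python
# Minimal sketch of a token-efficiency ratio: tokens a model actually spent
# divided by a rough baseline for the task. Higher means less efficient.
# All numbers are illustrative assumptions, not data from the Nous Research study.

def token_efficiency_ratio(tokens_used: int, baseline_tokens: int) -> float:
    return tokens_used / baseline_tokens

BASELINE_TOKENS = 10  # a one-word factual answer plausibly needs only a handful of tokens

completion_tokens = {
    "closed-model-a": 12,   # answers "Canberra" almost directly
    "open-model-b": 480,    # reasons at length before answering
}

for model, tokens in completion_tokens.items():
    ratio = token_efficiency_ratio(tokens, BASELINE_TOKENS)
    print(f"{model}: {tokens} completion tokens ({ratio:.1f}x the baseline)")
```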
As the researchers put it:
“While hosting open weight models may be cheaper, this cost advantage could be easily offset if they require more tokens to reason about a given problem.”
Winners and Losers: Which Models Came Out Ahead?
The analysis revealed stark contrasts in how AI providers handle efficiency.
- OpenAI’s systems, including o4-mini and the open-weight gpt-oss models, were praised for their exceptional token efficiency, particularly in solving math problems. In some cases, OpenAI’s models were three times more efficient than competitors.
- Among open-source projects, Nvidia’s llama-3.3-nemotron-super-49b-v1 stood out as the most efficient across domains.
- Newer open-source models, however, often showed growing inefficiencies, likely because developers are prioritizing complex reasoning abilities over streamlined performance.
This suggests that while some open models are improving, the broader trend is moving toward higher token usage.
Why Efficiency Outweighs Per-Token Pricing
For enterprise decision-makers, the message is clear: a lower per-token price does not always mean a lower total bill.
Closed-source providers like OpenAI and Anthropic may charge more per token through API access, but their models often use far fewer tokens overall. That means the total bill per query can actually be lower compared to open-source options.
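As a back-of-the-envelope illustration, multiplying a per-token rate by the tokens actually consumed shows how a cheaper rate can still produce a larger bill. The prices and token counts below are hypothetical, not figures from the study.

```python
# Cost per query = per-token price x tokens consumed.
# Prices and token counts are hypothetical, chosen only to illustrate the trade-off.

def cost_per_query(price_per_million_tokens: float, tokens_per_query: int) -> float:
    return price_per_million_tokens * tokens_per_query / 1_000_000

# Closed model: pricier tokens, but far fewer of them per answer.
closed_cost = cost_per_query(price_per_million_tokens=10.0, tokens_per_query=300)

# Open-weight model: tokens cost 5x less, but the model uses 10x as many.
open_cost = cost_per_query(price_per_million_tokens=2.0, tokens_per_query=3_000)

print(f"closed model:      ${closed_cost:.4f} per query")   # $0.0030
print(f"open-weight model: ${open_cost:.4f} per query")     # $0.0060
```

Under these assumptions, the open-weight model costs twice as much per query despite a per-token rate that is five times lower.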
The researchers emphasized this point:
- Closed models: Higher per-token pricing, but optimized to use fewer tokens.
- Open models: Lower per-token cost, but prone to ballooning token usage.
This trade-off may explain why closed-weight model providers continue to dominate enterprise adoption despite ongoing enthusiasm for open alternatives.
The Technical Hurdles of Measuring Efficiency
Measuring efficiency was not straightforward. Closed-source providers rarely disclose their reasoning chains, opting instead to compress outputs to protect proprietary methods.
To address this, researchers used completion tokens (the output tokens billed for each response) as a practical proxy for reasoning effort. While not a perfect window into model behavior, the measure offered enough consistency to make valid comparisons.
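For context, typical chat-completions APIs already report this number in their usage metadata. The sketch below shows one way such counts could be collected using the OpenAI Python SDK; the model name and prompt are placeholders, and this is not the study’s actual evaluation harness.

```python
# Sketch: reading the billed completion-token count for a single query.
# Requires the `openai` package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "What is the capital of Australia?"}],
)

# usage.completion_tokens counts the billed output tokens, the quantity the
# study treats as a proxy for how much reasoning a model performed.
print(response.usage.completion_tokens)
```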
This approach revealed a critical insight: closed models have been systematically optimized to reduce unnecessary computation, while open models are trending in the opposite direction.
Implications for Businesses and Developers
The findings raise important questions for companies investing in AI:
- Are enterprises overestimating the cost advantages of open-source models?
- Should token efficiency become a standard benchmark alongside accuracy and latency?
- Could rising inefficiencies slow adoption of open-weight systems in high-volume enterprise environments?
The study suggests that businesses must begin factoring in total computational requirements when assessing AI options. A focus solely on accuracy benchmarks and per-token pricing risks overlooking the hidden costs of inefficiency.
For developers, the report underscores a pressing need to rebalance priorities. Advanced reasoning capabilities are valuable, but if they come at the expense of computational efficiency, enterprises may look elsewhere.
What Comes Next for the AI Industry?
One of the most intriguing developments highlighted in the report is the release of OpenAI’s gpt-oss models, which achieve state-of-the-art token efficiency while still making their reasoning chains visible. These models could serve as benchmarks for future optimization efforts across the open-source ecosystem.
The study also pointed to potential solutions, such as more efficient context usage to mitigate context degradation during long reasoning tasks. Improvements in this area could reduce wasted computation and restore some of the cost advantages associated with open systems.
Still, the broader industry faces a crossroads. As demand for AI solutions surges, efficiency may become as important a differentiator as accuracy or raw power.
Key Takeaways from the Study
- Open-source AI models used 1.5–4× more tokens than closed models in most tasks.
- For simple knowledge queries, some open models used up to 10× more tokens.
- OpenAI’s systems led in efficiency, with the o4-mini and gpt-oss models outperforming competitors.
- Nvidia’s llama-3.3-nemotron-super-49b-v1 was the most efficient open-source option.
- Inefficiency is especially evident in Large Reasoning Models, which over-extend chains of thought.
- Token efficiency should be a core metric for enterprises evaluating AI deployment strategies.
Conclusion: Rethinking the Economics of AI
The Nous Research study delivers a sobering reminder: the true cost of AI lies not just in per-token pricing, but in how efficiently models use those tokens.
For enterprises, this could mean reconsidering long-held assumptions about open versus closed solutions. For developers, it signals a shift in priorities—from simply building smarter models to building smarter, leaner, and more efficient ones.
As the industry continues to evolve, one question looms: Will the future of AI be shaped more by brilliance of thought, or by the economy of tokens?