▶️ Prefer watching over reading? Watch on YouTube

The Anti-Hype AI Strategy Series

This is Part 1 of a 5-part series helping executives build reliable AI foundations while competitors chase the latest model releases. I'll cover why vendor priorities don't match your needs, the hidden costs of waiting, and the foundations that actually deliver results.

What prompted me to write this series was a discussion I was part of, where a CTO said: "Our AI pilots aren't delivering the ROI we expected. But GPT-5 is around the corner. That'll solve our workflow failures, right?"

Wrong. Dead wrong.

The GPT-5 rollout in August 2025 became the wake-up call that shattered this comfortable delusion. Users documented basic failures: Oregon labeled as "Onegon," simple math errors that would embarrass a calculator (a humbling reminder that even "PhD-level intelligence" can't spell geography). Reddit threads accumulated 4,600+ upvotes with 70% negative sentiment, forcing OpenAI to restore access to older models their own customers were begging to get back.

But here's what was genuinely painful to observe: companies that had waited for GPT-5 to fix their fragile AI workflows saw minimal improvement when it arrived. Meanwhile, 95% of generative AI projects continue to fail at achieving meaningful revenue acceleration, according to MIT's NANDA Initiative.

That's not a technology problem. That's a foundations problem.

The Comfortable Lie Many Want to Believe

I can certainly see where this thinking comes from. Sam Altman has been remarkably vocal about revolutionary capabilities, and the hype feels intoxicating when you're staring at quarterly targets. The pattern was compelling: each leap from GPT-3 to GPT-3.5 to GPT-4 delivered genuine breakthroughs. A similar trajectory played out with Gemini's 2.5 series, which remains incomparably better than earlier generations.

However, the truth is: your AI failures aren't model limitations. They're architecture failures, data quality failures, and evaluation failures pretending to be model limitations.

I keep hearing executives rationalize fragile workflows with "GPT-6 will handle our edge cases better." Translation: "We'll outsource our core competency to a vendor's roadmap and hope for the best." That's not strategy. That's avoiding responsibility.

The comfortable lie whispers that breakthrough models arrive every six months to rescue poorly designed systems. The uncomfortable truth? OpenAI hit $10B ARR growing 194% year-over-year by chasing investment rounds and infrastructure dominance, not by fixing your broken document processing workflow.

The Expensive Truth: What Actually Happens When You Wait

Let me share what happened to a fintech startup whose Head of AI I know personally (details changed for confidentiality, but the pain was real). Their head of operations convinced leadership to delay investing in AI infrastructure because "GPT-5 will make everything work seamlessly." So they spent 11 months in a holding pattern.

GPT-5 finally launched. Their core workflow (processing compliance documents) remained fundamentally broken. Why? Because the model upgrade couldn't fix their garbage data pipeline, couldn't eliminate hallucinations on company-specific terminology, and couldn't change the fact that AI agent systems consume 4-15x more tokens than simple workflows while delivering minimal reliability improvements.

Cost of waiting: close to $2.0M in delayed efficiency gains, plus an additional $800K in rushed infrastructure work to catch up to competitors who'd been building foundations continuously over the past year.

This pattern repeats everywhere. According to S&P Global, 42% of businesses scrapped most AI initiatives in 2024, up from 17% in 2023. These aren't technology failures – they're strategic failures dressed up as technology problems.

Recent industry analysis from Q3 2024 by McKinsey reveals that only 1% of companies consider their generative AI implementations "mature." The rest are stuck in what researchers call the "gen AI paradox": promising prototypes that break in production because they're built on quicksand.

The Hidden Pattern: Why Vendors Can't Save You

Look, here's the thing about vendor priorities that most executives miss: OpenAI isn't optimizing for your quarterly targets. They're optimizing for their next funding round. Google isn't chasing your workflow reliability—they're chasing reasoning supremacy. Anthropic leads on safety, not your specific edge cases.

If you tracked OpenAI leadership's public appearances over the past six months, you'd notice the primary goal of GPT-5 wasn't revolutionary capability gains. It was efficiency—building a "universally good model" mindful of token consumption. Solving your failing tool calls or fixing known drawbacks from GPT-4o wasn't among their highest priorities (though removing the model selector from ChatGPT was—until user revolt forced them to restore it).

The rumored upcoming Gemini version bump won't revolutionize poorly built systems either. We're in an "evolution instead of revolution" phase where each vendor has fundamentally different agendas, none primarily focused on fixing your broken workflows.

That's the hidden pattern killing AI initiatives: executives abdicate architectural decisions to vendors who have completely different incentives. There will always be a "better model" until AGI emerges (if at all), but waiting transforms you from strategic leader to passive consumer.

Decision Framework: The Model-Agnostic Test

Here's a quick exercise to run before your next AI investment. Ask yourself these three questions:

  1. Vendor Independence Test: If our primary AI vendor tripled prices tomorrow, how quickly could we switch without business disruption?
  2. Reliability Baseline Test: Can our system maintain 99.9% uptime during model outages or API changes?
  3. ROI Clarity Test: Are we measuring success by model capabilities or business outcomes?

If you can't answer confidently, you're building on quicksand.
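
One way to pressure-test the first two questions is to ask how hard it would be to express a provider failover in your own codebase. Below is a minimal Python sketch; `openai_complete` and `gemini_complete` are hypothetical placeholders for your real vendor SDK calls, and the simulated outage exists only to show the failover path.

```python
from typing import Callable

def openai_complete(prompt: str) -> str:
    # Placeholder: swap in a real call to your OpenAI client here.
    raise ConnectionError("simulated outage")

def gemini_complete(prompt: str) -> str:
    # Placeholder: swap in a real call to your Gemini client here.
    return f"[gemini] answer to: {prompt}"

# Ordered by preference (cost, quality, or current reliability).
PROVIDERS: list[tuple[str, Callable[[str], str]]] = [
    ("openai", openai_complete),
    ("gemini", gemini_complete),
]

def complete(prompt: str) -> str:
    """Try each provider in order; fail over on outages or API errors."""
    last_error: Exception | None = None
    for name, call in PROVIDERS:
        try:
            return call(prompt)
        except Exception as exc:  # in practice, catch provider-specific errors
            last_error = exc
            print(f"{name} failed ({exc!r}); trying next provider")
    raise RuntimeError("all providers failed") from last_error

print(complete("Summarize the key changes in our Q3 compliance policy."))
```

If a change like this would ripple through dozens of workflows instead of one module, that's your answer to the Vendor Independence test.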

The Practical Path: Building Antifragile AI

The solution isn't waiting for magical models. It's building systems that improve regardless of which model powers them. There is a specific, proven playbook, and it's clearly visible in how the companies that master AI execute.

Foundation First: The 90-Day Framework

Phase 1 (Days 1-30): Data Architecture

Start with your data pipeline, not your model choice. Implement retrieval-augmented generation (RAG) to ground AI responses in your company's authoritative knowledge base. This single intervention eliminates 60-80% of hallucinations by connecting models to verifiable facts instead of letting them fabricate answers.
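
To make the pattern concrete, here's a minimal RAG sketch in Python. The character-frequency `embed` function and the in-memory document list are deliberately crude stand-ins for your real embedding model and vector database; the part that matters is that the prompt is assembled from retrieved company documents and instructs the model to answer only from them.

```python
import math

# Toy in-memory knowledge base; in production this lives in a vector database.
DOCUMENTS = [
    "Refund requests over $500 require approval from the compliance team.",
    "Compliance documents must be retained for seven years.",
    "Customer PII may only be processed in the EU data center.",
]

def embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: a crude character-frequency vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embed(query)
    scored = [(sum(a * b for a, b in zip(q, embed(d))), d) for d in DOCUMENTS]
    return [doc for _, doc in sorted(scored, reverse=True)[:k]]

def build_grounded_prompt(question: str) -> str:
    """Assemble a prompt that restricts the model to retrieved facts."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    return (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

print(build_grounded_prompt("How long do we keep compliance documents?"))
```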

Phase 2 (Days 31-60): Model-Agnostic Architecture

Treat your LLM as a replaceable component. Build abstraction layers and standardized APIs that let you swap models without touching business logic. This isn't just future-proofing. It's immediate cost optimization through intelligent routing.
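
Here's what "replaceable component" can look like in practice: a thin interface that all business logic depends on, plus one small adapter per vendor. This is a sketch with placeholder adapters (the real ones would wrap your vendor SDKs), not a full client library.

```python
from typing import Protocol

class LLMClient(Protocol):
    """The only surface business logic is allowed to touch."""
    def complete(self, prompt: str, *, max_tokens: int = 512) -> str: ...

class OpenAIAdapter:
    def complete(self, prompt: str, *, max_tokens: int = 512) -> str:
        # Placeholder: wrap your real OpenAI SDK call here.
        return f"[openai] {prompt[:40]}..."

class GeminiAdapter:
    def complete(self, prompt: str, *, max_tokens: int = 512) -> str:
        # Placeholder: wrap your real Gemini SDK call here.
        return f"[gemini] {prompt[:40]}..."

def summarize_compliance_doc(doc: str, llm: LLMClient) -> str:
    """Business logic depends on the interface, never on a vendor SDK."""
    return llm.complete(f"Summarize the compliance obligations in:\n{doc}")

# Swapping vendors (or A/B testing a new model) touches one adapter, not every workflow.
print(summarize_compliance_doc("Retention policy v3 ...", GeminiAdapter()))
```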

Smart architecture means routing by task: for example, Gemini 2.5 Pro for complex reasoning and the cheaper, faster Gemini 2.5 Flash-Lite for routine work. Cost reductions of 40-60% are possible from this intelligent routing alone.
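
A first version of that routing can be a simple heuristic layered on top of the abstraction: reasoning-heavy or long requests go to the premium tier, everything else to the budget tier. The model names and thresholds below are illustrative assumptions, not recommendations.

```python
# Illustrative price/quality tiers; plug in the models you actually use.
PREMIUM_MODEL = "gemini-2.5-pro"
BUDGET_MODEL = "gemini-2.5-flash-lite"

REASONING_HINTS = ("why", "explain", "analyze", "compare", "reconcile")

def pick_model(prompt: str) -> str:
    """Route reasoning-heavy or long prompts to the premium tier."""
    heavy = len(prompt) > 2000 or any(w in prompt.lower() for w in REASONING_HINTS)
    return PREMIUM_MODEL if heavy else BUDGET_MODEL

print(pick_model("Extract the invoice date"))               # budget tier
print(pick_model("Explain why these two ledgers disagree"))  # premium tier
```

Over time you can replace the heuristic with routing driven by your evaluation data, but even this naive version stops you from paying premium prices for routine extraction tasks.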

Phase 3 (Days 61-90): Continuous Evaluation

Deploy model-agnostic evaluation frameworks that measure performance on your specific tasks, not public benchmarks. Public benchmarks poorly predict real-world reliability. Models that excel at graduate-level reasoning still often fail at consistent JSON formatting in production.

Build evaluation suites that catch model drift before it impacts business outcomes. The only meaningful benchmark is continuous assessment against your actual use cases.
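
Concretely, this can start as a plain test suite over golden examples from your own workflows: cheap automated checks (valid JSON, required fields), a pass rate tracked per model version, and an alert when it drops. A minimal sketch, with a hypothetical `run_model` standing in for your abstraction layer:

```python
import json
from typing import Callable

def run_model(prompt: str) -> str:
    # Placeholder: call your model through the abstraction layer.
    return '{"vendor": "Acme GmbH", "amount": 1200, "currency": "EUR"}'

def is_valid_json(output: str) -> bool:
    try:
        json.loads(output)
        return True
    except json.JSONDecodeError:
        return False

def has_fields(*fields: str) -> Callable[[str], bool]:
    def check(output: str) -> bool:
        try:
            data = json.loads(output)
        except json.JSONDecodeError:
            return False
        return all(f in data for f in fields)
    return check

# Golden tasks from YOUR workflows, not public benchmarks.
EVAL_CASES = [
    ("Extract vendor, amount, currency from invoice #4711 ...",
     [is_valid_json, has_fields("vendor", "amount", "currency")]),
]

def run_suite() -> float:
    passed = sum(
        all(check(run_model(prompt)) for check in checks)
        for prompt, checks in EVAL_CASES
    )
    return passed / len(EVAL_CASES)

print(f"pass rate: {run_suite():.0%}")  # alert or block deploys below a threshold
```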

Delegation Guide: What to Ask Your Technical Teams

For your engineering leaders: "Show me our fallback plan if OpenAI raises API costs by 300% next quarter. How quickly can we migrate critical workflows to alternative providers?"

For your data teams: "What percentage of our AI errors stem from model limitations versus data quality issues? Show me the breakdown with specific examples."

For your AI/ML teams: "Demonstrate how our evaluation metrics align with business outcomes, not just model benchmarks. What does 'good performance' mean in revenue terms?"

Breaking Free from the Hype Cycle

Picture this conversation in your next board meeting: "We passed on upgrading to GPT-5.5 because our current architecture already delivers top-notch results, and swapping the model would only yield a 0.3 increase in ROI. Our intelligent routing system automatically optimizes for cost and performance across five different providers. We're model-agnostic and proud of it."

That's the transformation from reactive consumer to strategic player. Instead of getting excited about new releases, you're confident in your ability to evaluate them objectively. Imagine not being impressed by vendor demos because you know exactly how any new model would perform in your specific environment (or because your team, equipped with evals, needs only an afternoon to find out).

The executives winning with AI aren't chasing the latest releases. They're building reliable, model-agnostic foundations that deliver consistent quality regardless of vendor announcements. They've achieved independence from hype-driven rollouts and the weeks of chaos that follow when "that new model" breaks existing workflows.

Your competitive advantage isn't access to better models. It's building systems that get better faster than your models do. While competitors wait for GPT-5.5/6/7 to solve their fundamental architecture problems, you'll be implementing your third-generation evaluation framework and deploying AI that actually works.

Build your AI systems on foundations that last, not on overhyped new models.

PS: Stay tuned for the next articles in this series, which will help you navigate AI adoption in a pragmatic way.

Let's Connect

👉 Have comments or questions about this article? Connect with me on LinkedIn to discuss.