ZDNET’s key takeaways
- Cloud-first approaches need to be rethought.
- AI contributes to escalating cloud costs.
- A hybrid model offers the best of both worlds.
A decade or so ago, the debate between cloud and on-premises computing raged. The cloud handily won that battle, and it wasn’t even close. Now, however, people are rethinking whether the cloud is still their best choice for many situations.
Also: Cloud-native computing is poised to explode, thanks to AI inference work
Welcome to the age of AI, in which on-premises computing is starting to look good again.
There’s a movement afoot
Existing infrastructures now configured with cloud services simply may not be ready for emerging AI demands, a recent analysis from Deloitte warned.
“The infrastructure built for cloud-first strategies can’t handle AI economics,” the report, penned by a team of Deloitte analysts led by Nicholas Merizzi, said.
Also: 5 must-have cloud tools for small businesses in 2025 (and my top 10 money-saving secrets)
“Processes designed for human workers don’t work for agents. Security models built for perimeter defense don’t protect against threats operating at machine speed. IT operating models built for service delivery don’t drive business transformation.”
To meet the needs of AI, enterprises are contemplating a shift away from mainly cloud to a hybrid mix of cloud and on-premises, according to the Deloitte analysts. Technology decision-makers are taking a second and third look at on-premises options.
Also: Want real AI ROI for business? It might finally happen in 2026 – here’s why
As the Deloitte team described it, there’s a movement afoot “from cloud-first to strategic hybrid — cloud for elasticity, on-premises for consistency, and edge for immediacy.”
Four issues
The Deloitte analysts cited four burning issues that are arising with cloud-based AI:
- Rising and unanticipated cloud costs: AI token costs have dropped 280-fold in two years, they observe, and yet "some enterprises are seeing monthly bills in the tens of millions." The overuse of cloud-based AI services "can lead to frequent API hits and escalating costs." There is even a tipping point at which on-premises deployments make more sense: "This may happen when cloud costs begin to exceed 60% to 70% of the total cost of acquiring equivalent on-premises systems, making capital investment more attractive than operational expenses for predictable AI workloads."
- Latency issues with cloud: AI often demands near-zero latency to deliver actions. “Applications requiring response times of 10 milliseconds or below cannot tolerate the inherent delays of cloud-based processing,” the Deloitte authors point out.
- On-premises promises greater resiliency: Resilience is also a pressing requirement for fully functional AI processes. "Mission-critical tasks that cannot be interrupted require on-premises infrastructure in case connection to the cloud is interrupted," the analysts state.
- Data sovereignty: Some enterprises “are repatriating their computing services, not wanting to depend entirely on service providers outside their local jurisdiction.”
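The 60%-to-70% tipping point described above is straightforward to check for a given workload. The sketch below works through the arithmetic with hypothetical figures (the $500k/month cloud bill, $4M capital purchase, and $150k/month operating cost are illustrative assumptions, not numbers from the Deloitte report):

```python
# Illustrative break-even check for cloud vs. on-premises AI spend.
# All dollar figures below are hypothetical, not Deloitte's data.

def months_to_break_even(monthly_cloud_cost: float, on_prem_capex: float,
                         monthly_on_prem_opex: float) -> float:
    """Months until cumulative cloud spend matches buying equivalent gear."""
    monthly_savings = monthly_cloud_cost - monthly_on_prem_opex
    if monthly_savings <= 0:
        return float("inf")  # cloud never costs more; stay in the cloud
    return on_prem_capex / monthly_savings

# Hypothetical steady inference workload: $500k/month in the cloud versus
# a $4M capital purchase plus $150k/month for power, space, and staffing.
months = months_to_break_even(500_000, 4_000_000, 150_000)
print(f"Break-even after {months:.1f} months")  # roughly 11.4 months

# Deloitte's rule of thumb compares cloud spend with acquisition cost:
annual_cloud = 500_000 * 12
ratio = annual_cloud / 4_000_000
print(f"Annual cloud spend is {ratio:.0%} of equivalent on-prem capex")
```

At these assumed figures, annual cloud spend runs to 150% of the equivalent capital cost, well past the 60%-to-70% threshold the analysts describe, so a predictable workload of this shape would favor on-premises.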
Also: Why some companies are backing away from the public cloud
Three-tier approach
The best solution to the cloud versus on-premises dilemma is to go with both, the Deloitte team said. They recommend a three-tier approach, which consists of the following:
- Cloud for elasticity: To handle variable training workloads, burst capacity needs, and experimentation.
- On-premises for consistency: Run production inference at predictable costs for high-volume, continuous workloads.
- Edge for immediacy: This means AI within edge devices, apps, or systems that handle “time-critical decisions with minimal latency, particularly for manufacturing and autonomous systems where split-second response times determine operational success or failure.”
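The three tiers above amount to a placement rule: route by latency first, then by workload shape. Here is a minimal sketch of that rule; the workload fields and the boolean thresholds are illustrative assumptions (only the 10-millisecond edge cutoff comes from the Deloitte report):

```python
# Sketch of the three-tier placement rule from the Deloitte framework.
# The Workload fields are assumed for illustration, not from the report.
from dataclasses import dataclass

@dataclass
class Workload:
    max_latency_ms: float     # hardest response-time requirement
    steady_high_volume: bool  # continuous, predictable production inference

def place(w: Workload) -> str:
    if w.max_latency_ms <= 10:      # "10 milliseconds or below" per the report
        return "edge"               # immediacy: decide next to the device
    if w.steady_high_volume:
        return "on-premises"        # consistency: predictable cost for steady load
    return "cloud"                  # elasticity: bursts and experimentation

print(place(Workload(5, False)))     # edge
print(place(Workload(200, True)))    # on-premises
print(place(Workload(500, False)))   # cloud
```

Ordering the checks this way reflects the report's logic: a hard real-time requirement overrides cost considerations, and only elastic, variable work defaults to the cloud.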
This hybrid approach resonates as the best path forward for many enterprises. Milankumar Rana, who recently served as software architect at FedEx Services, is all-in on cloud for AI but sees the need to support both approaches where appropriate.
“I have built large-scale machine learning and analytics infrastructures, and I have observed that almost all functionalities, such as data lakes, distributed pipelines, streaming analytics, and AI workloads based on GPUs and TPUs, can now run in the cloud,” he told ZDNET. “Because AWS, Azure, and GCP services are so mature, businesses may grow fast without having to spend a lot of money up front.”
Also: How AI agents can eliminate waste in your business – and why that’s smarter than cutting costs
Rana also advises customers "to maintain some workloads on-premises where data sovereignty, regulatory considerations, or very low latency make the cloud less useful," he said. "The best way to do things right now is to use a hybrid strategy, where you keep sensitive or latency-sensitive applications on-premises while using the cloud for flexibility and new ideas."
Whether employing cloud or on-premises systems, companies should always take direct responsibility for security and monitoring, Rana said. "Security and compliance remain the responsibility of all individuals. Cloud platforms include robust security, but you must ensure adherence to regulations for encryption, access, and monitoring."
