VentureBeat | Transformative tech coverage that matters

Featured

5% GPU utilization: The $401 billion AI infrastructure problem enterprises can't keep ignoring

Enterprises panic-bought GPU capacity for two years. Real-world audits now show average utilization stuck at 5%. Rob Strechay on what the data says — and what comes next.

Rob Strechay

May 8, 2026

Miami startup Subquadratic claims 1,000x AI efficiency gain with SubQ model; researchers demand independent proof.

A little-known Miami-based startup called Subquadratic emerged from stealth on Tuesday with a sweeping claim: that it has built the first large language model to fully escape the mathematical constraint that has defined — and limited — every major AI system since 2017.

Michael Nuñez

May 5, 2026

The AI scaffolding layer is collapsing. LlamaIndex's CEO explains what survives.

Jerry Liu says 95% of LlamaIndex code is now AI-generated — and that's the point. The stack is thinning, and context is the only moat left.

Taryn Plumb

May 1, 2026

One tool call to rule them all? New open source Python tool Runpod Flash eliminates containers for faster AI dev

With Flash GA, the company is attempting to transition from being a provider of raw compute to becoming the essential orchestration layer for the AI-first cloud.

Carl Franzen

April 30, 2026

Deep insights for enterprise AI, data, and security leaders

Amazon’s OpenAI gambit signals a new phase in the cloud wars — one where exclusivity no longer applies

The announcements, made at a live event in San Francisco titled "What's Next with AWS," landed just 24 hours after OpenAI and Microsoft publicly restructured their exclusive cloud partnership — a move that, for the first time, freed OpenAI to distribute all of its products across rival cloud providers. AWS CEO Matt Garman called it "a huge partnership" and said customers have been asking for OpenAI models inside AWS "from the very early days."

Michael Nuñez

April 29, 2026

FOMO is why enterprises pay for GPUs they don't use — and why prices keep climbing

GPU fleets are running at 5% utilization. The shortage making them expensive is the same reason enterprises won't release them.

Ujas Patel

April 29, 2026

Microsoft and OpenAI gut their exclusive deal, freeing OpenAI to sell on AWS and Google Cloud

The amended agreement, disclosed simultaneously in blog posts from both companies, marks the most significant restructuring since Microsoft first invested $1 billion in OpenAI in 2019 — and it transforms what was once the most consequential exclusive technology alliance in a generation into something that more closely resembles a strategic but arm's-length commercial relationship.

Michael Nuñez

April 27, 2026

Monitoring LLM behavior: Drift, retries, and refusal patterns

Traditional software is predictable: Input A plus function B always equals output C. This determinism allows engineers to develop robust tests. On the other hand, generative AI is stochastic and unpredictable. The exact same prompt often yields different results on Monday versus Tuesday, breaking the traditional unit testing that engineers know and love.

Derah Onuorah, Microsoft

April 26, 2026

Context decay, orchestration drift, and the rise of silent failures in AI systems

The most expensive AI failure I have seen in enterprise deployments did not produce an error. No alert fired. No dashboard turned red. The system was fully operational; it was just consistently, confidently wrong. That is the reliability gap. And it is the problem most enterprise AI programs are not built to catch.

Sayali Patil

April 26, 2026

Google’s Gemini can now run on a single air-gapped server — and vanish when you pull the plug

The offering packages Gemini into a Dell-manufactured, Google-certified hardware appliance equipped with eight Nvidia GPUs and wrapped in confidential computing protections. Enterprises and government agencies can deploy the system inside Cirrascale's data centers or their own facilities, fully disconnected from the internet and from Google's cloud infrastructure. The product enters preview immediately, with general availability expected in June or July.

Michael Nuñez

April 22, 2026

Brian Gracely, Director of Portfolio Strategy, Red Hat, and Matt Marshall, CEO and EIC, VentureBeat

Are we getting what we paid for? How to turn AI momentum into measurable value

Enterprise AI is entering a new phase — one where the central question is no longer what can be built, but how to make the most of our AI investment.

VB Staff

April 16, 2026

Partner Content

AI lowered the cost of building software. Enterprise governance hasn’t caught up

Presented by Retool

David Hsu, Retool

April 16, 2026