

Deep insights for enterprise AI, data, and security leaders
Orchestration
Pinterest cut AI costs 90% by gutting a frontier model's vision layer

VB Event
AI agents are entering their rebuild era as enterprises confront the reliability problem
As enterprise AI agents move into production, organizations are confronting a growing reliability problem. Many teams are discovering that LLM performance alone does not determine whether agents succeed in production. Long-running AI workflows must survive crashes, preserve state, recover from failures, manage inference costs, and coordinate across APIs, tools, and enterprise systems.

AI agents are quietly generating chaos engineering failures enterprises don’t track yet
There is a category of production incident that engineering teams are not tracking yet — because it doesn't fit any existing postmortem template.

Your AI agents need a terminal, not just a vector database
Infrastructure
Merck and Mastercard are seeing real agentic AI results. Both say the plumbing came first.

Resolve AI says the AI coding boom is breaking production systems. It wants to fix that.
The centerpiece of the release is a new multi-agent investigation system developed by Resolve AI's in-house research lab. Instead of deploying a single AI agent to diagnose a production failure — analogous to a lone engineer pulling an on-call shift — the platform now dispatches a coordinated team of specialized agents that pursue multiple hypotheses in parallel, independently verify each other's conclusions, and construct complete causal chains from root cause to symptom. The company says the architecture delivers more than a twofold improvement in root cause accuracy on its internal evaluation benchmarks compared to earlier versions of its platform.

Cerebras says its chips run a trillion-parameter AI model nearly 7 times faster than GPU clouds
Less than a week after completing the largest tech IPO of 2026, Cerebras Systems is making its most aggressive play yet to dominate the fast-growing AI inference market. On Monday, the Sunnyvale-based chipmaker announced that it is now running Kimi K2.6 — a trillion-parameter open-weight model developed by Beijing-based Moonshot AI — for enterprise customers at nearly 1,000 tokens per second, a speed no GPU-based provider has come close to matching.

AWS nabs white hot gen AI media creation startup fal, becoming its preferred cloud provider
Events
Data
SQL query logs hold the context AI agents need to stop hallucinating joins

Partner Content
Control within connection: How data sovereignty is rewriting the rules of critical infrastructure
Presented by Equinix

DataGrail report finds your vendor may be sending data to AI models you never approved
The data processing agreement (DPA) — the bedrock contract companies use to evaluate how vendors handle personal data — can no longer be trusted at face value. That is the central, and arguably most alarming, conclusion of DataGrail's Privacy and AI Trends Report 2026, released today.

DeepSWE blows up the AI coding leaderboard, crowns GPT-5.5, and finds Claude Opus exploiting a benchmark loophole
For months, the leading AI coding benchmarks have told enterprise buyers a comforting but misleading story: the top models are all roughly the same. OpenAI's GPT-5 family, Anthropic's Claude Opus, and Google's Gemini Pro have clustered within a narrow band on Scale AI's SWE-Bench Pro leaderboard, making it nearly impossible for engineering leaders to determine which agent will actually perform best inside their codebases.
Security

The attack dominating financial services doesn't steal passwords. It resets MFA and steals the token.

Valid certificates, stolen accounts: how attackers broke npm's last trust signal

Partner Content
Americans can’t spot a deepfake, and that’s a business crisis, not just a consumer problem
Presented by Veriff
Newsroom
Daversa Appoints Maggie Fair to Managing Director
FDA Grants Coredio Breakthrough Designation for AI Platform Bringing Advanced Heart Failure Assessment Beyond the Hospital
Qevlar Introduces AI Agents Unifying SOC and Vulnerability Operations as Exploitation Windows Collapse
TDK Ventures Invests in C2i Semiconductors to Revolutionize AI Data Center Power Delivery
Video
Technology
Anthropic's Claude Opus 4.8 is here with 3X cheaper fast mode and near-Mythos level alignment

Are designers the new SWEs? Figma Make's new two-way GitHub integration turns designs into live, production code — with built-in governance

MiniMax teases upcoming M3 model with new sparse attention mechanism and 15.6X long-context response speed boost

More

Why prompt debt, retrieval debt, and evaluation debt are quietly reshaping enterprise AI risk
Over the past two decades, technical debt meant outdated architecture, messy code, and poorly maintained documentation. That definition is no longer sufficient in the AI era, where failure modes are more subtle and often non-linear. AI systems are introducing new layers of technical debt that live across prompts, models, and data dependencies — making these layers less visible, harder to measure, and often more dangerous than traditional debt.

D&B's database of 642 million businesses was built for humans, not AI agents. So they rebuilt it.

Alibaba's proprietary Qwen3.7-Max can run for 35 hours autonomously and supports external harnesses like Anthropic's Claude Code

A 0.12% parameter add-on gives AI agents the working memory RAG can't

Enterprise AI agents keep failing because they forget what they learned

MFA verifies who logged in. It has no idea what they do next.

Kore.ai launches Artemis AI agent platform, takes on Salesforce and ServiceNow
The platform arrives at a moment when every major technology vendor — from Microsoft and Salesforce to Google and ServiceNow — is racing to become the default infrastructure for enterprise AI agents. Kore.ai's answer to that crowded field is a bet on neutrality, a proprietary intermediary language for defining agents, and a philosophy that AI, not human developers, should do most of the heavy lifting.

Partner Content
AI didn’t kill brand consistency — it made it mission-critical
Presented by Design.com




