<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>DataDecisionMakers | VentureBeat</title>
        <link>https://venturebeat.com/category/datadecisionmakers/feed/</link>
        <description>Transformative tech coverage that matters</description>
        <lastBuildDate>Sat, 04 Apr 2026 07:05:26 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <copyright>Copyright 2026, VentureBeat</copyright>
        <item>
            <title><![CDATA[When product managers ship code: AI just broke the software org chart]]></title>
            <link>https://venturebeat.com/technology/when-product-managers-ship-code-ai-just-broke-the-software-org-chart</link>
            <guid isPermaLink="false">15UabG5u8y17NzHQ3Plcns</guid>
            <pubDate>Sun, 29 Mar 2026 16:00:00 GMT</pubDate>
            <description><![CDATA[<p>Last week, one of our product managers (PMs) built and shipped a feature. Not spec&#x27;d it. Not filed a ticket for it. Built it, tested it, and shipped it to production. In a day.</p><p>A few days earlier, our designer noticed that the visual appearance of our IDE plugins had drifted from the design system. In the old world, that meant screenshots, a JIRA ticket, a conversation to explain the intent, and a sprint slot. Instead, he opened an agent, adjusted the layout himself, experimented, iterated, and tuned in real time, then pushed the fix. The person with the strongest design intuition fixed the design directly. No translation layer required.</p><p>None of this is new in theory. Vibe coding opened the gates of software creation to millions. That was aspiration. When I <a href="https://venturebeat.com/orchestration/when-ai-turns-software-development-inside-out-170-throughput-at-80-headcount">shared the data</a> on how our engineers doubled throughput, shifted from coding to validation, brought design upfront for rapid experimentation, it was still an engineering story. What changed is that the theory became practice. Here&#x27;s how it actually played out.</p><h2>The bottleneck moved</h2><p>When we went AI-first in 2025, implementation cost collapsed. Agents took over scaffolding, tests, and the repetitive glue code that used to eat half the sprint. Cycle times dropped from weeks to days, from days to hours. Engineers started thinking less in files and functions and more in architecture, constraints, and execution plans.</p><p>But once engineering capacity stopped being the bottleneck, we noticed something: Decision velocity was. All the coordination mechanisms we&#x27;d built to protect engineering time (specs, tickets, handoffs, backlog grooming) were now the slowest part of the system. 
We were optimizing for a constraint that no longer existed.</p><h2>What happens when building is cheaper than coordination</h2><p>We started asking a different question: What would it look like if the people closest to the intent could ship the software directly?</p><p>PMs already think in specifications. Designers already define structure, layout, and behavior. They don&#x27;t think in syntax. They think in outcomes. When the cost of turning intent into working software dropped far enough, these roles didn&#x27;t need to &quot;learn to code.&quot; The cost of implementation simply fell to their level.</p><p>I asked one of our PMs, Dmitry, to describe what changed from his perspective. He told me: &quot;While agents are generating tasks in Zenflow, there&#x27;s a few minutes of idle time. Just dead air. I wanted to build a small game, something to interact with while you wait.&quot;</p><p>If you&#x27;ve ever run a product team, you know this kind of idea. It doesn&#x27;t move a KPI. It&#x27;s impossible to justify in a prioritization meeting. It gets deferred forever. But it adds personality. It makes the product feel like someone cared about the small details. These are exactly the things that get optimized out of every backlog grooming session, and exactly the things users remember.</p><p>He built it in a day. </p><p>In the past, that idea would have died in a prioritization spreadsheet. Not because it was bad, but because the cost of implementation made it irrational to pursue. When that cost drops to near zero, the calculus changes completely.</p><h2>Shipping became cheaper than explaining</h2><p>As more people started building directly, entire layers of process quietly vanished. Fewer tickets. Fewer handoffs. Fewer &quot;can you explain what you mean by...&quot; conversations. Fewer lost-in-translation moments.</p><p>For a meaningful class of tasks, it became faster to just build the thing than to describe what you wanted and wait for someone else to build it. 
Think about that for a second. Every modern software organization is structured around the assumption that implementation is the expensive part. When that assumption breaks, the org has to change with it.</p><p>Our designer fixing the plugin UI is a perfect example. The old workflow (screenshot the problem, file a ticket, explain the gap between intent and implementation, wait for a sprint slot, review the result, request adjustments) existed entirely to protect engineering bandwidth. When the person with the design intuition can act on it directly, that whole stack disappears. Not because we eliminated process for its own sake, but because the process was solving a problem that no longer existed.</p><h2>The compounding effect</h2><p>Here&#x27;s what surprised me most: It compounds.</p><p>When PMs build their own ideas, their specifications get sharper, because they now understand what the agent needs to execute well. Sharper specs produce better agent output. Better output means fewer iteration cycles. We&#x27;re seeing velocity compound week over week, not just because the models improved, but because the people using them got closer to the work.</p><p>Dmitry put it well: The feedback loop between intent and outcome went from weeks to minutes. When you can see the result of your specification immediately, you learn what precision the system needs, and you start providing it instinctively.</p><p>There&#x27;s a second-order effect that&#x27;s harder to measure but impossible to miss: Ownership. People stop waiting. They stop filing tickets for things they could just fix. &quot;Builder&quot; stopped being a job title. It became the default behavior.</p><h2>What this means for the industry</h2><p>A lot of the &quot;everyone can code&quot; narrative last year was theoretical, or focused on solo founders and tiny teams. What we experienced is different. 
We have ~50 engineers working in a complex brownfield codebase: Multiple surfaces and programming languages, enterprise integrations, the full weight of a real production system. </p><p>I don&#x27;t think we&#x27;re unique. I think we&#x27;re early. And with each new generation of models, the gap between who can build and who can&#x27;t is closing faster than most organizations realize. Every software company is about to discover that their PMs and designers are sitting on unrealized building capacity, blocked not by skill, but by the cost of implementation. As that cost continues to fall, the organizational implications are profound.</p><p>We started with an intent to accelerate software engineering. What we&#x27;re becoming is something different: A company where everyone ships.</p><p><i>Andrew Filev is founder and CEO of Zencoder.</i></p>]]></description>
            <category>Technology</category>
            <category>Orchestration</category>
            <category>DataDecisionMakers</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/H7LvPAXDoKpIKf67inh3M/627c264db2858d68eb459af3b0228f2f/Coders.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[When AI turns software development inside-out: 170% throughput at 80% headcount]]></title>
            <link>https://venturebeat.com/orchestration/when-ai-turns-software-development-inside-out-170-throughput-at-80-headcount</link>
            <guid isPermaLink="false">6myGPN3x62uQAc3CCx4bRR</guid>
            <pubDate>Sat, 28 Mar 2026 16:00:00 GMT</pubDate>
<description><![CDATA[<p>Many people have tried AI tools and walked away unimpressed. I get it — many demos promise magic, but in practice, the results can feel underwhelming.</p><p>That’s why I want to write this not as a futurist prediction, but from lived experience. Over the past six months, I turned my engineering organization AI-first. I’ve shared before about the system behind that transformation — how we built the workflows, the metrics, and the guardrails. Today, I want to zoom out from the mechanics and talk about what I’ve <i>learned</i> from that experience — about where our profession is heading when software development itself turns inside out. </p><p>Before I do, a couple of numbers to illustrate the scale of change. Subjectively, it feels that we are moving twice as fast. Objectively, here’s how the throughput evolved. Our total engineering team headcount floated from 36 at the beginning of the year to 30. So you get ~170% throughput on ~80% headcount, which matches the subjective ~2x. </p><p>Zooming in, I picked a couple of our senior engineers who started the year in a more traditional software engineering process and ended it in the AI-first way. [The dips correspond to vacations and off-sites].</p><p>Note that our PRs are tied to JIRA tickets, and the average scope of those tickets didn’t change much through the year, so it’s as good a proxy as the data can give us. </p><p>Qualitatively, looking at the business value, I actually see even higher uplift. One reason is that, as we started last year, our quality assurance (QA) team couldn’t keep up with our engineers&#x27; velocity. As the company leader, I wasn’t happy with the quality of some of our early releases. 
As we progressed through the year and tooled our <a href="https://venturebeat.com/orchestration/testing-autonomous-agents-or-how-i-learned-to-stop-worrying-and-embrace">AI workflows</a> to include writing unit and end-to-end tests, our coverage improved, the number of bugs dropped, users became fans, and the business value of engineering work multiplied.</p><h3><b>From big design to rapid experimentation</b></h3><p>Before AI, we spent weeks perfecting user flows before writing code. It made sense when change was expensive. Agile helped, but even then, testing multiple product ideas was too costly.</p><p>Once we went AI-first, that trade-off disappeared. The cost of <i>experimentation</i> collapsed. An idea could go from whiteboard to a working prototype in a day: From idea to AI-generated product requirements document (PRD), to AI-generated tech spec, to AI-assisted implementation. </p><p>It manifested itself in some amazing transformations. Our website—central to our acquisition and inbound demand—is now a product-scale system with hundreds of custom components, all designed, developed, and maintained directly in code by our <a href="https://venturebeat.com/technology/you-thought-the-generalist-was-dead-in-the-vibe-work-era-theyre-more">creative director</a>. </p><p>Now, instead of validating with slides or static prototypes, we validate with working products. 
We test ideas live, learn faster, and release major updates every other month, a pace I couldn’t imagine three years ago.</p><p>For example, Zen CLI was first written in Kotlin, but then we changed our mind and moved it to TypeScript with no release velocity lost.</p><p>Instead of mocking the features, our UX designers and product managers vibe code them. And when the release-time crunch hit everyone, they jumped into action and fixed dozens of small details with production-ready PRs to help us ship a great product. This included an overnight UI layout change.</p><h3><b>From coding to validation</b></h3><p>The next shift came where I least expected it: Validation.</p><p>In a traditional org, most people write code and a smaller group tests it. But when AI generates much of the implementation, the leverage point moves. The real value lies in defining what “good” looks like — in making correctness explicit.</p><p>We support 70-plus programming languages and countless integrations. Our QA engineers have evolved into system architects. They build AI agents that generate and maintain acceptance tests directly from requirements. And those agents are embedded into the codified AI workflows that allow us to achieve predictable engineering outcomes by using a system.</p><p>This is what “shift left” really means. Validation isn’t a stand-alone function; it’s an integral part of the production process. If the agent can’t validate its work, it can’t be trusted to generate production code. For QA professionals, this is a moment of reinvention, where, with the right upskilling, their work becomes a critical enabler and accelerator of <a href="https://venturebeat.com/technology/fixing-ai-failure-three-changes-enterprises-should-make-now">AI adoption</a>. 
</p><p>Product managers, tech leads, and data engineers now share this responsibility as well, because defining correctness has become a cross-functional skill, not a role confined to QA.</p><h3><b>From diamond to double funnel</b></h3><p>For decades, software development followed a “diamond” shape: A small product team handed off to a large engineering team, then narrowed again through QA.</p><p>Today, that geometry is flipping. Humans engage more deeply at the beginning — defining intent, exploring options — and again at the end, validating outcomes. The middle, where AI executes, is faster and narrower.</p><p>It’s not just a new workflow; it’s a structural inversion.</p><p>The model looks less like an assembly line and more like a control tower. Humans set direction and constraints, AI handles execution at speed, and people step back in to validate outcomes before decisions land in production.</p><h3><b>Engineering at a higher level of abstraction</b></h3><p>Every major leap in software raised our level of abstraction — from punch cards to high-level programming languages, from hardware to cloud. AI is the next step. Our engineers now work at a meta-layer: Orchestrating AI workflows, tuning agentic instructions and skills, and defining guardrails. The machines build; the humans decide <i>what</i> and <i>why</i>.</p><p>Teams now routinely decide when AI output is safe to merge without review, how tightly to bound agent autonomy in production systems, and what signals actually indicate correctness at scale, decisions that simply didn’t exist before.</p><p>And that’s the paradox of AI-first engineering — it feels less like coding, and more like thinking. Welcome to the new era of human intelligence, powered by AI.</p><p><i>Andrew Filev is founder and CEO of Zencoder</i> </p>]]></description>
            <category>Orchestration</category>
            <category>Technology</category>
            <category>DataDecisionMakers</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/5D3kaGN3o7ByuKcDmTtiCD/ba86d199b3002a0cee9b7e5b7caea173/Inside-out.webp?w=300&amp;q=30" length="0" type="image/webp"/>
        </item>
        <item>
            <title><![CDATA[You thought the generalist was dead — in the 'vibe work' era, they're more important than ever]]></title>
            <link>https://venturebeat.com/technology/you-thought-the-generalist-was-dead-in-the-vibe-work-era-theyre-more</link>
            <guid isPermaLink="false">2FcQNYjZMUdGCKDvKD4aCX</guid>
            <pubDate>Mon, 23 Mar 2026 19:00:00 GMT</pubDate>
<description><![CDATA[<p>Not long ago, the idea of being a “generalist” in the workplace had a mixed reputation. The stereotype was the “jack of all trades” who could dabble in many disciplines but was a “master of none.” And for years, that was more or less true. </p><p>Most people simply didn’t have access to the expertise required to do highly cross-functional work. If you needed a new graphic, you waited for a designer. If you needed to change a contract, you waited for legal. In smaller organizations and startups, this waiting game was typically replaced with inaction or improvisation — often with questionable results.</p><p>AI is changing this faster than any technology shift I’ve seen. It’s allowing people to succeed at tasks beyond their normal area of expertise.</p><p><a href="https://www.anthropic.com/research/how-ai-is-transforming-work-at-anthropic">Anthropic found</a> that AI is “enabling engineers to become more full-stack in their work,” meaning they’re able to make competent decisions across a much wider range of interconnected technologies. A direct consequence of this is that tasks that would have been left aside due to lack of time or expertise are now being accomplished (27% of AI-assisted work per Anthropic&#x27;s study). 

This shift closely mirrors the effects of past <a href="https://venturebeat.com/technology/rethinking-aeo-when-software-agents-navigate-the-web-on-behalf-of-users">revolutionary technologies</a>. The invention of the automobile or the computer did not bring us a wealth of leisure time — it mainly led us to start doing work that could not be done before.</p><p>With AI as a guide, anyone can now expand their skillsets and augment their expertise to accomplish more. This fundamentally changes what people can do, who can do it, how teams operate, <i>and</i> what leaders should expect. </p><p>Well, not so fast. </p><p>The AI advances have been incredible, and even if 2025 didn’t fully deliver on its promise of bringing <a href="https://blog.samaltman.com/reflections"><u>AI agents to the workforce</u></a>, there’s no reason to doubt it’s well on its way. But for now, it’s not perfect. If to err is human, to trust AI not to err is foolish.</p><p>One of the biggest challenges of working with AI is identifying hallucinations. The term was coined, I assume, not as a cute way to refer to factual errors, but as quite an apt way of describing the conviction that AI exhibits in its erroneous answers. We humans have a clear bias toward confident people, which probably explains the number of <a href="https://www.theguardian.com/us-news/2025/may/31/utah-lawyer-chatgpt-ai-court-brief"><u>smart people getting burned</u></a> after taking ChatGPT at face value. </p><p>And if experts can get fooled by an overconfident AI, how can generalists hope to harness the power of AI without making the same mistake? </p><h2><b>Citizen guardrails give way to vibe freedom</b></h2><p>It’s tempting to compare today’s <a href="https://venturebeat.com/orchestration/vibe-coding-with-overeager-ai-lessons-learned-from-treating-google-ai-studio">AI vibe coding wave</a> to the rise of low- and no-code tools. No-code tools gave users freedom to build custom software tailored to their needs. 
However, the comparison doesn’t quite hold. The so-called “citizen developers” could only operate inside the boundaries the tool allowed. These tight constraints were limiting, but they had the benefit of saving the users from themselves — preventing anything catastrophic.</p><p>AI removes those boundaries almost entirely, and with great freedom come responsibilities that most people aren’t quite prepared for. </p><p>The first stage of “vibe freedom” is one of unbridled optimism encouraged by a sycophantic AI. “You’re absolutely correct!” The dreaded report that would have taken all night looks better than anything you could have done yourself and only took a few minutes.

The next stage comes almost by surprise — there’s something that’s not quite right. You start doubting the accuracy of the work — you review and then wonder if it wouldn’t have been quicker to just do it yourself in the first place.</p><p>Then comes bargaining and acceptance. You argue with the AI, you’re led down confusing paths, but slowly you start developing an understanding — a mental model of the AI mind. You learn to recognize the confidently incorrect, you learn to push back and cross-check, you learn to trust and verify. </p><h2><b>The generalist becomes the trust layer</b></h2><p>This is a skill that can be learned, and it can only be learned on the job, through regular practice. This doesn’t require deep specialization, but it does require awareness. Curiosity becomes essential. So does the willingness to learn quickly, think critically, spot inconsistencies, and rely on judgment rather than <a href="https://venturebeat.com/security/when-ai-lies-the-rise-of-alignment-faking-in-autonomous-systems">treating AI as infallible</a>.</p><p>That’s the new job of the generalist: Not to be an expert in everything, but to understand the AI mind enough to catch when something is off, and to defer to a true specialist when the stakes are high. </p><p>The generalist becomes the human trust layer sitting between the AI’s output and the organization’s standards. They decide what passes and what gets a second opinion.</p><p>That said, this only works if the generalist clears a minimum bar of fluency. There’s a big difference between “broadly informed” and “confidently unaware.” AI makes that gap easier to miss.</p><h2><b>Impact on teams and hiring</b></h2><p>Clearly, specialists will not be replaced by AI anytime soon. Their work remains critical. It will evolve to become more strategic.</p><p>What AI changes is everything around the edges. 
Think of roles that felt important but were hard to fill, tasks that sat in limbo because no expert was available, and backlogs created by waiting for highly skilled people to review simple work. Now, a generalist can get much farther on their own, and specialists can focus on the hardest problems. </p><p>We’re already starting to see an impact in the hiring landscape. Companies are looking to bring on individuals who are comfortable navigating AI: People who embrace it and use it to take on projects outside of their comfort zone.</p><p>Performance expectations will shift too. Many leaders are already looking less at productivity alone, and more at how effectively someone uses AI. We see token usage not as a measure of cost, but as an indicator of AI adoption, and, perhaps optimistically, as a proxy for productivity.</p><h2><b>Making vibe work viable</b></h2><ol><li><p><b>Use AI to enhance work, not to wing it: </b>You will get burned letting AI loose. It requires guidance and oversight.</p></li><li><p><b>Learn when to trust and when to verify: </b>Build an understanding of the AI mind so you can exercise good judgment on the work produced. When in doubt or when the stakes are high, defer to specialists.</p></li><li><p><b>Set clear organizational standards: </b>AI thrives on context, and so do humans. Invest in documentation of processes, procedures, and best practices.</p></li><li><p><b>Keep humans in the loop: </b>AI shouldn’t remove oversight. It should make oversight easier.</p></li></ol><p>Without these factors, AI work stays in the “vibe” stage. With them, it becomes something the business can actually rely on.</p><h2><b>Return of the generalist</b></h2><p>The emerging, AI-empowered generalist is defined by curiosity, adaptability, and the ability to evaluate the work AI produces. They can span multiple functions, not because they’re experts in each one, but because AI gives them access to specialist-level expertise. 
Most importantly, this new generation of generalists knows when and how to apply their human judgment and critical thinking. That’s the real determining factor for turning vibes into something reliable, sustainable, and viable in the long run.</p><p><i>Cedric Savarese is founder and CEO of FormAssembly. </i></p>]]></description>
            <category>Technology</category>
            <category>DataDecisionMakers</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/3xrlIxdF4stMunUA8DvaqF/4e2d7737e1d037c7a8d8ae230481ceab/AI_generalist.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Testing autonomous agents (Or: how I learned to stop worrying and embrace chaos)]]></title>
            <link>https://venturebeat.com/orchestration/testing-autonomous-agents-or-how-i-learned-to-stop-worrying-and-embrace</link>
            <guid isPermaLink="false">4rqG5BU0wTiPRri884nAhi</guid>
            <pubDate>Mon, 23 Mar 2026 16:00:00 GMT</pubDate>
<description><![CDATA[<p>Look, we&#x27;ve spent the last 18 months building production AI systems, and we&#x27;ll tell you what keeps us up at night — and it&#x27;s not whether the model can answer questions. That&#x27;s table stakes now. What haunts us is the mental image of an agent autonomously approving a six-figure vendor contract at 2 a.m. because someone typo&#x27;d a config file.</p><p>We&#x27;ve moved past the era of &quot;ChatGPT wrappers&quot; (thank God), but the industry still treats autonomous agents like they&#x27;re just chatbots with API access. They&#x27;re not. When you give an AI system the <a href="https://venturebeat.com/technology/rethinking-aeo-when-software-agents-navigate-the-web-on-behalf-of-users">ability to take actions</a> without human confirmation, you&#x27;re crossing a fundamental threshold. You&#x27;re not building a helpful assistant anymore — you&#x27;re building something closer to an employee. And that changes everything about how we need to engineer these systems.</p><h2><b>The autonomy problem nobody talks about</b></h2><p>Here&#x27;s what&#x27;s wild: We&#x27;ve gotten really good at making models that <i>sound</i> confident. But confidence and reliability aren&#x27;t the same thing, and the gap between them is where production systems go to die.</p><p>We learned this the hard way during a pilot program where we let an AI agent manage calendar scheduling across executive teams. Seems simple, right? The agent could check availability, send invites, handle conflicts. Except, one Monday morning, it rescheduled a board meeting because it interpreted &quot;let&#x27;s push this if we need to&quot; in a Slack message as an actual directive. The model wasn&#x27;t wrong in its interpretation — it was plausible. 
But plausible isn&#x27;t good enough when you&#x27;re dealing with autonomy.</p><p>That incident taught us something crucial: The challenge isn&#x27;t <a href="https://venturebeat.com/technology/fixing-ai-failure-three-changes-enterprises-should-make-now">building agents that work</a> most of the time. It&#x27;s building agents that fail gracefully, know their limitations, and have the circuit breakers to prevent catastrophic mistakes.</p><h2><b>What reliability actually means for autonomous systems</b></h2><p><i>Layered reliability architecture</i></p><p>When we talk about reliability in traditional software engineering, we&#x27;ve got decades of patterns: Redundancy, retries, idempotency, graceful degradation. But AI agents break a lot of our assumptions.</p><p>Traditional software fails in predictable ways. You can write unit tests. You can trace execution paths. With <a href="https://venturebeat.com/technology/dynamic-ui-for-dynamic-ai-inside-the-emerging-a2ui-model">AI agents</a>, you&#x27;re dealing with probabilistic systems making judgment calls. A bug isn&#x27;t just a logic error—it&#x27;s the model hallucinating a plausible-sounding but completely fabricated API endpoint, or misinterpreting context in a way that technically parses but completely misses the human intent.</p><p>So what does reliability look like here? In our experience, it&#x27;s a layered approach.</p><p><b>Layer 1: Model selection and prompt engineering</b></p><p>This is foundational but insufficient. Yes, use the best model you can afford. Yes, craft your prompts carefully with examples and constraints. But don&#x27;t fool yourself into thinking that a great prompt is enough. I&#x27;ve seen too many teams ship &quot;GPT-4 with a really good system prompt&quot; and call it enterprise-ready.</p><p><b>Layer 2: Deterministic guardrails</b></p><p>Before the model does anything irreversible, run it through hard checks. Is it trying to access a resource it shouldn&#x27;t? 
Is the action within acceptable parameters? We&#x27;re talking old-school validation logic — regex, schema validation, allowlists. It&#x27;s not sexy, but it&#x27;s effective.</p><p>One pattern that&#x27;s worked well for us: Maintain a formal action schema. Every action an agent can take has a defined structure, required fields, and validation rules. The agent proposes actions in this schema, and we validate before execution. If validation fails, we don&#x27;t just block it — we feed the validation errors back to the agent and let it try again with context about what went wrong.</p><p><b>Layer 3: Confidence and uncertainty quantification</b></p><p>Here&#x27;s where it gets interesting. We need agents that know what they don&#x27;t know. We&#x27;ve been experimenting with agents that can explicitly reason about their confidence before taking actions. Not just a probability score, but actual articulated uncertainty: &quot;I&#x27;m interpreting this email as a request to delay the project, but the phrasing is ambiguous and could also mean...&quot;</p><p>This doesn&#x27;t prevent all mistakes, but it creates natural breakpoints where you can inject human oversight. High-confidence actions go through automatically. Medium-confidence actions get flagged for review. Low-confidence actions get blocked with an explanation.</p><p><b>Layer 4: Observability and auditability</b></p><p><i>Action Validation Pipeline</i> </p><p>If you can&#x27;t debug it, you can&#x27;t trust it. Every decision the agent makes needs to be loggable, traceable, and explainable. Not just &quot;what action did it take&quot; but &quot;what was it thinking, what data did it consider, what was the reasoning chain?&quot;</p><p>We&#x27;ve built a custom logging system that captures the full large language model (LLM) interaction — the prompt, the response, the context window, even the model temperature settings. 
It&#x27;s verbose as hell, but when something goes wrong (and it will), you need to be able to reconstruct exactly what happened. Plus, this becomes your dataset for fine-tuning and improvement.</p><h2><b>Guardrails: The art of saying no</b></h2><p>Let&#x27;s talk about guardrails, because this is where engineering discipline really matters. A lot of teams approach guardrails as an afterthought — &quot;we&#x27;ll add some safety checks if we need them.&quot; That&#x27;s backwards. Guardrails should be your starting point.</p><p>We think of guardrails in three categories.</p><p><b>Permission boundaries</b></p><p>What is the agent physically allowed to do? This is your blast radius control. Even if the agent hallucinates the worst possible action, what&#x27;s the maximum damage it can cause?</p><p>We use a principle called &quot;graduated autonomy.&quot; New agents start with read-only access. As they prove reliable, they graduate to low-risk writes (creating calendar events, sending internal messages). High-risk actions (financial transactions, external communications, data deletion) either require explicit human approval or are simply off-limits.</p><p>One technique that&#x27;s worked well: Action cost budgets. Each agent has a daily &quot;budget&quot; denominated in some unit of risk or cost. Reading a database record costs 1 unit. Sending an email costs 10. Initiating a vendor payment costs 1,000. The agent can operate autonomously until it exhausts its budget; then, it needs human intervention. This creates a natural throttle on potentially problematic behavior.</p><p><i>Graduated Autonomy and Action Cost Budget</i> </p><p><b>Semantic boundaries</b></p><p>What should the agent understand as in-scope vs out-of-scope? This is trickier because it&#x27;s conceptual, not just technical.</p><p>I&#x27;ve found that explicit domain definitions help a lot. 
Our customer service agent has a clear mandate: handle product questions, process returns, escalate complaints. Anything outside that domain — someone asking for investment advice, technical support for third-party products, personal favors — gets a polite deflection and escalation.</p><p>The challenge is making these boundaries robust to prompt injection and jailbreaking attempts. Users will try to convince the agent to help with out-of-scope requests. Other parts of the system might inadvertently pass instructions that override the agent&#x27;s boundaries. You need multiple layers of defense here.</p><p><b>Operational boundaries</b></p><p>How much can the agent do, and how fast? This is your rate limiting and resource control.</p><p>We&#x27;ve implemented hard limits on everything: API calls per minute, maximum tokens per interaction, maximum cost per day, maximum number of retries before human escalation. These might seem like artificial constraints, but they&#x27;re essential for preventing runaway behavior.</p><p>We once saw an agent get stuck in a loop trying to resolve a scheduling conflict. It kept proposing times, getting rejections, and trying again. Without rate limits, it sent 300 calendar invites in an hour. With proper operational boundaries, it would&#x27;ve hit a threshold and escalated to a human after attempt number 5.</p><h2>Agents need their own style of testing</h2><p>Traditional software testing doesn&#x27;t cut it for autonomous agents. You can&#x27;t just write test cases that cover all the edge cases, because with LLMs, everything is an edge case.</p><p>What&#x27;s worked for us:</p><p><b>Simulation environments</b></p><p>Build a sandbox that mirrors production but with fake data and mock services. Let the agent run wild. See what breaks. We do this continuously — every code change goes through 100 simulated scenarios before it touches production.</p><p>The key is making scenarios realistic. Don&#x27;t just test happy paths. 
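</p><p>A sweep like this can be driven by a small harness; the scenario builders below are hypothetical stand-ins for a real sandbox:</p>

```python
# Hypothetical scenario builders; each returns one synthetic sandbox input.
SCENARIOS = {
    "happy_path": lambda: {"message": "I'd like to return my order", "system_up": True},
    "ambiguous_request": lambda: {"message": "about that thing from before...", "system_up": True},
    "system_outage": lambda: {"message": "where is my refund?", "system_up": False},
}

def run_simulation(agent, runs_per_scenario=100):
    """Run the agent against every scenario and collect failures for triage."""
    failures = []
    for name, build in SCENARIOS.items():
        for _ in range(runs_per_scenario):
            case = build()
            try:
                decision = agent(case)
                if decision is None:  # the agent must always decide something
                    failures.append((name, case, "no decision"))
            except Exception as exc:  # a crash is a failure, not test noise
                failures.append((name, case, repr(exc)))
    return failures
```

<p>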
Simulate angry customers, ambiguous requests, contradictory information, system outages. Throw in some adversarial examples. If your agent can&#x27;t handle a test environment where things go wrong, it definitely can&#x27;t handle production.</p><p><b>Red teaming</b></p><p>Get creative people to try to break your agent. Not just security researchers, but domain experts who understand the business logic. Some of our best improvements came from sales team members who tried to &quot;trick&quot; the agent into doing things it shouldn&#x27;t.</p><p><b>Shadow mode</b></p><p>Before you go live, run the agent in shadow mode alongside humans. The agent makes decisions, but humans actually execute the actions. You log both the agent&#x27;s choices and the human&#x27;s choices, and you analyze the delta.</p><p>This is painful and slow, but it&#x27;s worth it. You&#x27;ll find all kinds of subtle misalignments you&#x27;d never catch in testing. Maybe the agent technically gets the right answer, but with phrasing that violates company tone guidelines. Maybe it makes legally correct but ethically questionable decisions. Shadow mode surfaces these issues before they become real problems.</p><h2><b>The human-in-the-loop pattern</b></h2><p><i>Three Human-in-the-Loop Patterns</i> </p><p>Despite all the automation, humans remain essential. The question is: Where in the loop?</p><p>We&#x27;re increasingly convinced that &quot;human-in-the-loop&quot; is actually several distinct patterns:</p><p><b>Human-on-the-loop</b>: The agent operates autonomously, but humans monitor dashboards and can intervene. This is your steady-state for well-understood, low-risk operations.</p><p><b>Human-in-the-loop</b>: The agent proposes actions, humans approve them. This is your training wheels mode while the agent proves itself, and your permanent mode for high-risk operations.</p><p><b>Human-with-the-loop</b>: Agent and human collaborate in real-time, each handling the parts they&#x27;re better at. 
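</p><p>All three patterns can share a single dispatch point; a minimal sketch, with our own names for the modes:</p>

```python
from enum import Enum

class Oversight(Enum):
    ON_THE_LOOP = "on"      # autonomous; humans watch dashboards
    IN_THE_LOOP = "in"      # every action waits for approval
    WITH_THE_LOOP = "with"  # human and agent split the work interactively

def dispatch(action, mode, execute, request_approval, hand_to_human):
    """Route a proposed action according to the current oversight mode."""
    if mode is Oversight.ON_THE_LOOP:
        return execute(action)          # act now, log for later review
    if mode is Oversight.IN_THE_LOOP:
        if request_approval(action):    # block until a human says yes
            return execute(action)
        return None                     # rejected: do nothing
    return hand_to_human(action)        # collaborative: human drives this step
```

<p>Routing every mode through one function keeps interfaces and logging identical as an agent graduates from supervised to autonomous operation.</p><p>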
The agent does the grunt work, the human does the judgment calls.</p><p>The trick is making these transitions smooth. An agent shouldn&#x27;t feel like a completely different system when you move from autonomous to supervised mode. Interfaces, logging, and escalation paths should all be consistent.</p><h2><b>Failure modes and recovery</b></h2><p>Let&#x27;s be honest: Your agent will fail. The question is whether it fails gracefully or catastrophically.</p><p>We classify failures into three categories:</p><p><b>Recoverable errors</b>: The agent tries to do something, it doesn&#x27;t work, the agent realizes it didn&#x27;t work and tries something else. This is fine. This is how complex systems operate. As long as the agent isn&#x27;t making things worse, let it retry with exponential backoff.</p><p><b>Detectable failures</b>: The agent does something wrong, but monitoring systems catch it before significant damage occurs. This is where your guardrails and observability pay off. The agent gets rolled back, humans investigate, you patch the issue.</p><p><b>Undetectable failures</b>: The agent does something wrong, and nobody notices until much later. These are the scary ones. Maybe it&#x27;s been misinterpreting customer requests for weeks. Maybe it&#x27;s been making subtly incorrect data entries. These accumulate into systemic issues.</p><p>The defense against undetectable failures is regular auditing. We randomly sample agent actions and have humans review them. Not just pass/fail, but detailed analysis. Is the agent showing any drift in behavior? Are there patterns in its mistakes? Is it developing any concerning tendencies?</p><h2><b>The cost-performance tradeoff</b></h2><p>Here&#x27;s something nobody talks about enough: Reliability is expensive.</p><p>Every guardrail adds latency. Every validation step costs compute. Multiple model calls for confidence checking multiply your API costs. 
Comprehensive logging generates massive data volumes.</p><p>You have to be strategic about where you invest. Not every agent needs the same level of reliability. A marketing copy generator can be looser than a financial transaction processor. A scheduling assistant can retry more liberally than a code deployment system.</p><p>We use a risk-based approach. High-risk agents get all the safeguards, multiple validation layers, extensive monitoring. Lower-risk agents get lighter-weight protections. The key is being explicit about these trade-offs and documenting why each agent has the guardrails it does.</p><h2><b>Organizational challenges</b></h2><p>We&#x27;d be remiss if we didn&#x27;t mention that the hardest parts aren&#x27;t technical — they&#x27;re organizational.</p><p>Who owns the agent when it makes a mistake? Is it the engineering team that built it? The business unit that deployed it? The person who was supposed to be supervising it?</p><p>How do you handle edge cases where the agent&#x27;s logic is technically correct but contextually inappropriate? If the agent follows its rules but violates an unwritten norm, who&#x27;s at fault?</p><p>What&#x27;s your incident response process when an agent goes rogue? Traditional runbooks assume human operators making mistakes. How do you adapt these for autonomous systems?</p><p>These questions don&#x27;t have universal answers, but they need to be addressed before you deploy. Clear ownership, documented escalation paths, and well-defined success metrics are just as important as the technical architecture.</p><h2><b>Where we go from here</b></h2><p>The industry is still figuring this out. There&#x27;s no established playbook for building reliable autonomous agents. We&#x27;re all learning in production, and that&#x27;s both exciting and terrifying.</p><p>What we know for sure: The teams that succeed will be the ones who treat this as an engineering discipline, not just an AI problem. 
You need traditional software engineering rigor — testing, monitoring, incident response — combined with new techniques specific to probabilistic systems.</p><p>You need to be paranoid but not paralyzed. Yes, autonomous agents can fail in spectacular ways. But with proper guardrails, they can also handle enormous workloads with superhuman consistency. The key is respecting the risks while embracing the possibilities.</p><p>We&#x27;ll leave you with this: Every time we deploy a new autonomous capability, we run a pre-mortem. We imagine it&#x27;s six months from now and the agent has caused a significant incident. What happened? What warning signs did we miss? What guardrails failed?</p><p>This exercise has saved us more times than we can count. It forces you to think through failure modes before they occur, to build defenses before you need them, to question assumptions before they bite you.</p><p>Because in the end, building enterprise-grade autonomous AI agents isn&#x27;t about making systems that work perfectly. It&#x27;s about making systems that fail safely, recover gracefully, and learn continuously.</p><p>And that&#x27;s the kind of engineering that actually matters.</p><p><i>Madhvesh Kumar is a principal engineer. Deepika Singh is a senior software engineer. </i></p><p><i>Views expressed are based on hands-on experience building and deploying autonomous agents, along with the occasional 3 AM incident response that makes you question your career choices.</i></p>]]></description>
            <category>Orchestration</category>
            <category>Infrastructure</category>
            <category>DataDecisionMakers</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/6CPQpqNcn1BPPtaMHhAWZI/aad349eac9a74697393eba5bdb329d94/Agent_chaos.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Rethinking AEO when software agents navigate the web on behalf of users]]></title>
            <link>https://venturebeat.com/technology/rethinking-aeo-when-software-agents-navigate-the-web-on-behalf-of-users</link>
            <guid isPermaLink="false">5D6VF9nQL7SKodEaaDpc19</guid>
            <pubDate>Mon, 16 Mar 2026 05:00:00 GMT</pubDate>
            <description><![CDATA[<p>For more than two decades, digital businesses have relied on a simple assumption: When someone interacts with a website, that activity reflects a human making a conscious choice. Clicks are treated as signals of interest. Time on page is assumed to indicate engagement. Movement through a funnel is interpreted as intent. Entire growth strategies, marketing budgets, and product decisions have been built on this premise.</p><p>Today, that assumption is quietly beginning to erode.</p><p>As <a href="https://venturebeat.com/technology/fixing-ai-failure-three-changes-enterprises-should-make-now">AI-powered tools</a> increasingly interact with the web on behalf of users, many of the signals organizations depend on are becoming harder to interpret. The data itself is still accurate — pages are viewed, buttons are clicked, actions are recorded — but the meaning behind those actions is changing. This shift isn’t theoretical or limited to edge cases. It’s already influencing how leaders read dashboards, forecast demand, and evaluate performance.</p><p>The challenge ahead isn’t stopping AI-driven interactions. It’s learning how to interpret digital behavior in a world where human and automated activity increasingly overlap.</p><h3><b>A changing assumption about web traffic</b></h3><p>For decades, the foundation of the internet rested on a quiet, human-centric model. Behind every scroll, form submission, or purchase flow was a person acting out of curiosity, need, or intent. Analytics platforms evolved to capture these behaviors. Security systems focused on separating “legitimate users” from clearly scripted automation. Even digital advertising economics assumed that engagement equaled human attention.</p><p>Over the last few years, that model has begun to shift. 
<a href="https://venturebeat.com/technology/dynamic-ui-for-dynamic-ai-inside-the-emerging-a2ui-model">Advances in large language models</a> (LLMs), browser automation, and AI-driven agents have made it possible for software systems to navigate the web in ways that feel fluid and context-aware. Pages are explored, options are compared, workflows are completed — often without obvious signs of automation.</p><p>This doesn’t mean the web is becoming less human. Instead, it’s becoming more hybrid. AI systems are increasingly embedded in everyday workflows, acting as research assistants, comparison tools, or task completers on behalf of people. As a result, the line between a human interacting directly with a site and software acting for them is becoming less distinct.</p><p>The challenge isn’t automation itself. It’s the ambiguity this overlap introduces into the signals businesses rely on.</p><h3><b>What do we mean by AI-generated traffic?</b></h3><p>When people hear “automated traffic,” they often think of the bots of the past — rigid scripts that followed predefined paths and broke the moment an interface changed. Those systems were repetitive, predictable, and relatively easy to identify.</p><p>AI-generated traffic is different.</p><p><a href="https://venturebeat.com/technology/the-limits-of-bubble-thinking-how-ai-breaks-every-historical-analogy">Modern AI agents</a> combine machine learning (ML) with automated browsing capabilities. They can interpret page layouts, adapt to interface changes, and complete multi-step tasks. In many cases, language models guide decision-making, allowing these systems to adjust behavior based on context rather than fixed rules. The result is interaction that appears far more natural than earlier automation.</p><p>Importantly, this kind of traffic is not inherently problematic. Automation has long played a productive role on the web, from search indexing and accessibility tools to testing frameworks and integrations. 
Newer AI agents simply extend this evolution — helping users summarize content, compare products, or gather information across multiple sites.</p><p>The issue is not intent, but interpretation. When AI agents interact with a site successfully on behalf of users, traditional engagement metrics may no longer reflect the same meaning they once did.</p><h3><b>Why AI-generated traffic is becoming harder to distinguish</b></h3><p>Historically, detecting automated activity relied on spotting technical irregularities. Systems flagged behavior that moved too fast, followed perfectly consistent paths, or lacked standard browser features. Automation exposed “tells” that made classification straightforward.</p><p><a href="https://venturebeat.com/technology/karpathys-march-of-nines-shows-why-90-ai-reliability-isnt-even-close-to">AI-driven systems</a> change this dynamic. They operate through standard browsers. They pause, scroll, and navigate non-linearly. They vary timing and interaction sequences. Because these agents are designed to interact with the web as it was built — for humans — their behavior increasingly blends into normal usage patterns.</p><p>As a result, the challenge shifts from identifying errors to interpreting behavior. The question becomes less about <i>whether</i> an interaction is automated and more about <i>how</i> it unfolds over time. Many of the signals that once separated humans from software are converging, making binary classification less effective.</p><h3><b>When engagement stops meaning what we think</b></h3><p>Consider a common e-commerce scenario.</p><p>A retail team notices a sustained increase in product views and “add to cart” actions. Historically, this would be a clear signal of growing demand, prompting increased ad spend or inventory expansion.</p><p>Now imagine that a portion of this activity is generated by AI agents performing price monitoring or product comparison on behalf of users. The interactions occurred. 
The metrics are accurate. But the underlying intent is different. The funnel no longer represents a straightforward path toward purchase.</p><p>Nothing is “wrong” with the data — but the meaning has shifted.</p><p>Similar patterns are appearing across industries:</p><ul><li><p>Digital publishers see spikes in article engagement without corresponding ad revenue.</p></li><li><p>SaaS companies observe heavy feature exploration with limited conversion.</p></li><li><p>Travel platforms record increased search activity that doesn’t translate into bookings.</p></li></ul><p>In each case, organizations risk optimizing for activity rather than value.</p><h3><b>Why this is a data and analytics problem</b></h3><p>At its core, AI-generated traffic introduces ambiguity into the assumptions underlying analytics and modeling. Many systems assume that observed behavior maps cleanly to human intent. When automated interactions are mixed into datasets, that assumption weakens.</p><p>Behavioral data may now include:</p><ul><li><p>Exploration without purchase intent</p></li><li><p>Research-driven navigation</p></li><li><p>Task completion without conversion</p></li><li><p>Repeated patterns driven by automation goals</p></li></ul><p>For analytics teams, this introduces noise into labels, weakens proxy metrics, and increases the risk of feedback loops. Models trained on mixed signals may learn to optimize for volume rather than outcomes that matter to the business.</p><p>This doesn’t invalidate analytics. It raises the bar for interpretation.</p><h3><b>Data integrity in a machine-to-machine world</b></h3><p>As behavioral data increasingly feeds ML systems that shape user experience, the composition of that data matters. If a growing share of interactions comes from automated agents, platforms may begin to optimize for machine navigation rather than human experience.</p><p>Over time, this can subtly reshape the web. 
Interfaces may become efficient for extraction and summarization while losing the irregularities that make them intuitive or engaging for people. Preserving a meaningful human signal requires moving beyond raw volume and focusing on interaction context.</p><h3><b>From exclusion to interpretation</b></h3><p>For years, the default response to automation was exclusion. CAPTCHAs, rate limits, and static thresholds worked well when automated behavior was clearly distinct.</p><p>That approach is becoming less effective. AI-driven agents often provide real value to users, and blanket blocking can degrade user experience without improving outcomes. As a result, many organizations are shifting from exclusion toward interpretation.</p><p>Rather than asking how to keep automation out, teams are asking how to understand different types of traffic and respond appropriately — serving purpose-aligned experiences without assuming a single definition of legitimacy.</p><h3><b>Behavioral context as a complementary signal</b></h3><p>One promising approach is focusing on behavioral context. Instead of centering analysis on identity, systems examine how interactions unfold over time.</p><p>Human behavior is inconsistent and inefficient. People hesitate, backtrack, and explore unpredictably. Automated agents, even when adaptive, tend to exhibit a more structured internal logic. By observing navigation flow, timing variability, and interaction sequencing, teams can infer intent probabilistically rather than categorically.</p><p>This allows organizations to remain open while gaining a more nuanced understanding of activity.</p><h3><b>Ethics, privacy, and responsible interpretation</b></h3><p>As analysis becomes more sophisticated, <a href="https://venturebeat.com/security/when-ai-lies-the-rise-of-alignment-faking-in-autonomous-systems">ethical boundaries</a> become more important. 
Understanding interaction patterns is not the same as tracking individuals.</p><p>The most resilient approaches rely on aggregated, anonymized signals and transparent practices. The goal is to protect platform integrity while respecting user expectations. Trust remains a foundational requirement, not an afterthought.</p><h3><b>The future: A spectrum of agency</b></h3><p>Looking ahead, web interactions increasingly fall along a spectrum. On one end, humans browse directly; in the middle, users are assisted by AI tools; on the other, agents act independently on a user’s behalf.</p><p>This evolution reflects a maturing digital ecosystem. It also demands a shift in how success is measured. Simple counts of clicks or visits are no longer sufficient. Value must be assessed in context.</p><h3><b>What business leaders should focus on now</b></h3><p>AI-generated traffic is not a problem to eliminate — it’s a reality to understand.</p><p>Leaders who adapt successfully will:</p><ul><li><p>Reevaluate how engagement metrics are interpreted</p></li><li><p>Separate activity from intent in analytics reviews</p></li><li><p>Invest in contextual and probabilistic measurement approaches</p></li><li><p>Preserve data quality as AI participation grows</p></li><li><p>Treat trust and privacy as design principles</p></li></ul><p>The web has evolved before, and it will evolve again. The question is whether organizations are prepared to evolve how they read the signals it produces.</p><p><i>Shashwat Jain is a senior software engineer at Amazon.</i></p>]]></description>
            <category>Technology</category>
            <category>Orchestration</category>
            <category>Infrastructure</category>
            <category>DataDecisionMakers</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/pgt36h735abLguellfoXi/1c1e5bdccb56d453e7edef6aa0e10fa9/Web_search.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Fixing AI failure: Three changes enterprises should make now]]></title>
            <link>https://venturebeat.com/technology/fixing-ai-failure-three-changes-enterprises-should-make-now</link>
            <guid isPermaLink="false">1PBpT6XfTXrx8gzQ4712kX</guid>
            <pubDate>Sun, 15 Mar 2026 22:00:00 GMT</pubDate>
            <description><![CDATA[<p>Recent<a href="https://www.ciodive.com/news/AI-project-fail-data-SPGlobal/742590/"> <u>reports</u></a> about AI project failure rates have raised uncomfortable questions for organizations investing heavily in AI. Much of the discussion has focused on technical factors like model accuracy and data quality, but after watching dozens of AI initiatives launch, I’ve noticed that the biggest opportunities for improvement are often cultural, not technical.</p><p>Internal projects that struggle tend to share common issues. For example, engineering teams build models that product managers don’t know how to use. Data scientists build prototypes that operations teams struggle to maintain. And <a href="https://venturebeat.com/technology/the-limits-of-bubble-thinking-how-ai-breaks-every-historical-analogy">AI applications</a> sit unused because the people they were built for weren&#x27;t involved in deciding what “useful” really meant.</p><p>In contrast, organizations that achieve <a href="https://venturebeat.com/technology/karpathys-march-of-nines-shows-why-90-ai-reliability-isnt-even-close-to">meaningful value with AI</a> have figured out how to create the right kind of collaboration across departments, and established shared accountability for outcomes. The technology matters, but the organizational readiness matters just as much.</p><p>Here are three practices I’ve observed that address the cultural and organizational barriers that can impede AI success.</p><h3><b>Expand AI literacy beyond engineering</b></h3><p>When only engineers understand how an AI system works and what it’s capable of, collaboration breaks down. Product managers can&#x27;t evaluate trade-offs they don&#x27;t understand. Designers can&#x27;t create interfaces for capabilities they can&#x27;t articulate. Analysts can&#x27;t validate outputs they can&#x27;t interpret.</p><p>The solution isn&#x27;t making everyone a data scientist. 
It&#x27;s helping each role understand how AI applies to their specific work. Product managers need to grasp what kinds of generated content, predictions or recommendations are realistic given available data. Designers need to understand what the AI can actually do so they can design features users will find useful. Analysts need to know which AI outputs require human validation versus which can be trusted.</p><p>When teams share this working vocabulary, AI stops being something that happens in the engineering department and becomes a tool the entire organization can use effectively.</p><h3><b>Establish clear rules for AI autonomy</b></h3><p>The second challenge involves knowing where AI can act on its own versus where human approval is required. Many organizations default to extremes, either bottlenecking every AI decision through human review, or letting AI systems operate without <a href="https://venturebeat.com/technology/dynamic-ui-for-dynamic-ai-inside-the-emerging-a2ui-model">guardrails</a>.</p><p>What&#x27;s needed is a clear framework that defines where and how AI can act autonomously. This means establishing rules upfront: Can AI approve routine configuration changes? Can it recommend schema updates but not implement them? Can it deploy code to staging environments but not production?</p><p>These rules should include three elements: <i>auditability </i>(can you trace how the AI reached its decision?), <i>reproducibility </i>(can you recreate the decision path?), and <i>observability </i>(can teams monitor AI behavior as it happens?). Without this framework, you either slow down to the point where AI provides no advantage, or you create systems making decisions nobody can explain or control.</p><h3><b>Create cross-functional playbooks</b></h3><p>The third step is codifying how different teams actually work with AI systems. 
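</p><p>One lightweight way to codify such rules is a machine-readable policy with default-deny semantics; the actions and tiers below are illustrative:</p>

```python
# Hypothetical autonomy policy: what the AI may do on its own,
# what it may only recommend, and what always needs a human.
POLICY = {
    "approve_routine_config_change": "autonomous",
    "update_schema":                 "recommend_only",
    "deploy_to_staging":             "autonomous",
    "deploy_to_production":          "human_approval",
}

def allowed_autonomously(action: str) -> bool:
    """Default-deny: any action not in the policy requires a human."""
    return POLICY.get(action, "human_approval") == "autonomous"
```

<p>Logging every lookup alongside the decision it gated is one way to get the auditability and observability elements for free.</p><p>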
When every department develops its own approach, you get inconsistent results and redundant effort.</p><p>Cross-functional playbooks work best when teams develop them together rather than having them imposed from above. These playbooks answer concrete questions like: How do we test AI recommendations before putting them into production? What&#x27;s our fallback procedure when an automated deployment fails – does it hand off to human operators or try a different approach first? Who needs to be involved when we override an AI decision? How do we incorporate feedback to improve the system?</p><p>The goal isn&#x27;t to add bureaucracy. It&#x27;s ensuring everyone understands how AI fits into their existing work, and what to do when results don&#x27;t match expectations.</p><h3><b>Moving forward</b></h3><p>Technical excellence in AI remains important, but enterprises that over-index on model performance while ignoring organizational factors are setting themselves up for avoidable challenges. The successful AI deployments I’ve seen treat cultural transformation and workflows just as seriously as technical implementation.</p><p>The question isn&#x27;t whether your AI technology is sophisticated enough. It&#x27;s whether your organization is ready to work with it.</p><p><a href="https://www.linkedin.com/company/88873/"><i><u>Adi Polak</u></i></a><i> is director for advocacy and developer experience engineering at Confluent. </i></p>]]></description>
            <category>DataDecisionMakers</category>
            <category>Technology</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/7pQQUo80uOntCalTMvvk5u/3bb59dea39deda40e2411cfaa9b1da07/When_AI_fails.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[The limits of bubble thinking: How AI breaks every historical analogy]]></title>
            <link>https://venturebeat.com/technology/the-limits-of-bubble-thinking-how-ai-breaks-every-historical-analogy</link>
            <guid isPermaLink="false">40DMVoijvdvHp58sv2zyFE</guid>
            <pubDate>Tue, 10 Mar 2026 13:00:00 GMT</pubDate>
<description><![CDATA[<p>It’s always the same story: A new technology appears and everyone starts talking about how it’ll change everything. Then capital rushes in, companies form overnight, and valuations climb faster than anyone can justify. Then, many, many months later, the warnings arrive, and people suddenly remember the dot-com crash or crypto.</p><p>You’ve probably seen it before. And if you have, you probably think AI is the next bubble. Humans are great at pattern-matching. We’ve evolved to see patterns, so when something familiar emerges, we instinctively map it onto the closest story we already know. We think we’ve seen it before, and we’re confident we know how it ends.</p><p>But that instinct can mislead us. AI <i>feels</i> like a bubble because we’re forcing something genuinely discontinuous into a familiar story. The idea that everything that rises quickly must ultimately collapse sounds prudent. But that doesn’t mean it’s always true.</p><h2><b>Why markets keep overshooting</b></h2><p>Every <a href="https://venturebeat.com/technology/karpathys-march-of-nines-shows-why-90-ai-reliability-isnt-even-close-to">major technological shift</a> produces the same outward symptoms: Inflated expectations, followed by high-visibility failure. Dot-com, mobile, and crypto all went through a phase where the world lost its sense of proportion.</p><p>Why does this keep happening? Because markets don’t have a framework for discontinuous change. Discounted cash flow models assume steady, stable growth, and comparable companies assume the category already exists. So people assume the near future looks like the recent past, but that doesn’t work when the underlying category itself is changing.</p><p>Most valuation tools are designed for incremental progress, so analysts look at quarterly forecasts and incremental improvements. 
They don’t know what to do with step changes, and they can’t model nonlinear adoption.</p><p>So when you see capital overshooting or extreme dispersion of outcomes, that’s the market trying to value decade-long bets using quarterly logic. (Which doesn’t work.) And that’s what a bubble actually is: An indication that no one yet knows how to price what’s coming. That uncertainty looks like invalidation, but it just exposes the limits of existing frameworks.</p><h2><b>The category error we keep making</b></h2><p>When something new arrives, we reach for comparisons.</p><p>AI is like electricity.</p><p>AI is like computers.</p><p>AI is like the internet.</p><p>AI is like mobile.</p><p>These comparisons are comforting because they all produced massive, economy-wide change, and attracted enormous capital. They changed how work got done.</p><p>They also share something deeper. Every one of those technologies extended human capability without replacing human cognition. Electricity powered machines, but humans still decided what to build. Computers processed data, but humans interpreted it. The internet moved information, but humans decided what mattered. Mobile put computing in your pocket, but human attention remained the scarce resource. In every case, human intelligence anchored everything. It was also the bottleneck.</p><p><a href="https://venturebeat.com/technology/dynamic-ui-for-dynamic-ai-inside-the-emerging-a2ui-model">AI is different</a> because it performs cognitive work. And if that makes you uneasy, it should. Because if AI can actually think, then a lot of what we’ve built our careers on, like our expertise and our hard-won skills, might not be as defensible as we thought. The junior engineer who spent years developing intuition now works alongside a tool that has it instantly. So does the financial analyst known for their variance analysis. People aren’t completely sure where value actually lives anymore, and that’s terrifying.</p><p>I talk to CFOs every week. 
Six months ago, they asked me abstract questions like “what is AI?” and “should we have an AI strategy?” Now the questions are concrete: “Which parts of my team’s work no longer need to be done this way?” That shift happened so quickly, it’s already changing how resources get allocated.</p><p>For example, a founder I know started using Claude to write SQL queries that used to take her analyst a couple of days. Did she replace the analyst? Of course not. But she removed the bottleneck, and doesn’t have to depend on him anymore for quick answers. Then her analyst’s role changed completely. He went from spending 60% of his time writing queries to 10% checking them and 90% on strategic recommendations. The company didn’t reduce headcount or costs, and the analyst went from supporting three stakeholders to supporting fifteen.</p><p>This is where historical comparisons really start to fail. Tools like GitHub Copilot are compressing expertise. A junior engineer can now operate at a level that once required years of work experience. And every time the tool is used, it learns. A hammer doesn’t improve just because you built a house with it, but AI tools do. And when tools get better through use, the rate of improvement compounds. That dynamic doesn’t fit cleanly into any prior technological analogy, which is why the instinct to call this a “bubble” misses the actual point.</p><p>Previous technologies assumed a fixed ceiling on human cognition. They made us faster and stronger, but the limiting factor was always the same: How many smart people could we put on a problem? AI stretches that ceiling way beyond what we’re used to. Before, understanding your business better usually meant one of three things: More data, more analysts, or more experienced leaders. The constraint was how much human attention and judgment you could afford. With AI, that constraint shifts. When analysis that once took days appears in seconds, the new constraint is knowing what to look for. 
What questions matter? The limiting factor stops being talent and starts being judgment.</p><h2><b>The skeptics are right about the hype, and wrong about what it means</b></h2><p>Let’s take the strongest version of the bubble argument at face value. Maybe AI actually is overhyped, and most of these companies will fail. Maybe we’re early, and real impact takes another five or ten years. All of that could be completely true, and it still wouldn’t change the core point, which is this:</p><p>Even if the majority of AI startups fail, and even if adoption is way slower than expected, AI is still the first technology that can perform knowledge work. That doesn’t disappear because markets overshoot or expectations reset. The skeptics are right that the hype is inflated. But they’re wrong that inflated hype makes the technology irrelevant. We’ve seen this before: The dot-com bubble was real, and <a href="http://pets.com/"><u>Pets.com</u></a> crashed and burned, but the internet still changed everything. Both things were true at the same time.</p><p>The finance leaders I’m working with are beyond arguing about whether AI matters. Now they’re trying to understand which workflows change first, and how fast they need to adapt. That conversation is happening quietly, underneath all the noise.</p><p>And the workflows collapsing first share three properties:</p><ol><li><p>They require expertise, but they’re repetitive.</p></li><li><p>They’re bottlenecks to strategic work.</p></li><li><p>They’re easy to verify but hard to generate.</p></li></ol><p>These workflows are important enough to pay for, but not so strategic that automating them threatens competitive advantage. 
They require skill, but that skill doesn’t compound dramatically with repetition, which makes them economically fragile, and explains why they’re already being automated away.</p><h2><b>Where humans still matter (for now)</b></h2><p>AI is great at recognizing trends, and terrible at knowing which ones actually matter. It can generate variance analysis, but it can’t tell you whether a 12% swing in spend signals healthy growth or a deeper problem. It can draft strategies, but it can’t tell you which strategy fits this market and this team in this exact moment. Judgment under uncertainty, and high-stakes tradeoffs where the downside is catastrophic, remain human responsibilities. For now.</p><p>When the constraint is no longer “do we have enough smart people,” the problem becomes one of priority. What deserves attention? What’s worth building next? That’s where I see many founders get stuck. They ask if this is a bubble and if they’re too early, but those aren’t the most useful questions. The right one is: “What can I build in the next year that creates real value, regardless of what valuations do?”</p><p>The companies that last will be the ones quietly iterating and embedding AI into actual workflows that solve actual problems. Take CFOs, for example. They’re buying AI because their board wants faster variance analysis, and they’re tired of hiring analysts who quit after six months. That’s a real-world problem that companies need to solve.</p><p>And the same is true for investors. The ones who succeed long-term will be those who tolerate uncertainty long enough to see what actually works.</p><h2><b>This time is actually different</b></h2><p>In the short term, AI will disappoint. Many use cases won’t deliver what they promise, and a lot of companies formed in this wave won’t survive. But the technology will. And, over the long term, AI will reshape every field that depends on knowledge work. 
Not all at once, and not evenly, but a decade from now, it will be difficult to find a knowledge-based industry that looks the same as it does today.</p><p>AI is different because intelligence itself, which was historically the core constraint of human innovation, has now become scalable. That’s an observable fact with measurable consequences. The conversation about bubbles will fade, as it always does, and what will remain are the systems that quietly adapted while everyone else argued about valuations. The skeptics will have been right about the excess and wrong about what actually mattered. Five years from now, we’ll probably look back at today’s panic the same way we look back at people who dismissed the internet because a handful of companies failed. And the winners will be those who kept building in the meantime.</p><p>In time, those are the only stories anyone remembers.</p><p><i>Siqi Chen is co-founder and CEO of Runway.</i></p>]]></description>
            <category>Technology</category>
            <category>Infrastructure</category>
            <category>Orchestration</category>
            <category>DataDecisionMakers</category>
            <enclosure url="https://images.ctfassets.net/jdtwqhzvc2n1/7dgxoUl5BsIQHmiQSTrmMV/a4cfce762bdaf677ebf6b9c62da275ff/Bubble.png?w=300&amp;q=30" length="0" type="image/png"/>
        </item>
    </channel>
</rss>