We’re at a strange crossroads in the evolution of AI. Nearly every enterprise wants to harness it. Many are investing heavily. But most are falling flat. AI is everywhere — in strategy decks, boardroom buzzwords and headline-grabbing POCs. Yet, behind the curtain, something isn’t working.

Recent data makes this paradox impossible to ignore. The MIT State of AI in Business 2025 study finds that despite $30 billion to $40 billion poured into gen AI, 95% of organizations see zero measurable return, with only 5% managing to cross what the researchers call the GenAI Divide.

The core issue isn’t model quality or regulation, but the inability of most systems to learn, adapt and integrate into workflows. And it’s not just one report. A study from Qlik and ESG shows that while 94% of businesses are ramping up AI investments, only 21% have managed to operationalize anything meaningful. Informatica’s CDO Insights study echoes the same: data quality, lack of readiness and technical immaturity are the primary reasons most AI projects fail to scale. That last part matters. These aren’t just failed experiments — they’re signs that something deeper is broken.

Enterprises struggle not for lack of ambition, but because legacy frameworks don’t account for the structural complexity of AI — data readiness, model maturity, worker expectations, human-AI alignment and dynamic governance still largely go unmeasured.

Why legacy frameworks like RICE are failing AI

One of the most widely adopted prioritization models in product management is RICE, which scores initiatives based on Reach, Impact, Confidence and Effort. It’s elegant. It’s simple. It’s also outdated.

RICE was never designed for the world of foundation models, dynamic data pipelines or the unpredictability of inference-time reasoning. Here’s where it breaks down:

  • Reach in RICE is based on absolute user numbers — making it hard to normalize across teams and prone to estimation inflation.

  • Confidence is often a gut-feel metric. In AI, that’s dangerous. Model maturity, data readiness or hallucination risks don’t get factored in.

  • Effort usually assumes code complexity, not the massive overhead of acquiring, cleaning and governing AI-ready data.

  • Impact is assumed to be direct and measurable. But in AI, “impact” depends on whether the model generalizes, whether it augments human decisions or replaces them, and whether it behaves consistently under real-world complexity.

Classic methods like RICE, ICE and MoSCoW evaluate reach, impact, confidence and effort — but they omit AI-specific realities. They fail to account for data readiness, assume that impact is inherently measurable and generalizable and overlook critical factors like model feasibility, hallucination risk and alignment with worker expectations. As a result, they often over-prioritize flashy experiments at the expense of proven, reliable outcomes. What’s missing are metrics that assess model maturity, data governance, human agency and the frequent mismatches between desire and capability — gaps that routinely lead AI projects to fail.

When ambition collides with reality

Apple’s paper, The Illusion of Thinking, delivers a sobering reality check on the limitations of current large reasoning models (LRMs). The study shows that these models often falter under complexity, fail to generalize beyond their training data and behave inconsistently even on tasks that appear similar. If AI models can’t be trusted to generalize, how can we confidently prioritize AI features that rely on them? And how can traditional frameworks like RICE, which don’t even ask whether a model can actually solve a problem, possibly suffice?

The answer: They can’t

To make matters worse, there’s a growing mismatch between what enterprises want to automate and what AI can realistically handle. Stanford’s 2025 study, The Future of Work with AI Agents, provides a fascinating lens. Researchers mapped more than 800 tasks across 100-plus job roles and found that while workers want AI to assist with nearly half of their workload, most AI initiatives target the wrong tasks — either things workers don’t want automated or problems AI models aren’t ready to solve.

They introduced something called the Human Agency Scale, ranging from full human control to full AI autonomy. And the sweet spot, it turns out, is somewhere in the middle: where AI helps, but doesn’t replace. That nuance rarely shows up in our product planning.

So let’s recap: We’ve got overinvestment, underdelivery, misalignment with human preferences and frameworks that haven’t evolved in a decade.

What now?

Introducing ARISE: AI-native product thinking

That’s where ARISE comes in.

ARISE stands for AI Readiness and Impact-Scoring Evaluation. It’s a new framework we’re proposing for product teams navigating the complexity of AI. Built as an AI-native evolution of RICE, it bridges ambition and realism via a structured and multidimensional approach.

The foundation is familiar: You still evaluate reach, impact, confidence and effort. But even those are modernized. 

  • Reach isn’t some vague number of users — it’s normalized to a percentage of active users. 

  • Impact is rated on a consistent scale from 0.25x to 3x. 

Impact | Description | Example
--- | --- | ---
3.0 | Massive impact | Resolves user problem autonomously, reduces cost or time dramatically
2.0 | High impact | Significant benefit or automation of high-friction workflows
1.0 | Medium impact | Moderate UX or workflow improvement
0.5 | Low impact | Incremental enhancement, useful but not critical
0.25 | Minimal impact | Cosmetic or low-priority edge case improvement

  • Confidence is no longer hand-wavy; it’s grounded in real feasibility checks. 

  • Effort is counted in person-months, factoring in the full pipeline, not just the model.

But ARISE adds three crucial layers that traditional frameworks miss:

First, AI Desire — does solving this problem with AI add real value, or are we just forcing AI into something that doesn’t need it?

Second, AI Capability — do we actually have the data, model maturity and engineering readiness to make this happen?

And third, Intent — is the AI meant to act on its own or assist a human? Proactive systems have more upside, but they also come with far more risk. ARISE lets you reflect that in your prioritization.

The final score is calculated as a base RICE score multiplied by those three AI-specific factors. That’s an intentional design choice. If your AI Capability is weak — even if Reach and Impact look good — your ARISE score tanks. No more greenlighting aspirational projects just because they sound exciting.

The final ARISE score is:

ARISE Score = (Reach × Impact × Confidence / Effort) × AI Desire × AI Capability × Intent Multiplier

This multiplicative model forces realism. A high-impact initiative with low AI Capability will score poorly — exactly as it should.
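The formula above can be sketched as a small Python helper. This is a minimal illustration of the scoring logic described in the text; the function name and signature are my own, not part of the framework.

```python
def arise_score(reach, impact, confidence, effort,
                ai_desire, ai_capability, proactive=False):
    """Compute an ARISE score: base RICE multiplied by the AI-specific factors.

    reach:         fraction of active users affected (0.0-1.0)
    impact:        anchored scale from 0.25 to 3.0
    confidence:    certainty in reach and impact (0.0-1.0)
    effort:        person-months, covering the full pipeline
    ai_desire:     strength of need for an AI solution (1-5)
    ai_capability: data/model/engineering readiness (1-5)
    proactive:     True applies the 1.2x intent multiplier, else 1.0x
    """
    base_rice = (reach * impact * confidence) / effort
    intent_multiplier = 1.2 if proactive else 1.0
    return base_rice * ai_desire * ai_capability * intent_multiplier
```

Because the factors multiply rather than add, a weak AI Capability score drags the whole result down regardless of how strong Reach and Impact look.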

Feature | Traditional RICE | ARISE modified RICE | ARISE AI-specific dimensions | Key differences/benefits
--- | --- | --- | --- | ---
Reach | Absolute number of users affected (unbounded, lacks normalization, prone to inflation) | Percentage of active users (0-100%) | N/A | Normalized scale, encourages user persona analysis, aligns with modern KPIs.
Impact | Subjective, often unanchored scale | Consistent 0.25x-3x scale (minimal to massive) | N/A | Reduces subjective inflation, enables fairer comparisons.
Confidence | Certainty in Reach and Impact (generic) | Certainty in Reach and Impact (0-100%) | N/A (but heavily influenced by AI Capability) | More grounded, especially when informed by AI Capability.
Effort | Person-months to deliver | Person-months to deliver | N/A | Captures development cost.
AI Desire | N/A | N/A | Strength of market/internal need for an AI solution (1-5) | Forces clear articulation of AI-specific value and need, prevents ROI misalignment.
AI Capability | N/A | N/A | Readiness of data, model maturity, engineering feasibility (1-5) | Directly addresses AI-specific realities: data quality, technical maturity and intrinsic model limitations. Crucial for viability.
Intent multiplier | N/A | N/A | Proactive (1.2x) vs. reactive (1.0x) | Accounts for the higher risk and complexity of autonomous AI, encourages human-AI alignment.
Overall score | Base RICE = (R × I × C) / E | Base RICE × AI Desire × AI Capability × Intent | Multiplicative nature | Forces holistic AI thinking; a low score in any AI-specific dimension significantly reduces priority, preventing unviable projects.

From theory to action

Let’s take this out of theory and into practice.

Imagine your team wants to roll out an AI coding assistant to every developer in the company. Reach? 100% (1.0). Impact? Let’s say 2.0 (a 50% productivity boost). Confidence? Pretty high, since this is vendor-integrated, not a moonshot: 1.0. It’ll take two person-months to implement. On the AI side, Desire is sky-high (5) and Capability is solid (4). But the AI is reactive: it assists rather than acts autonomously, so Intent is 1.0.

Plug those numbers into ARISE and you get a score of 20.

Now take another project: a custom-built, proactive AI system that monitors logs and automatically troubleshoots developer issues across multiple systems. Huge reach (1.0), high potential impact (2.5), but only moderate confidence (0.5). It’s complex, requiring a 12-month build; AI Desire is strong (4), but AI Capability is shaky (2). And because it acts autonomously, the proactive Intent multiplier (1.2) applies.

Despite its promise, ARISE gives it a score of 1.0. Not a “no” — but a clear signal to break it into smaller spikes or defer it until your data and models mature.
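The two examples above can be checked with plain arithmetic. One caveat: the article does not spell out an AI Desire value for the second project, so a value of 4 is assumed here because it reproduces the stated score of 1.0.

```python
# Project 1: vendor-integrated coding assistant (reactive, intent = 1.0)
# (Reach * Impact * Confidence / Effort) * Desire * Capability * Intent
assistant = (1.0 * 2.0 * 1.0 / 2) * 5 * 4 * 1.0
print(assistant)  # 20.0

# Project 2: proactive log-troubleshooting system (intent = 1.2)
# AI Desire of 4 is an assumption; it is not stated in the text.
monitor = (1.0 * 2.5 * 0.5 / 12) * 4 * 2 * 1.2
print(round(monitor, 2))  # 1.0
```

Note how the 12-month effort and the Capability score of 2 dominate: even a 1.2x proactive bonus cannot rescue a project the team is not yet equipped to build.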

So, which one should you prioritize first?

A new operating system for AI product management

What makes ARISE powerful isn’t the math — that’s simple. It’s the mindset it encourages.

It forces teams to confront what’s real — not just what’s exciting. It prevents you from burning cycles on experiments that will never scale. And it invites a deeper discussion about human-AI collaboration, technical feasibility and organizational readiness.

More importantly, it keeps you grounded in value. With gen AI at the peak of inflated expectations, ARISE is the sober lens product teams need to separate noise from opportunity.

We’ve seen too many teams fall into the trap of building impressive demos that never leave the lab. ARISE helps them ask the right questions earlier — about data, ethics, intent — and adjust course accordingly.

It’s time to prioritize differently

AI isn’t just another technology layer. It’s a new way of building, reasoning and collaborating. That demands a new way of thinking about product management.

RICE, ICE, MoSCoW — those were great for their time. But they weren’t built for probabilistic systems, autonomous agents, or the ambiguity of large language models (LLMs).

ARISE is.

It’s not a silver bullet, but it’s a compass. A way to bring structure to the madness. A tool that helps you build what’s not just possible, but viable, valuable, and responsible.

Because in the world of AI, the biggest risk isn’t moving too slowly. It’s prioritizing the wrong thing entirely.

Hitha Kishore is product management lead for analytics and AI at Docusign.


