Anthropic wants to own your agent's memory, evals, and orchestration — and that should make enterprises nervous

Just a few weeks after announcing Claude Managed Agents, Anthropic has updated the platform with three new capabilities that collapse infrastructure layers like memory, evaluation, and multi-agent orchestration, into a single runtime.

This move could threaten the standalone tools that many enterprises cobble together.

The new capabilities — 'Dreaming,' 'Outcomes,' and 'Multi-Agent Orchestration' — aim to make agents inside Claude Managed Agents “more capable at handling complex tasks with minimal steering,” Anthropic said in a press release.

Dreaming deals with memory, where agents “reflect” on their many sessions and curate memories so they learns and surface unknown patterns. Outcomes allows teams to define and set specific rubrics to measure an agent's success, while Multi-Agent Orchestration breaks jobs down so a lead agent can delegate to other agents.

Claude Managed Agents ideally provides enterprises with a simpler path to deploy agents and embeds orchestration logic in the model layer. It’s an end-to-end platform to manage state, execution graphs, and routing. With the addition of Dreaming, Outcomes and Multi-agent Orchestration, Claude Managed Agents expands capabilities even further and directly competes with tools like LangGraph or CrewAI, as well as external evaluation frameworks, RAG memory architectures, and QA loops.

An integration threat

Enterprises must now ask: Should we ditch our flexible, modular system in favor of an agent platform that brings almost everything in-house?

Anthropic designed Claude Managed Agents to share context, state, and traceability in one place. This means the platform sees every decision agents make, rather than enterprises having to wire separate systems together. It sounds practical to have one platform that does everything. But not all enterprises want a full-service system.

Claude Managed Agents already faces criticism that it encourages vendor lock-in because it owns most of the architecture and tools that govern agents. In the current paradigm, an organization may run Managed Agents but keep multi-agent orchestration, memory, or evaluations in a separate space ensures flexibility.

The platform offers a fully-hosted runtime, which means memory and orchestration run on infrastructure the enterprise does not own. This can become a compliance nightmare for some organizations that have to prove data residency.

Another problem to consider is that enterprises already in the middle of large-scale AI transformations must cobble together workarounds to deal with the constraints of their tech stack. Not every workflow is easily replaceable by switching to Claude Managed Agents.

Dreaming and outcomes against current tools

Most enterprises have a fragmented approach to AI deployment.

For example, they may use LangGraph or Crew AI for agent routing and workflow management, Pinecone as a vector database for long-term memory, DeepEval for external evaluation, and a human-in-the-loop quality assurance to review some tasks. Anthropic hopes to do away with all of that.

With Dreaming, Anthropic approaches memory by allowing users to actively rewrite it between sessions, so the agent essentially learns from its mistakes. Anthropic says this capability is useful for long-running states and orchestration. Current systems often handle memory persistence by storing embeddings, retrieving relevant context, and adding more state over time.

Outcomes addresses the evaluation portion by detailing expectations for agents. Instead of external quality checks, which are often done by a team of humans, Anthropic is bringing evaluation into the orchestration layer rather than above it.

But it’s the Multi-Agent Orchestration capability that pits Claude Managed Agents against orchestration frameworks from Microsoft, LangChain, CrewAI, and others. Model providers like Anthropic and OpenAI have already begun pushing aggressively into this space, arguing that bringing this to the model layer gives teams better control.

Big decisions to make

Enterprises face a big decision, and this one could depend on where they are in agent maturity.

If an organization is still experimenting with agents and has not deployed many in production, they may find moving to Claude Managed Agents and configuring Dreaming and Outcomes to their needs much easier. This is the stage of development where, even if enterprises are using a third-party orchestrator like LangChain, they’re still customizing it.

But for those who are already further along in the process, the calculation becomes trickier. It’s now a matter of parallel evaluation and better understanding of their processes.

Businesses, though, will face the same decision even if they don’t intend to use Claude Managed Agents. Anthropic has signaled that other model and platform providers will likely shift their product roadmaps to a similar model that keeps everything locked in the same system — because models may become interchangeable, but the tooling and orchestration infrastructure will not.

An integration threat

Dreaming and outcomes against current tools

Big decisions to make

More