What if the future of enterprise AI isn’t about speed, but depth? That is Thomson Reuters Westlaw’s bet: the company’s Deep Research platform was specifically designed to take its time, working for an average of 10 minutes per query. That window gives the multi-step research agent room to plan, execute and pull from a deep, curated dataset of more than 20 billion documents — up-to-date case law, statutes, administrative rulings, secondary sources and structured legal editorial content. On the back end, it connects to a highly developed toolset that attorneys can use to check findings and probe deeper into legal scenarios.

Unlike standard retrieval-augmented generation (RAG) systems, Deep Research is designed to minimize errors and hallucinations by providing direct citations into Thomson Reuters’ vast dataset. The result is an AI agent that mirrors the rigor of human legal research, capturing legal nuance and reducing the time attorneys spend on discovery. For enterprises beyond law, the system offers a blueprint for how AI can move past speed into substance, indicating that slowing AI down can provide real business value.

“We're going a lot deeper with this iterative, agentic process to not just quickly answer a question, but give the lawyer the perspective of both sides, which really helps them understand the nuance of the issue,” Mike Dahn, head of Westlaw Product, told VentureBeat.
RAG was a start, but doesn't go deep enough
Deep Research on CoCounsel is embedded into Westlaw, Thomson Reuters’ legal research platform used by 12,000-plus law firms, more than 4,000 corporate legal departments and the majority of the top U.S. courts and law firms. The system’s deeply analytical, multi-step research process required moving beyond simpler RAG tools. The team built the model to systematically break down hypotheses, analyze responses rather than just fetching them, “follow the breadcrumb to the next case, then iteratively update the research plan and execute through it,” Dahn explained. While humans can view the discovery process, it is fully executed by the agent.

“It's the next level of the agentic process, compared to just using RAG as an upfront solution,” said Dahn.

Westlaw previously used RAG-based tools that returned lengthy lists of cases, statutes, regulations, articles and book chapters that humans would then have to sift through. “Lawyers would tell us that the answers were helpful and a big head start for research, but that they would typically want to do a lot more digging,” said Dahn.

Essentially, lawyers aren’t just looking for general legal information, but for representative cases to support certain laws, make arguments and provide the “most compelling authority” to cite in a particular jurisdiction. Because, when it comes down to it, the law has many shades of gray, and cases can be successfully argued one way or another. Deep Research helps lawyers make analogies to previous cases and pull facts from them, arguing that their client fits more into ‘case X’ than ‘case Y,’ with evidence to back or refute claims.

For instance, a lawyer might ask the system whether a customer list can be protected as a trade secret in a specific scenario in a certain jurisdiction. The system will come back with arguments and cases where trade secrets were found to be protected, and cases where they were not.
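The plan-execute-analyze-replan loop Dahn describes can be caricatured in a few lines of Python. This is a minimal sketch under invented assumptions: the tiny in-memory corpus, the "see:" breadcrumb convention and every function name here are hypothetical stand-ins, not Thomson Reuters' actual system or API.

```python
# Toy corpus: each research step maps to documents, some of which
# contain a "see:" breadcrumb pointing to the next thing to research.
CORPUS = {
    "trade secret customer list": ["Case A (protected)",
                                   "see: reasonable secrecy measures"],
    "reasonable secrecy measures": ["Case B (not protected)"],
}

def search(step):
    """Query the curated collection (here, a dict) -- not the open web."""
    return CORPUS.get(step, [])

def analyze(docs):
    """Analyze results instead of just fetching: split findings from leads."""
    findings, leads = [], []
    for doc in docs:
        if doc.startswith("see: "):
            leads.append(doc[len("see: "):])  # breadcrumb to follow next round
        else:
            findings.append(doc)
    return findings, leads

def deep_research(question, max_rounds=5):
    """Plan, execute, analyze, then iteratively update the plan."""
    plan, findings = [question], []
    for _ in range(max_rounds):
        leads = []
        for step in plan:
            found, new_leads = analyze(search(step))
            findings += found
            leads += new_leads
        if not leads:      # no new breadcrumbs: research is exhausted
            break
        plan = leads       # revise the research plan and go again
    return findings
```

Run on the toy corpus, `deep_research("trade secret customer list")` surfaces cases arguing both sides of the question, which is the behavior the article attributes to the agent.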
“We've built a product that digs into the nuance and goes deeper into the perspective from both sides of the law,” said Dahn.

Typically, lawyers can spend 10 to 20 hours performing research for complex legal matters. While Westlaw doesn’t yet have definitive numbers, Deep Research is speeding up that time “dramatically,” Dahn said, while also surfacing relevant materials that help lawyers advise their clients, produce better briefs and motions and litigate more effectively.

While Deep Research’s default option is 10 minutes, there are seven-minute and three-minute versions available; the team is also working on a longer 20-minute version. Dahn noted that, while devs or researchers in the lab are often looking to make models faster and faster, lawyers aren’t seeking instant gratification; they actually prefer longer output options.

“We find that the more time we give the agents to discover the right law, the more time we give them to reason through it, the better answers we can provide,” said Dahn. “We're very comfortable with the idea of actually making it slower if we can do an even better job for the user.”

But there can be diminishing returns, of course, and it’s important not to include “too much noise.” As Dahn noted: “We'll keep exploring the limits.”
Multiple agents in the mix, a rich dataset
Westlaw isn’t wedded to one underlying model; the team has a multi-model strategy, with different models chosen based on their effectiveness at different tasks. Westlaw uses frontier models, including those from OpenAI, Anthropic and Google, explained Omar Bari, engineering lead at Thomson Reuters Labs and technical architect of Deep Research. “We have been early testers of all models as they come out; our general strategy is to have access to as much intelligence as possible,” he said. “We also experiment with fine-tuning open-source models.”

These systems rely on a rich, deep, well-curated dataset incorporating case opinions from courts, statutes from state and federal legislatures and administrative rulings from bodies like the Securities and Exchange Commission (SEC) or the Environmental Protection Agency (EPA), as well as a “huge collection” of secondary sources, such as articles, books and treatises from practitioners, scholars and third-party publishers, Dahn explained.

On top of this, Westlaw has an internal team of hundreds of attorney editors who stay on top of laws coming out of the courts and legislatures every day and tag them in the company’s classification scheme. They also flag when specific laws have been made invalid.

“Our Deep Research process is running over that curated, up-to-date, well-organized collection of the law and the secondary sources, as opposed to running across the web, which is a very big difference between what we're doing and what you would get if you just used one of the frontier models,” Dahn explained.
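A multi-model strategy like the one Bari describes often amounts to a routing table: pick the model best suited to each task type, with a fallback. This is an illustrative sketch only; the model names, task labels and the mapping itself are invented placeholders, not the team's actual configuration.

```python
# Hypothetical per-task model routing. None of these names are real models.
MODEL_ROUTES = {
    "plan":     "frontier-model-a",   # long-horizon research planning
    "retrieve": "fast-model-b",       # cheap, high-volume retrieval queries
    "analyze":  "frontier-model-c",   # deep legal reasoning over documents
}
DEFAULT_MODEL = "fine-tuned-open-source-model"  # fallback, per Bari's comment

def route(task_type: str) -> str:
    """Return the model assigned to a task type, else the default."""
    return MODEL_ROUTES.get(task_type, DEFAULT_MODEL)
```

The design choice this illustrates is that swapping in a newly released model means editing one table entry, which is what makes "early testing of all models as they come out" cheap.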
Hallucinations versus incorrect facts
Unlike open-ended chat interfaces, research reports in Westlaw Advantage provide inline citations that lawyers can click to go directly to the case, which “significantly closes the verification loop,” Dahn said; every conclusion is tied to real legal sources that can be immediately verified.

‘Hallucination’ can be a loaded term, he noted, as it refers to models “just wildly making something up that you know isn’t true,” say, citing a ‘subsection four’ of a statute when there are only three subsections. But with Deep Research, those errors are easily identified and flagged. “We don't have made-up cases and made-up statutes in our collection, so it would be impossible to click a link to go to something that doesn't exist,” Dahn explained.

Instead, he sees a bigger issue in models making factual errors. For instance, a lawyer may ask ChatGPT or Gemini a question (‘how many people attended this political rally?’). The model may then arbitrarily pull from a list of articles, not necessarily the most up-to-date one from the most reputable source. “In that situation, the model is not really making something up, it just grabbed the wrong information,” said Dahn. “That's a much bigger problem than hallucinations, because it fails to find the right information.”

In the end, Dahn underscored that no gen AI system is 100% hallucination-proof, and that “it's going to be quite some time before any system like this gets to 100%.” To assist lawyers in the verification process and help minimize mistakes and risk in high-stakes legal work, Westlaw provides lawyers with a “very rich toolset.”

“We have what we believe is the best collection of content for the legal market, which we've been developing for over 100 years,” said Dahn. “But we also have this toolset that lawyers can use with manual research.” For example, with the classification tool ‘West Key Number System,’ a lawyer can find related cases on a specific topic, even if those cases use very different language.
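The "closed verification loop" idea reduces to a simple invariant: every citation in a report must resolve to a document in the curated collection, so a fabricated case cannot produce a working link. A minimal sketch, with an invented two-entry collection standing in for the real corpus:

```python
# Hypothetical curated collection; real contents would be 20B+ documents.
CURATED_COLLECTION = {"Smith v. Jones, 2019", "Trade Secrets Act § 3"}

def verify_citations(report_citations):
    """Split a report's citations into resolvable and flagged ones."""
    verified = [c for c in report_citations if c in CURATED_COLLECTION]
    flagged = [c for c in report_citations if c not in CURATED_COLLECTION]
    return verified, flagged
```

Anything in `flagged` is, by construction, a citation the lawyer cannot click through to, which is exactly how fabricated authorities surface immediately rather than lurking in prose.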
The tool KeyCite, meanwhile, links a case or statute to validity information (i.e., ‘Is the law in this document still valid, or has it been invalidated by subsequent laws?’). It also allows lawyers to find documents that cite the one they’re interested in, or that are cited alongside it. “Both are signals of relevance that go well beyond the words used in a search or in a particular document of interest,” said Dahn. “AI agents can use these and other Westlaw research tools to iteratively work through complex legal research problems like an expert lawyer would.”
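The two citation-graph signals described above can be sketched over a toy graph. To be clear, this is an invented illustration of the concept, not KeyCite's data model: the documents, the graph and the function names are all hypothetical.

```python
# Toy citation graph: document -> documents it cites.
CITES = {
    "Case X": ["Statute S"],
    "Case Y": ["Statute S", "Case X"],
}
INVALIDATED = {"Statute S": False, "Case X": False, "Case Y": False}

def is_good_law(doc):
    """Validity signal: has the document been invalidated by later law?"""
    return not INVALIDATED.get(doc, False)

def citing_documents(doc):
    """Documents that cite `doc` -- relevance beyond keyword overlap."""
    return sorted(d for d, refs in CITES.items() if doc in refs)

def cited_alongside(doc):
    """Documents cited together with `doc` by the same citing document."""
    related = set()
    for refs in CITES.values():
        if doc in refs:
            related |= set(refs) - {doc}
    return sorted(related)
```

In this sketch, `citing_documents("Statute S")` finds both cases that rely on the statute, and `cited_alongside("Statute S")` surfaces Case X because Case Y cites the two together, the kind of signal an agent can follow without any shared search terms.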
