Presented by Dataiku


“AI and the automation that it enables are at the core of the future economy,” Kurt Muehmel, chief customer officer at Dataiku, said in his opening remarks at VentureBeat Transform 2020. “The winners and losers in the coming years will be determined largely on who can leverage AI most effectively to augment 100% of their services and business processes.”

On Day 1 of Transform featuring the Technology and Automation Summit presented by Dataiku, leaders from Goldman Sachs, Chase Bank, and more shared how AI and machine learning are helping companies in the financial services sector augment both their customer experience and their bottom line.

Retailers from both the online and brick-and-mortar spheres, including Walmart and Zappos, recounted their efforts to optimize the shopping experience, increasing customer satisfaction and revenue.

And from the pharmaceutical industry, Pfizer offered their large-scale enterprise transformation story, driven largely by a culture shift towards AI.

Throughout the day it became clear: While data scientists and machine learning engineers will continue to be at the forefront of that effort, they can’t do it alone. This transformation will require a broadly inclusive approach in which data scientists and machine learning engineers collaborate with business analysts, mechanical engineers, research scientists, shop floor technicians, and many others to ensure that businesses stay resilient and agile in the face of an uncertain future.

The global pandemic, resulting in a historic economic downturn, combined with a massive shift to remote work, illustrates how necessary this is. This extends to the way that we produce AI capabilities and the way that we maintain and update them.

And, as Muehmel explained, the much-needed rising social consciousness about systemic racism underscores the need for builders of AI to work in a way that is equitable and does not perpetuate the biased practices of the past. “This requires a broad and diverse coalition of AI builders,” he said, “as well as systems of governance and accountability throughout the entire AI development, deployment, and maintenance life cycle.”

Here’s a closer look at some of the top sessions from Day 1.

From raw data to business impact: Best practices on how organizations can put their data to work in building human-centric AI models

Building AI-powered business processes to scale has become the biggest challenge for companies implementing AI, said Muehmel. A proof of concept may have panned out, but scale isn’t 10 use cases deployed into production – it’s 10,000.

The biggest mistake enterprises make is going all-in on a single new technology and expecting it to solve all their problems. Locking into one solution can prevent a company from both planning for the unpredictable and swapping in new technology when the old no longer fits. As Muehmel explained, Dataiku is designed to help companies make that swap seamlessly, working as an “insulating layer” for the compute layer.

The biggest asset any company has, and the best way to scale, whatever technology you implement, is a broad and inclusive organization in which everyone is working toward solving business challenges from the same data. Ultimately, the goal is giving everyone access to the data they need without having to worry about where it’s running, he says. That means embedding the analytics, and embedding AI processes directly into applications and dashboards throughout the organization.

The right data: Big trends on how companies are identifying the right data to train AI & ML algorithms accurately

Dataiku VP of field engineering Jed Dougherty led a panel of data experts on the trends they’re seeing in how companies identify the right data. The panelists were Slack director of product Jaime DeLanghe; PayPal VP of data science Hui Wang; Goldman Sachs senior quantitative researcher Dimitris Tsementsiz; and PwC principal Jacob Wilson.

A key challenge at Slack is unlabeled data, which makes behavioral search data difficult to parse and can add ambiguity to their algorithms. To debug assumptions in their models, they’ve started to marry click data with survey data, which essentially gets users to label themselves.

At Goldman Sachs, Tsementsiz explained, the challenge is that its data is often nonstationary, or in other words, some predictive tasks don’t have access to all the data they need. That makes overfitting a danger: when a function is tied too closely to a limited set of data points, modeling errors result, such as when a model has access to yesterday’s average stock prices but not today’s.

Wang talked about how PayPal is using more data points to eliminate false positives and negatives and strengthen fraud detection. For example, a typical fraud detection system will decline payments if someone lives in New York and is purchasing something from an IP in Thailand. AI technology can connect data points, such as whether that IP belongs to a resort in Thailand or a corporate headquarters, to determine that the payment is genuine because the user is traveling or is connected to a global company VPN.
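The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration of the idea, not PayPal’s system: a hand-written rule stands in for a learned model, and the `Payment` fields, the `ip_context` labels, and the `TRUSTED_IP_CONTEXTS` set are all hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Hypothetical labels describing what is known about an IP address.
# A real system would derive these from additional data sources,
# which the panel did not name.
TRUSTED_IP_CONTEXTS = {"resort", "corporate_vpn"}


@dataclass
class Payment:
    home_country: str  # where the account holder lives
    ip_country: str    # where the purchase request originated
    ip_context: str    # hypothetical label, e.g. "resort" or "unknown"


def naive_rule(payment: Payment) -> str:
    """Decline whenever the IP country differs from the home country."""
    if payment.home_country == payment.ip_country:
        return "approve"
    return "decline"


def context_aware_rule(payment: Payment) -> str:
    """Same geographic check, but a trusted IP context explains the mismatch."""
    if payment.home_country == payment.ip_country:
        return "approve"
    if payment.ip_context in TRUSTED_IP_CONTEXTS:
        return "approve"
    return "decline"


traveler = Payment(home_country="US", ip_country="TH", ip_context="resort")
unknown = Payment(home_country="US", ip_country="TH", ip_context="unknown")

print(naive_rule(traveler), context_aware_rule(traveler))  # decline approve
print(naive_rule(unknown), context_aware_rule(unknown))    # decline decline
```

The traveler is a false positive under the naive rule and is approved once the extra data point is considered, while the payment with no explanatory context is still declined.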

For PwC, data extraction from documents like tax forms, lease agreements, purchase agreements, mortgage contracts, and syndicated loans, among others, requires extreme sensitivity to privacy and security concerns. To help improve and secure their information extraction models over time, Wilson says, they’ve been able to turn to consistent, continuous learning pipelines.

In the end, it’s critical to loop in human intelligence for any model, Wilson says, because you can’t rely 100% on the model’s prediction. It requires secondary judgment around the output, from the audit trail to how it was reviewed downstream, even going back to the full model lineage. And at scale, it means always being alert for the possibility of model drift.
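As a rough illustration of what watching for model drift can look like, the sketch below compares recent model scores against a baseline and flags large shifts for human review. It is a minimal example under stated assumptions, not PwC’s process: the scores are made-up numbers, the function names are hypothetical, and the threshold of two baseline standard deviations is an arbitrary choice for the example.

```python
from statistics import mean, stdev


def drift_score(baseline: list[float], recent: list[float]) -> float:
    """Shift of the recent mean from the baseline mean,
    measured in baseline standard deviations."""
    return abs(mean(recent) - mean(baseline)) / stdev(baseline)


def needs_human_review(baseline: list[float], recent: list[float],
                       threshold: float = 2.0) -> bool:
    """Flag the model for review when the shift exceeds the threshold."""
    return drift_score(baseline, recent) > threshold


# Made-up model scores, for illustration only.
baseline_scores = [0.10, 0.12, 0.11, 0.09, 0.13, 0.10, 0.11, 0.12]
stable_scores = [0.11, 0.10, 0.12, 0.11]
shifted_scores = [0.30, 0.28, 0.33, 0.31]

print(needs_human_review(baseline_scores, stable_scores))   # False
print(needs_human_review(baseline_scores, shifted_scores))  # True
```

A check like this only raises a flag. Deciding whether the shift reflects a real change in the data or a problem in the model remains the kind of secondary human judgment Wilson describes.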

How Pfizer successfully leveraged analytics and AI to scale their initiatives and achieve results

In this conversation with Kurt Muehmel, CCO of Dataiku, senior director Chris Kakkanatt of Pfizer shared how the company has transformed 170 years of technical debt into collective intelligence across the organization.

“Over the years we’ve been working together, one of the things that’s most impressed me is just how deeply ingrained data analytics and AI is to the business culture at Pfizer,” Muehmel said. “They’re operating at a pretty significant scale with thousands of projects, thousands of people participating in the AI development process, hundreds and thousands of data sets.”

Kakkanatt went on to explain that Pfizer’s journey to accomplish this came in three parts.

The first was breaking down technical and functional silos. The company implemented machine learning platforms that enabled interactive point-and-click visualizations so that every employee, regardless of technical skill, could work with data and build models that leverage machine learning.

“Plug and Play methodology is what we’ve seen as a gamechanger in terms of people moving away from their own silos, and saying, hey, maybe I should explore different areas,” Kakkanatt said. “We find that it really brings out the curiosity among people.”

The second step was changing how business colleagues in different areas engage with one another. Before the pandemic, they brought teams together and co-created in real time, using what-if scenarios so that data and analytics led to decisions in real time. Now that’s done virtually.

The third step was beginning to apply AI and machine learning across the company, first starting very selectively, addressing a few business functions and business questions, in order to understand how to later scale effectively.

“We didn’t try to use machine learning for every single project,” Kakkanatt said, “but started testing, [using] different lighthouse projects to figure out, where’s the right fit for these types of initiatives. Don’t try to use machine learning and AI for every single project.”

Demystifying AI interpretability; Improving accuracy and predictability of AI models using reinforcement learning

Reinforcement Learning (RL) is a machine learning technique that solves large and complex problems in situations where labeled datasets are not readily available. Because it learns through a continuous process of rewards and punishments, it can train algorithms designed to interact with new environments.
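To make the reward-and-punishment loop concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm, on a toy problem. Everything in it is an assumption made for illustration and was not presented in the session: a five-cell corridor with a pit at one end (punishment) and a goal at the other (reward), along with the chosen learning rate, discount factor, and exploration rate.

```python
import random

N_STATES = 5                 # corridor cells 0..4
PIT, GOAL, START = 0, 4, 2   # both ends are terminal
ACTIONS = (0, 1)             # 0 = step left, 1 = step right


def step(state, action):
    """Move one cell and return (next_state, reward, done)."""
    next_state = state + (1 if action == 1 else -1)
    if next_state == GOAL:
        return next_state, 1.0, True    # reward
    if next_state == PIT:
        return next_state, -1.0, True   # punishment
    return next_state, 0.0, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
          seed=0, max_steps=100):
    """Learn action values from trial and error; no labeled data is used."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning update toward reward plus discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy action learned for each non-terminal cell (1, 2, 3).
policy = ["right" if q[1] > q[0] else "left" for q in q_table[1:GOAL]]
print(policy)
```

The agent is never told which direction is correct. It only receives +1 or -1 at the ends of the corridor, and its value estimates come to favor stepping toward the goal.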

Reinforcement learning has been used by game-playing AI like DeepMind’s AlphaGo and AlphaStar, which plays StarCraft 2. Engineers and researchers have also used reinforcement learning to train agents to learn how to walk, work together, and consider concepts like cooperation. Reinforcement learning is also applied in sectors like manufacturing, and has been used to help design language models and even to generate tax policy.

At RISELab’s predecessor AMPLab, UC Berkeley professor Ion Stoica helped develop Apache Spark, an open source big data and machine learning framework that can operate in a distributed fashion. He is also the creator of the Ray framework for distributed reinforcement learning.

They started Ray initially with distributed learning but turned to focus on reinforcement learning because of how promising the technique was for demanding, difficult workloads, Stoica says.

The great promise of reinforcement learning is that it doesn’t require a data collection and data preparation process, but whether it’s the right solution depends very much on the problems you are trying to solve. In robotics, most practical cases involve incomplete information. For instance, a robot moving from point A to point B may have only the information it has captured about the state of the environment, which is also a consideration in the development of autonomous cars, he notes.

Check out all the sessions from the Technology and Automation Summit here to learn more from industry leaders about their journeys in implementing these technologies, how they unlocked value and ROI, and their thoughts about what the future holds.


Sponsored articles are content produced by a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact sales@venturebeat.com.