Dell, with more than 103,000 employees globally, is one of the largest technology companies in the world. In 2017, it was the third-largest PC vendor after Lenovo and HP, and analysts peg its market capitalization at $70 billion.

The Round Rock, Texas firm sells network switches, peripherals, laptops, workstations, HDTVs, cameras, printers, servers, and MP3 players, to name a few categories. But in the years since its 2009 acquisition of IT services provider Perot Systems, it’s invested heavily in storage and networking solutions for enterprises.

Arguably the biggest push came in 2016 with the $67 billion purchase of EMC Corporation, the largest acquisition in Dell’s history. It saw the reorganization of Dell into Dell Technologies, and the consolidation of its divisions into three subsidiaries: Dell Client Solutions Group, its consumer and workstation business; Dell EMC, its data management hardware and software arm; and cloud computing and virtualization services platform VMware.

Above: Matt Baker, senior vice president of Dell EMC strategy and planning.

Image Credit: Dell Technologies

Today, Dell Technologies is pointed strategically at AI, data management, and the internet of things. At an event in New York last year, it announced the formation of a new IoT division, part of a three-year, $1 billion investment in IoT research and development. And in August, Dell EMC took the wraps off Ready Solutions for AI, an offering consisting of AI frameworks and libraries, as well as compute, network, storage, and consulting and deployment services.

Dell Technologies’ ever-expanding enterprise suite comprises Dell EMC PowerEdge C-Series servers, which are optimized for artificial intelligence (AI) model training and batch processing, and Dell EMC Isilon and Elastic Cloud Storage, complementary network-attached storage platforms for high-volume unstructured data backup and archiving. On the cloud-based workloads and analytics side of things, there’s Pivotal Cloud Foundry, Virtustream Enterprise Cloud, and Boomi. And that only scratches the surface.

Ahead of a media event in Chicago next week, VentureBeat sat down with Matt Baker, senior vice president of Dell EMC strategy and planning, for a wide-ranging discussion about Dell Technologies’ present and future — specifically its current product lineup, customer success stories, and how it’s approaching the omnipresent problems of data privacy and transparency.

Here’s an edited transcript of our interview.

VentureBeat: To kick things off, could you talk about Dell’s approach to analytics, IoT, and AI? Just a broad, big-picture overview to help set the stage.

Matt Baker: Sure. Dell consists of a number of large and smaller entities, with Dell EMC being the one that’s focused on data center infrastructure: servers, storage, networking, and solutions. And of course, Dell Technologies also includes VMware, a company called Boomi that I’ll talk about in a little bit, and so on and so forth. In my role at Dell EMC, I’m responsible for basically planning the business, as well as some degree of product and technology oversight.

The thing that I’d like to point out is that we’ve been heavily involved in ecosystem development — from enablement and infrastructure as well as creating, if you will, best practices and solutions — along with orderable solution sets to streamline what in many cases are very disjointed open-source environment data-centric technologies. Specifically, we have launched a number of platforms over the last year that are designed to accommodate a greater density of accelerators such as FPGA, VP, and GPUs.

Another important part of our R&D space is Dell Capital, Dell’s independent venture capital arm, as well as EMC’s own VC group. Collectively, they’ve invested in a number of products, from the software stack all the way down to the core silicon. Examples are a company called Graphcore, which we lead investments in, as well as Noodle.ai. In fact, a third of the investments we’ve made since 2017 have been focused on advanced data-centric workflows.

VentureBeat: Let’s talk about what some of these data-centric solutions look like in the wild — maybe case studies or use cases that you can think of, or specific customers who’ve taken advantage of your product offerings and really run with them.

Baker: Sure thing. One that comes to mind is MasterCard. They’re investing in fraud detection and prevention, which we’ve been working to develop for them. They are, of course, a large company with a lot of capabilities, and so we’ve been trying to match up their wants and needs with our infrastructure.

Another example is Commonwealth Scientific. They’re an industrial research organization that’s developing software around vision — not restoring it in the classical sense, but enabling machine vision with humans in order to facilitate some degree of synthetic vision for those who’ve lost their sight completely.

I would say the one area that’s a little underserved right now is smaller, less sophisticated companies that don’t have large budgets. And those are the folks that we’re really targeting with these finished Ready Solutions, which aim to help them accelerate the adoption of new technologies.

VentureBeat: Let’s dive into some of the solutions in your portfolio. How are you helping to cut down on the amount of time and effort required of your customers’ data science teams? What are some of the tools you’ve made available?

Baker: A thing I would mention is that, if you read through reports from research firms like Forrester, one of the biggest challenges customers face today is around data pipeline management. They hire these very well-educated, sophisticated, and frankly well-compensated data scientists who end up spending 80-plus percent of their time doing data engineering work, grunt work like identifying datasets and cleansing them. What we offer are real-time extract, transform, and load (ETL) capabilities that allow data scientists to build and maintain data pipelines rather than having to spend all day gathering up data and preprocessing it.

We’re also seeing adoption in more advanced data-centric workload spaces like Boomi. Boomi is our integration platform as a service (iPaaS), and it has hundreds of available data integration points that allow you to build a data pipeline and workflow that constantly keeps datasets up to date. In complex organizations, pulling that data together is a really big task.
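For readers unfamiliar with the pattern Baker describes, the following is a minimal Python sketch of an extract, transform, and load pipeline. It is an illustration only and does not represent Dell EMC or Boomi software; the sample data, field names, and function names are all hypothetical, and the "warehouse" is just an in-memory list.

```python
import csv
import io
from typing import Iterable, Iterator

# Hypothetical raw data with the kinds of flaws that cleansing addresses:
# stray whitespace, a missing value, and inconsistent casing.
RAW_CSV = """customer_id,amount,currency
 101 ,19.99,usd
102,,usd
103,250.00,USD
"""

def extract(source: str) -> Iterator[dict]:
    """Extract: read raw rows from a CSV source."""
    yield from csv.DictReader(io.StringIO(source))

def transform(rows: Iterable[dict]) -> Iterator[dict]:
    """Transform: drop incomplete rows and normalize the rest."""
    for row in rows:
        if not row["amount"]:
            continue  # cleansing step: skip rows with a missing amount
        yield {
            "customer_id": int(row["customer_id"].strip()),
            "amount": float(row["amount"]),
            "currency": row["currency"].upper(),
        }

def load(rows: Iterable[dict], target: list) -> int:
    """Load: append cleaned rows to the target store and return the count."""
    count = 0
    for row in rows:
        target.append(row)
        count += 1
    return count

warehouse: list = []  # stand-in for a real data store
loaded = load(transform(extract(RAW_CSV)), warehouse)
print(loaded, warehouse)
```

Because each stage is a generator, rows flow through one at a time, which is roughly how a streaming pipeline keeps a dataset current without reprocessing everything by hand.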

VentureBeat: You mentioned that an issue enterprises are facing is juggling disparate data pipelines. What about the decisions they’re having to make regarding on-premises solutions versus in the cloud? How are you helping them to approach and tackle that problem?

Baker: I would say a couple of things. One, from an operational standpoint, we are working to build and establish Dell Technologies as a leader in hybrid multi-cloud — mostly through VMware. VMware today has over 600,000 customers and millions of clusters, in addition to third-party integrations that allow customers to access and manage instances from alternative cloud providers.

So again, we’re enabling the hybrid multi-cloud, and we’re doing that through a tool and capability that the vast majority of IT folks are already using for on-premises workloads. Quite simply, we’re extending it to manage stuff in the cloud as it relates to data center workloads. We see a number of customers who are experimenting with different frameworks, and the frameworks typically are tied to different implementations of AI and ML acceleration — Google’s TensorFlow being the one people bring up most often. They’re looking to do experiments with these frameworks through hybrid multi-cloud instances that offer different capabilities.

From our perspective, we’re a bit of an infrastructure company. What we want to do is make these capabilities available to our customers in the most seamless way possible, and that’s what we’re building out through our hybrid multi-cloud solutions with VMware.

That being said, we see an increasing number of customers looking to leverage datasets that are already on-premises in an offline manner. The reason is, data management can be cost-prohibitive in the cloud. And frankly, it’s just slow. If you’re looking for real-time insight, you have to gather the data up into a dataset that is nearby so that it can be operated on in real time.

VentureBeat: I’d like to shift gears a bit and talk about privacy and transparency. When you’re dealing with all this data — and then sometimes it’s customer data — privacy concerns emerge. Could you talk about what Dell EMC’s approach to transparency is, and how you’re keeping that in mind with your solutions?

Baker: This is a broader industry challenge. You mentioned transparency, but there are a number of other points that are important.

We have so many unfortunate examples of bias in AI, for example. AI, at its core, is really just human thinking codified into algorithms, and those algorithms can capture and amplify the bias of programmers.

The other issue, of course, is that by using data, you’re spreading it around. So it’s not only a transparency issue, which I think is something that requires a code of conduct or a strong point of view on the ethics of how you’re utilizing data that you captured. That’s not something that a company like Dell can solve for our customers other than bringing it to the attention of those working to implement it.

The flip side of that is that once you start using data a lot, there’s suddenly a lot of data lying around. One of the big challenges I mentioned around managing data pipelines is around data pipeline governance, like who has access to it. A lot of data is by definition customer data, so it is regulated data to a degree. So a data integration platform that handles things like anonymization through a governance or policy platform is something we’re building into our tools.

It’s ultimately a question of governance, and how you solve for the governance problem as you proliferate the use of data that is largely gathered by your interactions with customers. Customers want to trust that you’re using their data in an appropriate way, and not exposing their data to the people who might use it in nefarious ways.
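As a rough illustration of what policy-driven handling of customer data can look like, here is a minimal Python sketch. It is not drawn from any Dell tool: the policy table, field names, and key are hypothetical. Note also that a keyed hash is pseudonymization rather than full anonymization, since anyone holding the key can recompute the mapping.

```python
import hashlib
import hmac

# Hypothetical policy: field name -> action ("keep" or "pseudonymize").
# Any field not listed here is dropped.
POLICY = {"email": "pseudonymize", "amount": "keep"}
SECRET_KEY = b"example-key-for-illustration-only"  # assumption: managed elsewhere in practice

def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace a value with a keyed hash so records stay joinable but unreadable."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def apply_policy(record: dict, policy: dict = POLICY) -> dict:
    """Return a copy of the record with the policy applied."""
    cleaned = {}
    for field, value in record.items():
        action = policy.get(field, "drop")  # default-deny for unlisted fields
        if action == "keep":
            cleaned[field] = value
        elif action == "pseudonymize":
            cleaned[field] = pseudonymize(str(value))
        # any other action, including "drop", omits the field
    return cleaned

record = {
    "email": "a@example.com",
    "card_number": "4111111111111111",
    "amount": 42.5,
    "notes": "vip",
}
safe = apply_policy(record)
print(sorted(safe))
```

Dropping unlisted fields by default means a new column added upstream stays out of downstream datasets until someone makes an explicit policy decision about it.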