REDWOOD CITY, Calif. — Big data is thrown around a lot as jargon, but some powerful case studies out there show how data is reshaping industries.

Today at VentureBeat’s DataBeat/Data Summit, executives from six companies from health care, fashion, education, media, transportation, and business shared examples of how they are using data to create opportunities that never existed before — and create a more personalized experience for their customers.

Icahn School of Medicine at Mt. Sinai Health Care Foundation

Joel Dudley is the director of biomedical informatics at Mt. Sinai, which is the largest private health care system in New York State. Its 6,000 physicians conduct over 3.4 million patient visits a year.

Mt. Sinai’s CEO recently made a $100 million commitment to integrate data and genomics into every aspect of its organization to create more precise, positive patient outcomes. It brought in 100 informaticians and data scientists to help doctors and nurses make data-driven decisions.

“No one is more aggressive or committed to changing health care into a data-driven, science-driven practice,” Dudley said onstage. “We need to be more predictive and understand more precisely where the patients fit into our data universe, what the optimal treatments are for them to achieve the best outcomes, and how to keep them healthy the longest.”

Dudley talked about the problem of “data exhaust” in health care — a huge amount of data comes in, but it’s never stored or looked at and “goes up into the air.” Mt. Sinai aims to use this data to create a broader view of individual patients as well as to create an overarching view of its entire population to identify trends and patterns.

Genomics plays a big part in this. Mt. Sinai has collected genomic data for 25,000 patients and combines this with electronic medical record (EMR) information and lab results to create what Dudley described as a “new taxonomy of disease.”

He gave the example of Type 2 diabetes. By creating a map of diabetes cases, along with other genetic markers, physicians can get a more nuanced picture of their patients and prescribe treatment accordingly.


Stylitics

Stylitics gathers more data than ever before on women’s closets. Cofounder and CEO Rohan Deuskar said that despite decades of research and millions of dollars, brands, retailers, and publishers still don’t have access to the answers to three key questions: where women shop, what they buy, and how they wear it.

“This is an impossible dataset, but it is really valuable for the fashion industry,” he said. “This bizarre situation emerged where retailers base marketing, recommendations, and offers based on 5 percent of the information about you, but they miss the other 95 percent of your purchasing behavior. Getting this data was the challenge we set out to solve.”

Deuskar described this type of information as “best friend data” — a complex and intimate source of information that is only available from consumers themselves.

Stylitics aims to acquire this information through its “digital closet platform.”

Women enter the clothing they wear into Stylitics, with information about brand, color, style, and price. In return, Stylitics gives this raw data back to users in a more organized, accessible form. The application makes all your clothes sortable and searchable (like Cher’s closet in Clueless), offers a smart style assistant, and makes personalized shopping recommendations.

However, the real impact is on the retail side. Merchants can use Stylitics’ database to access information about specific population segments and demographics and find out what is in the closets of the consumers they are trying to reach.

“This is the largest stream of outfit and purchase data out there, and this is shifting how brands and retailers think about consumer insights,” Deuskar said. “If they want to know what shades of pink teenage girls are likely to wear for athletic attire, they can do it without spending the time or money on traditional surveys or market research. This is turning the fat, long tail of the fashion industry on its head.”

Apollo Education Group

Data is the only way the University of Phoenix can give every one of its 300,000 students the attention they need.

The University of Phoenix was one of the first online universities and is currently one of the largest educational providers in the world. It has served 2.5 million students over the past 20 years and at its peak saw 500,000 students enrolled at one time. It currently has 36,000 staff members.

Rob Wrubel is the chief innovation officer for University of Phoenix parent company Apollo Education Group. He said the organization relies on data to figure out the when, where, and how of engaging with its students.

“The University of Phoenix is aimed at working adults, many who have been out of the school system for many years and carry significant challenges,” Wrubel said. “They are juggling time, lives, and finances, and we are under significant pressure to drive outcomes for our students, improve retention, engage in academic success, and keep our costs down.”

The university assigns multiple counselors and support staff members to every student, but it’s impossible to regularly check in with every one about their academics and finances.

Using data, the University of Phoenix can identify which students need interventions, when, and for what, and allocate its resources accordingly. It is also working on adaptive learning technology to tailor classes to each student’s needs.

“Our content and learning variability is enormous,” Wrubel said. “We have 1,700 different courses, tens of thousands of pieces of content, and multiple types of learners coming from very different backgrounds from very different life circumstances. Data provides a low-cost, high-return way to preserve our students and pull out insights about complex behaviors.”


Ooyala

Sean Knapp is a founder and the chief product officer for Ooyala, an online video company.

Knapp started his talk with a striking fact: The average American spends almost as much time watching TV as at work.

That adds up to an enormous waste of time — or a huge marketing opportunity, depending on which side of the screen you’re on. But the problem is that TV broadcasters actually know little about their audience; their best information is based on Nielsen-like sampling.

That’s quickly changing, however, with the shift to online video delivery — which will in turn generate an enormous amount of data and many opportunities for media companies to optimize what they deliver to you, how many ads to show and what kind, and more.

In the next two years, 2.2 billion people will watch online video. Already, 27 percent of adults watch videos on things besides their TVs every day, Knapp said.

“We’re now collecting data from hundreds of millions of users every day, week, and month and putting it back into a system where we have to do something with it,” Knapp said. “Ultimately, the biggest opportunity that we see in the TV industry is, how do we optimize this?”

For example, Netflix recommends TV shows and movies for you to watch based on what you are actually watching — a far better and more personalized, data-driven approach than Nielsen polling.

In the future, advertisers can take advantage of similar data streams to optimize what they show you. Currently, the average hour of broadcast TV includes 16 minutes of ads, regardless of who you are or what you’re watching.

But with online video, advertisers can individualize their ads to your situation — for instance, a channel might show you no ads at all during the first hour that you watch “Breaking Bad” to help ensure that you get hooked on the show without interruption.

Ultimately, Knapp envisions a data-driven video future where “there are no channels any more … what you get is a finely tuned and personalized experience.”


Splunk

Splunk works with thousands of enterprise customers, including most of the Fortune 100 and various government agencies, to analyze and interpret their data. Chief strategy officer Stephen Sorkin narrowed the scope of his case study to “planes, trains, and automobiles.”

“We are working with Ford to build its OpenXC platform, where you collect data off cars,” he said. “You will see more of this type of offering from more car manufacturers, as they measure all sorts of attributes about what the car and driver are doing. We are also using data to solve problems, like making sure telephones don’t cut out when flying over the ocean.”

Splunk’s mission is to “make machine data accessible, usable, and valuable to everyone.” This is an increasingly relevant issue with the rise of the Internet of things and advancements with connected cars — two of the hottest trends of the year.

Connected vehicles and devices can generate a tremendous amount of data, and as Dudley said before, this can turn into “exhaust” (a more apt analogy for cars, perhaps?). Splunk’s goal is to make this data actionable so the data coming from planes, trains, and automobiles is useful to both consumers and businesses.


Verizon

Ashok Srivastava said he had his dream job at NASA before he took the position of chief data scientist at Verizon. He said his interest lies in using data and technology to create a valuable social impact.

“We have a massive network that processes 5 petabytes of data per day,” he said. “What can be done with this data to improve the public good and drive revenue for the company?”

Srivastava discussed using machine learning and real-time data to make decisions when a natural disaster strikes. He also talked about Verizon’s Precision Marketing tool, which pulls data from its massive wireless network to provide “360 degree” views on target audiences. This enables advertisers to target mobile users using criteria such as ZIP codes, demographics, and interests.

Srivastava also talked about making data a core part of strategy at a large company like Verizon.

“We are really facing a brand-new direction at the company,” he said. “We have different people with different views and observations. You have to take everyone’s opinion and meld it with your own strategy. In a company of 200,000 employees, each has a different view on how data should be used and you need to work in an agile framework.”

Additional reporting from Dylan Tweney.