Data literacy deep dive: An introduction to AI, ML and prediction literacy

Data, AI, ML and prediction literacy are fundamental skills in a world where your personal data, and the preferences and biases hidden in that data, are being used to influence your behaviors, beliefs, and decisions. It’s not just corporations that need this training. Data literacy should be taught in middle schools, in high schools, in universities and even in adult education and nursing homes.

In the first article of this two-part series, I introduced the four stages of the Data Literacy Educational Framework, a framework that organizations, universities, high schools, and even adult education programs can use to create more holistic data literacy training. In that article, I discussed the first two stages:

Figure 1: Data Literacy Education Framework

Now I want to complete the framework by discussing the third (AI/ML literacy) and fourth stages (prediction and statistical literacy) of the Data Literacy Education Framework.

3. AI/ML literacy

My article “The Growing Importance of Data and AI Literacy – Part 2” broadened the data literacy conversation by introducing AI (artificial intelligence) and ML (machine learning) literacy; that is, an introduction to how AI and ML models work.

AI/ML Literacy is understanding how AI/ML models work: how they seek to optimize the KPIs and metrics that make up the AI/ML Utility Function (against which the model measures decision effectiveness) as they continuously learn and adapt from interactions with their environment.

An AI model seeks to optimize its AI Utility Function – the KPIs and metrics against which the AI model’s progress and success will be measured – as the AI model interacts with its environment. The AI Utility Function provides positive and negative feedback to the AI model (using stochastic gradient descent and backpropagation) so that the AI model can continuously learn and adapt its operations in the search for making the “right” or “optimal” decisions or actions.
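The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the "model" is a single weight `w`, the utility is the negative of mean squared error, and the update is plain (full-batch) gradient descent rather than the stochastic variant, with no backpropagation needed because there is only one parameter. The function name `train` and the sample data are hypothetical.

```python
def train(xs, ys, learning_rate=0.05, epochs=200):
    """Fit y ~ w * x by gradient descent on mean squared error."""
    w = 0.0  # the model starts out knowing nothing
    n = len(xs)
    for _ in range(epochs):
        # Feedback signal: gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Adapt: move w in the direction that reduces the error.
        w -= learning_rate * grad
    return w


# Hypothetical observations generated by the rule y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
learned_w = train(xs, ys)
print(round(learned_w, 4))
```

Each pass measures how far the model's outputs are from the desired outcomes and nudges the weight accordingly, which is the "learn and adapt" cycle in miniature.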

Figure 2: How Artificial Intelligence (AI) Works

The AI model is trained and learns through the following process:

The AI model seeks to maximize “rewards” based upon the definitions of “value” as articulated in the AI utility function.

The AI utility function assigns values to the actions that the AI system can take. An AI model’s preferences over possible actions can be captured by a function that maps each action to a utility value; the higher the value, the more the AI model likes that action. In terms of AI literacy, defining the AI utility function is critical to AI model operational effectiveness and relevance because AI systems are basically dumb systems that will continuously seek to optimize around the variables and metrics that are defined in the AI utility function.
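To make that concrete, here is a minimal sketch, assuming the utility function is a simple weighted sum of KPIs. The action names, KPI names, scores, and weights are all hypothetical and chosen only to show how the choice of weights changes which action the model "likes."

```python
def utility(outcome, weights):
    """Weighted sum of the KPI scores predicted for one action."""
    return sum(weights[kpi] * outcome[kpi] for kpi in weights)


def best_action(predicted_outcomes, weights):
    """Return the action with the highest utility value."""
    return max(
        predicted_outcomes,
        key=lambda action: utility(predicted_outcomes[action], weights),
    )


# Hypothetical predicted KPI scores (0 to 1) for two candidate actions.
predicted_outcomes = {
    "discount_offer": {"revenue": 0.6, "customer_satisfaction": 0.9},
    "full_price": {"revenue": 0.9, "customer_satisfaction": 0.4},
}

# Two hypothetical utility functions over the same KPIs.
revenue_only = {"revenue": 1.0, "customer_satisfaction": 0.0}
balanced = {"revenue": 0.5, "customer_satisfaction": 0.5}

print(best_action(predicted_outcomes, revenue_only))
print(best_action(predicted_outcomes, balanced))
```

The model itself is unchanged between the two calls; only the definition of "value" differs, and the preferred action flips with it.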

4. Prediction (and statistical) literacy

A prediction is a statement about the likelihood of a future event.

Predictions are natural, everyday occurrences. We watch the news for predictions about tomorrow’s weather. We use GPS apps for predictions about how long it’ll take to drive to the movie theater. We read columns from sports experts who provide predictions about whether your favorite sports team will win. And in each of these situations, a human or machine “expert” is blending the patterns, trends, and relationships buried in the historical data with current operational, environmental, financial, and societal data to make that prediction.

Prediction Literacy is understanding how we leverage patterns, trends, and relationships to try to make predictions about what is likely to happen so that we can make more accurate decisions.

We inherently know that how people or devices performed in the past is highly predictive of how those people and devices will perform in the future. Look no further than the infield shift in baseball, where coaches position their infielders where the batter is predicted to be most likely to hit the baseball.

Figure 4: Infield Shift in Baseball Based upon Batter Hitting Predictions
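The reasoning behind a shift can be sketched as a simple frequency count. The following is an illustration only, assuming a batter's past ground balls have been labeled by infield zone; the zone names and counts are invented for the example, and real teams use far richer models.

```python
from collections import Counter


def zone_probabilities(past_hits):
    """Estimate the probability of each zone from historical frequencies."""
    counts = Counter(past_hits)
    total = len(past_hits)
    return {zone: count / total for zone, count in counts.items()}


# Hypothetical history: 20 ground balls from one batter.
past_hits = ["right"] * 14 + ["middle"] * 4 + ["left"] * 2

probs = zone_probabilities(past_hits)
most_likely_zone = max(probs, key=probs.get)
print(most_likely_zone, probs[most_likely_zone])
```

Positioning fielders toward `most_likely_zone` does not guarantee an out; it only improves the odds based on the pattern in the historical data.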

And while the SEC warns investors that a fund's past performance does not necessarily predict future results, we also know that well-managed funds over time outperform poorly managed funds (and hopefully direct our investments accordingly and not invest in that latest, hot financial trend).

This next section will likely make folks cringe a bit: to better achieve Prediction Literacy, we are going to take a quick primer on the basics of statistics. Sorry about that.

Key statistical concepts

Statistics is the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

We inherently know that predictions about the future are never 100% accurate. Making predictions about what is likely to happen is based upon probabilities, confidence levels, and confidence intervals.

Probability is the likelihood (from 0% to 100%) that something is going to happen or that something is true.

For example, the probability of Barry Bonds getting a hit in his 2004 season with the San Francisco Giants was 36.2% (36.2 hits for every 100 at-bats), and his probability of getting on base when he came to the plate that same season was 60.9% (reaching base about 60.9 times for every 100 plate appearances...which is absolutely a stunning statistic).
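Both percentages are just ratios of counts. As a rough check, the sketch below uses Bonds' 2004 season totals as they are commonly reported; those counts do not appear in this article, so treat them as assumptions. On-base percentage is computed with the standard formula, which divides times on base by at-bats plus walks, hit-by-pitches, and sacrifice flies.

```python
# Commonly reported 2004 season totals (assumed, not taken from the article).
hits, at_bats = 135, 373
walks, hit_by_pitch, sacrifice_flies = 232, 9, 3

# Probability of a hit in a given at-bat.
batting_average = hits / at_bats

# Probability of reaching base in a given trip to the plate.
on_base_pct = (hits + walks + hit_by_pitch) / (
    at_bats + walks + hit_by_pitch + sacrifice_flies
)

print(f"{batting_average:.1%}", f"{on_base_pct:.1%}")
```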

Since predictions fall within a range (because predictions are never 100% certain), we leverage the variance in the data to construct confidence intervals at a chosen confidence level.

Variance measures the variability of the numbers or observations from the average or mean of that same set of numbers or observations.

Confidence level is the percentage of times you expect to reproduce an estimate between the upper and lower bounds of the confidence interval.

Confidence interval is the range of values that you expect your estimate to fall between a certain percentage of the time if you run your experiment again or re-sample the population in the same way.

Figure 5: Averages + Variances Yield Confidence Intervals
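The three definitions above fit together as follows. This is a minimal sketch, assuming a small hypothetical sample of drive times and a normal approximation (z = 1.96 for a 95% confidence level). With a sample this small, a t-distribution would give a somewhat wider interval; the normal approximation is used here only to keep the example short.

```python
import math
import statistics


def confidence_interval(sample, z=1.96):
    """Return (mean, sample variance, (lower, upper)) for the sample mean."""
    mean = statistics.mean(sample)
    variance = statistics.variance(sample)  # sample variance, divides by n - 1
    standard_error = math.sqrt(variance / len(sample))
    margin = z * standard_error
    return mean, variance, (mean - margin, mean + margin)


# Hypothetical drive times to the movie theater, in minutes.
drive_minutes = [22, 25, 19, 24, 27, 21, 23, 26, 20, 23]

mean, variance, (lower, upper) = confidence_interval(drive_minutes)
print(mean, round(variance, 2), round(lower, 2), round(upper, 2))
```

The more the observations vary, the larger the variance and the wider the interval; a higher confidence level (a larger `z`) widens it as well.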

While statistics is probably no one’s favorite topic (except both my actuarial friends), we need to understand basic statistical concepts so that we can make informed decisions in a world of incomplete and even conflicting information.

For a nice overview of additional important statistical concepts, see “The 8 Basic Statistics Concepts for Data Science” by Shirley Chen.

The importance of critical thinking

Critical Thinking is the judicious and objective analysis, exploration and evaluation of an issue or a subject in order to form a viable and justifiable judgment.

In an age when data and even images can be so easily manipulated, it is important to maintain a healthy skepticism. Here are some simple critical thinking rules that can help you make more informed decisions and avoid catastrophic choices (which still doesn’t explain me being a Chicago Cubs fan).

Figure 6: Critical Thinking and Becoming “Students of Data Science”

AI, prediction and data literacy: Life is about improving the odds before rolling the dice

Data Literacy is an awareness of how our personal data is being used by organizations that apply advanced analytics to uncover our personal preferences and biases and to influence the probabilities around which we make our decisions.

The Data Literacy Education Framework is comprised of 4 subject areas:

Finally, life is about rolling the dice, as there are no guarantees that you’ll get the outcomes you expect. Every time you drive a car, every time you walk across the street, every time you fly in an airplane, you are rolling the dice.

Wearing a seatbelt won’t guarantee that you won’t die in a car accident. Wearing a bike helmet won’t guarantee you won’t get hurt in a biking accident. Getting the COVID-19 vaccination won’t guarantee that you won’t catch COVID-19. It’s all about rolling the dice.

Bottom line: the practical aspect of data literacy is understanding how probabilities work and what we can do with research and analysis to make informed decisions that improve the odds so that when we do roll the dice, we get an outcome we expected and can live with. Your personal success (and ultimately the success of humankind) is highly dependent upon that understanding.

Bill Schmarzo is an author, educator, innovator and influencer with a career that spans more than 30 years.
