Microsoft’s intelligent chatbot Tay behaved badly last week (and this week too), but that shouldn’t have shocked any of us. Clearly, we’re expecting too much of AI too soon.

Flaws are what make us who we are as people — they define us. So it’s a bit of a double standard that we seem to expect no imperfections when we design human characteristics into machines.

Microsoft’s chatbot fiasco should have been predictable. If you put a child into a racist family, you shouldn’t be surprised by how they grow up. Similarly, if you want AI to learn from the people on Twitter who interact with a supposed corporate PR stunt, you should expect it to pick up some radical views. Interestingly, Microsoft has been operating a similarly designed service in China called Xiaoice (“little Bing”), most likely a step toward replacing elements of customer service, and it has proved quite successful. So the fact that the bot went over to the dark side in the U.S. probably says more about the people using the Twitter platform than it does about Tay.

Given enough interactions, chatbots like Tay can come to represent the very worst side of ourselves. And remember: it was a human design choice that allowed Tay to develop in that very direction.

Machines learn quickest from variation and outliers, so in the Microsoft example, the high frequency of extreme comments was disproportionately weighted, skewing Tay’s personality very quickly.
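The skewing effect described above can be sketched in a few lines. This is a deliberately simplified model with invented numbers, not Tay's actual learning rule: an online learner nudges a single "personality" score toward each comment's sentiment, and a small burst of extreme comments moves the estimate far more than a large volume of moderate ones.

```python
def online_update(estimate, observation, lr=0.1):
    """One gradient-style step that moves the estimate toward the observation."""
    return estimate + lr * (observation - estimate)

personality = 0.0                          # start neutral
moderate = [0.1, -0.2, 0.15, 0.05] * 25    # 100 mild comments near zero
extreme = [-5.0] * 10                      # 10 extreme outlier comments

for sentiment in moderate:
    personality = online_update(personality, sentiment)
after_moderate = personality               # still close to neutral

for sentiment in extreme:
    personality = online_update(personality, sentiment)
after_extreme = personality                # dragged sharply negative

# 100 moderate comments barely move the score; 10 extreme ones dominate it.
```

The asymmetry is the point: the update size is proportional to how far an observation sits from the current estimate, so outliers carry outsized weight per interaction.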

Today, we’re really only able to educate AI in one specific area, be it medicine or how to drive a car. This means that we can solve one of these areas extremely well using one AI. The machine would, however, be unable to, say, paint a picture. This is the key: Machines are unable to generalize to a domain they haven’t previously observed.

Despite this, it seems we are still unwilling to forgive AI when it all goes wrong, even though human decision making is itself incredibly unreliable and biased.

Bayesian freedoms

The current limits of AI come down to the way we think about and teach mathematics and technology, neither of which has fundamentally changed since the ’70s. Luckily, we now have a newer statistical learning paradigm at work, Bayesian statistical theory, which we’ve only been able to implement during the last few years thanks to advances in simulation methods. It allows us to state what we believe as statistical distributions of belief and then let data mold those assumptions, marrying the human mind with the machine in a sense. Essentially, a Bayesian inference engine can take every potential scenario into account by learning iteratively.
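The "state a belief, then let data mold it" idea can be shown with the simplest Bayesian update there is. In this toy example (illustrative numbers, and a conjugate case simple enough to need no simulation), a Beta distribution encodes a prior belief about a success rate, and observed outcomes reshape it into a posterior:

```python
def update_beta(alpha, beta, successes, failures):
    """Beta(alpha, beta) prior + binomial observations -> Beta posterior."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Prior belief: the rate is around 50%, held weakly (small pseudo-counts).
alpha, beta = 2.0, 2.0
prior_mean = beta_mean(alpha, beta)        # 0.5

# Observe 70 successes in 100 trials; the data pulls the belief toward 70%.
alpha, beta = update_beta(alpha, beta, 70, 30)
posterior_mean = beta_mean(alpha, beta)    # 72/104, roughly 0.69
```

The human assumption (the prior) sits explicitly in the mathematics, and more data progressively outweighs it, which is exactly the iterative learning the paragraph describes.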

It forces human assumptions to be made explicit in the mathematics, reducing the potential for the unintentional human bias that still occurs in scientific research today (the misuse of p-values is an excellent example of this insanity). By explicitly including our own assumptions in the process, we can vastly enhance the way we interact with and shape technological output.

There are three challenges we still need to overcome to fully realize the potential of machine learning:

1. Scalability and storage. Our current best efforts can practically construct artificial neural networks on the order of 10,000 neurons in size. These networks naively simulate how our brains learn and store information. Compare that figure to the human brain, which contains in excess of 100,000,000,000 neurons, and you start to appreciate the discrepancy between the two.

2. Representing knowledge. In a neural network, knowledge is represented by the strength of the connections between neurons. The problem is that connections between real neurons are vastly more complex than a single connection strength. And here is where we need to up the science: we can throw all the machine power in the world at creating intelligence, but if the manner in which knowledge is represented is not suitable, we don’t stand a chance, as problems will only be magnified.

3. Exploring consequences. Once we have a model that can scale to any task we give it, we need to explicitly explore its consequences. Put simply, we need to explore every possibility, beginning with our best guess. So if we’re creating a diagnostic AI, we would take the 100 best doctors, find out their approach to a set list of symptoms, and express those approaches as prior probability distributions over diagnostic performance. Using their answers as a base, we can employ Bayesian inference to cross-pollinate methods and arrive at an outcome that draws on all of them, essentially taking the DNA from the best, replicating it constantly, and adapting it with new data so it continues to improve.
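One simple way to realize step 3 is to encode each expert's judgment as a prior and pool them before letting data take over. This sketch is hypothetical throughout (three invented doctors rather than 100, and pooling by summing pseudo-counts, which is only one of several possible combination rules):

```python
# Each doctor's belief about a treatment's success rate as Beta(alpha, beta);
# the implied mean is alpha / (alpha + beta). All numbers are invented.
expert_priors = [(8.0, 2.0), (6.0, 4.0), (7.0, 3.0)]

# Pool the experts by summing their pseudo-counts into one combined prior.
alpha = sum(a for a, _ in expert_priors)   # 21.0
beta = sum(b for _, b in expert_priors)    # 9.0
pooled_mean = alpha / (alpha + beta)       # 0.7

# New patient outcomes adapt the pooled belief (conjugate Beta update),
# so the combined expert knowledge keeps improving as data arrives.
successes, failures = 40, 20
alpha += successes
beta += failures
updated_mean = alpha / (alpha + beta)      # 61/90, roughly 0.678
```

The pooled prior plays the role of the experts' "DNA": it anchors the model's starting point, while every new observation replicates and adapts that combined knowledge.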

We need to gather the finest minds to educate our machines in their given specialty. The potential is enormous, but clearly we have a long way to go before we’ve cracked a truly intelligent bot.

Michael Green, PhD, is chief analytics officer at AI media platform Blackwood Seven.