Eager for AI? First you have to train it

Artificial intelligence is a technology at heart, but the way it integrates into the enterprise data ecosystem is unlike any tech that has come before. For one thing, AI will be able to do very little right out of the box. To get it to work properly, you have to train it, and it seems that few organizations fully comprehend what a lengthy and complex job that is.

In many ways, we can draw parallels between the introduction of AI today and the introduction of the consumer PC in the early '80s. The PC, after all, was going to remake life as we know it by managing our budgets, organizing our bills, keeping our shopping lists, helping with homework, and delivering a cornucopia of other surprises. What they didn't tell us was that we had to perform a little task called data entry before the computer could do all of these magical things. And before long, virtually every house in the developed world had a PC in the corner of the living room gathering dust.

Teaching the bot

AI is not likely to suffer the same fate because it will (or should) have a team of dedicated professionals whose job it is to make it work. But the training process will still take some time, and it may be a while before it produces even marginal results.

On Medium, writer Sherise Tan says the speed and efficacy of the training process come down to four key factors: hardware, optimization, the number of layers in the neural network, and the size of the dataset. The better the hardware, and the more its components can work as a single entity, the easier the process will be. A more complicated neural network and more data to crunch will tend to slow things down. In the end, though, training consists of positive and negative reinforcement -- rewarding the model for correct answers and discouraging incorrect ones.
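One loose way to picture that kind of reinforcement is the classic perceptron update rule, in which a model's weights are left alone when it answers correctly and nudged when it answers incorrectly. The Python sketch below is only an illustration under stated assumptions: the function names, the toy logical-OR dataset, and the learning rate are all hypothetical, and real enterprise training pipelines are far more elaborate.

```python
# Toy perceptron: a minimal sketch, not any vendor's training method.
# Weights change only when the model gets an answer wrong.

def predict(weights, bias, x):
    """Return 1 if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, lr=0.1):
    """Repeatedly show the samples to the model and correct its mistakes."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, label in samples:
            # error is 0 for a correct answer, +1 or -1 for a wrong one
            error = label - predict(weights, bias, x)
            if error != 0:
                mistakes += 1
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
        if mistakes == 0:
            break  # every sample answered correctly; stop early
    return weights, bias


# Hypothetical toy dataset (an assumption for illustration): logical OR.
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_perceptron(OR_DATA)
```

Even on four data points the model needs several passes before it stops making mistakes, which hints at why larger networks and larger datasets take so much longer.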

The initial training is also only the first step in the process. It must be backed up by validation and testing, and each step requires multiple cycles with constant adjustment of parameters to ensure that the AI keeps making accurate predictions.
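The train, validate, and test cycle can be sketched in a few lines of Python. This is a simplified illustration, not a description of any particular company's workflow: the synthetic data, the simple linear model, the candidate learning rates, and the 200/50/50 split are all assumptions chosen to keep the example small. Each candidate setting is trained on one slice of the data, compared on a second slice, and the winner is scored once on a third slice it has never seen.

```python
import random


def make_data(n, seed=0):
    """Synthetic data (an assumption): y = 2x + 1 plus a little noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data


def mse(params, data):
    """Mean squared error of the line y = w*x + b on the given data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, lr, steps=200):
    """Fit w and b with plain gradient descent on the training slice."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


data = make_data(300)
train_set, val_set, test_set = data[:200], data[200:250], data[250:]

# Validation: try several settings and keep the one with the lowest error.
best_lr, best_params, best_val = None, None, float("inf")
for lr in (0.001, 0.01, 0.1):
    params = train(train_set, lr)
    val_error = mse(params, val_set)
    if val_error < best_val:
        best_lr, best_params, best_val = lr, params, val_error

# Testing: score the chosen model once on data it has never seen.
test_error = mse(best_params, test_set)
```

Keeping the test slice out of the tuning loop is the point of the exercise: if the same data were used to choose the setting and to grade it, the final score would flatter the model.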

When it comes to hardware, some companies are not waiting for traditional manufacturers to develop the right products for AI training. Tesla, for example, recently unveiled a new processor containing 50 billion transistors specifically to run training cycles for its own AI programs. The D1 chip, built for the company's Dojo training system, delivers more than 360 teraflops of computing power from a mesh of 64-bit CPUs on a die measuring 645 square millimeters, which is pretty large as far as chips go. Apparently, the company decided to create its own device after determining that commercial offerings by Intel, Nvidia, and others did not suit the unique demands of its AI-driven processes.

The wrong lesson?

Compute power alone will not make AI a success, however. The way in which the training is conducted will have the greater impact, and so far most training methods are seriously flawed, according to a team of researchers at Google. As lead researcher Alex D'Amour explained to MIT Technology Review, the fundamental problem is that the data used in training is rarely, if ever, adequate to guide AI through a real-life situation. As a result, AI can pass its training cycles and then fail in practice, and fail in ways that neither the AI nor its human operators would notice. This could have devastating consequences for applications ranging from transportation to medical imaging.

What's needed, says author and AI researcher Melanie Mitchell, is a way to get AI to think in analogies, as a human brain does. As she explained to Quanta Magazine recently, when people encounter situations that are new to them, they use analogies to past experiences to work them out. By building AI training on logic and programming, we can teach a neural network to recognize a picture of a bridge but not to comprehend the abstract nature of other forms of the word "bridge," as in "to bridge the gender gap." Without that ability, she says, AI cannot provide the predictive, common-sense outputs that we've come to expect.

Right now, the idea of abstract training is at a very nascent stage. But Mitchell argues that, if successful, it will not only create better, more valuable forms of AI but will also simplify the training process itself, because we will no longer need thousands upon thousands of data sets to convey relatively simple ideas and concepts.

Whether an intelligence is artificial or biological, training it is no easy task -- just ask any school teacher. After all, it takes a good 16 years or so to train the human brain to perform entry-level tasks at most companies these days.

AI can absorb a lot of information in a short period of time and then act on what it has learned, but this is a far cry from actual intelligence. Enterprise executives would do well to remember that no matter how much training an AI has received, or how smart it appears to be, it's still just an algorithm.
