Jeff Hawkins and Donna Dubinsky started Numenta nine years ago to create software that was modeled after the way the human brain processes information. It has taken longer than expected, but the Redwood City, Calif.-based startup recently held an open house to show how much progress it has made.
Hawkins and Dubinsky are tenacious troopers for sticking with it. Hawkins, the creator of the original Palm Pilot, is the brain expert and co-author of the 2004 book “On Intelligence.” Dubinsky and Hawkins had previously started Handspring, but when that ran its course, they pulled together again in 2005 with researcher Dileep George to start Numenta. The company is dedicated to reproducing the processing power of the human brain, and it shipped its first product, Grok, earlier this year to detect unusual patterns in information technology systems. Those anomalies may signal a problem in a computer server, and detecting the problems early could save time.
While that seems like an odd first commercial application, it fits into what the brain is good at: pattern recognition. Numenta built its architecture on Hawkins’ theory of Hierarchical Temporal Memory, about how the brain has layers of memory that store data in time sequences, which explains why we easily remember the words and music of a song. That theory became the underlying foundation for Numenta’s code base, dubbed the Cortical Learning Algorithm (CLA). And that CLA has become the common code that drives all of Numenta’s applications, including Grok.
Hawkins and Dubinsky said at the company’s recent open house that they are more excited than ever about new applications, and they are starting to have deeper conversations with potential partners about how to use Numenta’s technology. We attended the open house and interviewed both Hawkins and Dubinsky afterward. Here’s an edited transcript of our conversations.
Above: Numenta cofounders Donna Dubinsky and Jeff Hawkins
Image Credit: Dean Takahashi
VentureBeat: I enjoyed the event, and I was struck by a couple of things you said in your introduction. Way back when, you wrote the book On Intelligence. You started Numenta. You said that you’d been studying the brain for 25 years or so. It seemed to me that you knew an awful lot about the brain already. So I was surprised to hear you say that we didn’t know much of anything at all about the way the brain works.
Jeff Hawkins: Well, in those remarks I gave at the open house, I meant to say that I’d been working at this for more than 30 years, and that when I started, all those years ago, we knew almost nothing about the brain. But it wasn’t that we’ve known nothing in the last 10 years or something. Tremendous progress has been made in the last 30 years.
VB: At the beginning of Numenta, if you look back at what you knew then and compare it to what you know now, what’s different now?
Hawkins: If you go back to the beginning of Numenta, our state of understanding was similar to what I wrote about in On Intelligence. That’s a good snapshot. If you look in the book, you’ll see that we knew the cortex is a hierarchy of thinking. We knew that we had to implement some form of sequence memory. I wrote about that. We knew that we had to form common representations for those sequences. So we knew a lot of this stuff.
What we didn’t know is exactly how the cortex learns and does things. It was like, yeah, we’ve got this big framework, and I wrote about what goes into the framework, but getting the details so you can actually build something, or understand exactly how the neurons are doing this, was very challenging. We didn’t know that at the time. There were other things, like how sensory, motor, these other systems work. But the big thing is we didn’t have a theory that went down to a practical, implementable, testable level.
Above: Numenta cofounder Jeff Hawkins
Image Credit: Dean Takahashi
VB: I remember from the book, you had a very interesting explanation of how things like memory work. You said that you could remember the words to a song better because they were attached to the music. The timing of the music is this kind of temporal memory. Is that still the case as far as how you would describe how the brain works, how you can recall some things more easily?
Hawkins: The memory is a temporal trait, a time-based trait. If you hear one part of something you’ll recognize the whole thing and start playing it back in your head. That’s all still true. Again, we didn’t know exactly how that worked.
It turns out that, in the book, I wrote quite a bit about some of the attributes that this memory must have. You mentioned starting in the middle of a song and picking it up. Or you can hear it in a different key or at a different tempo. We didn’t know exactly how we do that.
When we started Numenta, we had a list of challenges related to sequence memory and prediction and so on. We worked on them for quite a few years, trying to figure out how you build a system that satisfies all these constraints. That’s very difficult. I don’t think anyone has done it. We worked on it for almost four years until we had a breakthrough, and then it all came together in one fell swoop, in just a matter of a few weeks. Then we spent a lot of time testing it.
VB: Can you describe that platform in some way, this algorithm?
Hawkins: The terminology we use is a little challenging. HTM refers to the entire overall theory of the cortex. You can take what’s in the book as HTM theory. I didn’t use the term at the time. I called it the memory prediction framework. But I decided to use a more abstract term, “hierarchical temporal memory.” It’s a concept of hierarchy, sequence memory, and all the concepts that play into that theory.
The implementation, the details of how cells do that – which is a very detailed biology model, plus a computer science model – that we gave a different name to. We call that the Cortical Learning Algorithm. It’s the implementation of cells as the components of the HTM framework. It’s something you can reduce to practice. The CLA, you can describe that. Many people have created that, and it works. While the HTM is a useful framework for understanding what the whole thing is about, it doesn’t tell you how to build one. It’s the difference between saying “We’re going to invent a car that has wheels, a motor, consumes gasoline, and so on” – that’s the HTM version – and figuring out how to build an internal combustion engine that really works and that someone can build.
VB: When you talk about the algorithm there, what are you simulating? Is it a very small part of what the brain does?
Hawkins: If you look at the neocortex, it looks very similar everywhere. Yet it has some basic structure. The first basic structure you look at in any neocortex in any animal, you see layers of cells. The layers you see are the same in dogs, cats, humans, mice. They look similar everywhere you go.
What the CLA captures is how a layer of cells can learn. It can learn sequences of different types of things. It can learn sequences of motor behaviors. It can learn sequences of sensory inputs. We’re modeling a section of the layer of cortex. But we’re only modeling a very tiny piece of one. We’re modeling 1,000 to 5,000 nerve cells. But it’s the same process used everywhere. Our models are fairly small, but the theory covers a very large component of what the cortex has to do – the theory of cells in layers. We also now understand how those layers interact. That’s not in the CLA per se, but we now understand how multiple layers interact and what they’re doing with each other. The CLA specifically deals with what a layer of cells does. But we think that’s a pretty big deal.
Above: Numenta’s Grok
Image Credit: Numenta
VentureBeat: It sounded like, when you guys were talking at the outset, that it took longer than you expected to get a commercial business out of all the ideas that you started with.
Donna Dubinsky: That’s fair to say. We knew it would be hard, but I don’t think we anticipated it would take so long to get the first commercial product out there. It’s taken a long time to go through multiple generations of these algorithms and get them to the point where we feel they’re performing well.
VB: Could you explain that, then? The underlying platform is the algorithm. What exactly does it do? It functions like a brain, but what are you feeding into it? What is it crunching?
Dubinsky: It’s modeled after what we believe are the algorithms of the human brain, the human neocortex. What your brain does, and what our algorithms do, is automatically find patterns. Looking at those patterns, they can make predictions as to what may come next and find anomalies in what they’ve predicted. This is what you do every day walking down the street. You find patterns in the world, make predictions, and find anomalies.
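To make the pattern, prediction, and anomaly loop that Dubinsky describes a little more concrete, here is a toy sketch in Python. It is not Numenta's Cortical Learning Algorithm. It is a deliberately simple first-order transition counter, and the class and method names (`ToySequenceMemory`, `observe`, `predict`) are hypothetical, chosen only for illustration. It learns which symbol tends to follow which, predicts the most common follower, and flags an input as anomalous when it contradicts everything seen so far in that context.

```python
from collections import Counter, defaultdict


class ToySequenceMemory:
    """Illustrative first-order sequence memory (not the CLA)."""

    def __init__(self):
        # For each symbol, count the symbols that have followed it.
        self.transitions = defaultdict(Counter)
        self.previous = None

    def predict(self, symbol):
        """Return the most frequently seen follower of `symbol`, or None."""
        seen = self.transitions.get(symbol)
        if not seen:
            return None
        return seen.most_common(1)[0][0]

    def observe(self, symbol):
        """Learn from one input; return True if it was unexpected."""
        anomalous = False
        if self.previous is not None:
            seen = self.transitions[self.previous]
            # Only flag when a prediction existed and this input breaks it.
            anomalous = bool(seen) and symbol not in seen
            seen[symbol] += 1
        self.previous = symbol
        return anomalous


memory = ToySequenceMemory()
flags = [memory.observe(ch) for ch in "abcabcabx"]
print(memory.predict("a"), flags[-1])
```

In this run the repeated "abc" pattern is learned without any flags, and the final "x" is flagged because "b" had only ever been followed by "c". A real HTM model handles far richer context than a single preceding symbol; the sketch only shows the shape of the loop.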
We’ve applied this to a first product, which is the detection of anomalies in computer servers under the AWS environment. But as much as anything it’s just an example of the sort of thing you could do, as opposed to an end in itself. We’ve shown several examples. We’re keen on demonstrating how broadly applicable the technology is across a bunch of different domains.
VB: Are there some benefits to taking longer to get to market that you can cash in on? There are things like Amazon Web Services or the cloud that didn’t exist when you started the company.
Dubinsky: It’s a good point. Certainly having AWS has been fantastic for us. AWS takes and packages the data in exactly the way that our algorithm likes to read it. It’s streaming data. It flows over time. It’s in five-minute increments. That’s a great velocity for us. We can read it right into our algorithm.
Over the years we’ve talked to lots of companies who want to use this stuff, and their data are simply not in a streaming format like that. Everyone talks about big data. The way I think about that is big databases. Everyone’s taking all this data and piling it up in databases and looking for patterns. That’s not how we do it at all. We do it on streaming data, moving over time, and create a model.
We don’t even have to keep the data. We just keep the last couple thousand data points in case the system goes down or something. But we don’t need to keep the data. We keep the model. That’s substantially different from what other people do in the big data world. One of the reasons we went to AWS was it got us around that problem. It was very easy to connect up our product to AWS.
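The idea of keeping the model rather than the data can be illustrated with a small Python sketch. This is an assumption-laden stand-in, not Grok or the CLA: it swaps in a running mean and variance (Welford's online method) with a z-score threshold as the "model," and keeps only a bounded buffer of recent points. The class name `StreamingAnomalyModel`, the threshold of 3.0, the warm-up of 10 points, and the buffer size of 2,000 (standing in for "the last couple thousand data points") are all hypothetical choices.

```python
import math
from collections import deque


class StreamingAnomalyModel:
    """Illustrative streaming detector: keeps statistics, not history."""

    def __init__(self, threshold=3.0, buffer_size=2000, warmup=10):
        self.threshold = threshold  # z-score above which a point is flagged
        self.warmup = warmup        # points to see before flagging anything
        # Bounded buffer of recent points, kept only for recovery purposes.
        self.recent = deque(maxlen=buffer_size)
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Score one new data point against the model, then learn from it."""
        anomalous = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0:
                anomalous = abs(value - self.mean) / std > self.threshold
        # Welford's online update: no stored history is needed.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.recent.append(value)
        return anomalous


# Hypothetical metric readings, e.g. one sample per five-minute interval.
model = StreamingAnomalyModel()
readings = [10, 11, 9, 10, 12, 10, 9, 11, 10, 10, 11, 9, 50]
flags = [model.update(r) for r in readings]
print(flags[-1])
```

Each reading is scored and then folded into three numbers (count, mean, and squared-deviation sum), so memory use stays constant however long the stream runs. The contrast with a batch approach is that nothing here requires querying a stored database of past readings.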
VB: It almost seems that with the big data movement, a lot of corporations are thinking about tackling bigger problems than before. They seem to need more of the kinds of things that you do.
Dubinsky: More and more people are instrumenting more and more things. When I think about where we’re going to fit, it gets closer to the sensors. You have more and more sensors generating information, including people walking around with mobile devices. Those are nodes in a network, in some sense. The more this data comes in to companies and individuals, the more they’re going to need to find patterns in it.
People don’t know what to do with all this data. When you go read all the internet-of-things stuff and ask the people who write it, “How are you going to use the data that this thing on the internet is generating?” they don’t really have good answers for you.
VB: It seemed like the common thread among them was pattern recognition, which is what the brain is good at, right?