Flex Logix unveils neural inferencing engine for AI in datacenters and on the edge

Chip maker Flex Logix today introduced Nmax, a general-purpose neural inferencing engine designed for AI deployment in a range of environments and for use with popular machine learning frameworks like TensorFlow and Caffe. Multiple neural accelerators can be grouped together to achieve higher throughput.

Flex Logix says configurations of its Nmax 512 tiles can outperform other datacenter inferencing products like the Nvidia Tesla T4 when processing batches on the ResNet50 image classification benchmark. The accelerator was also made to carry out real-time detection of multiple objects with models like YOLOv3 to meet growing demand for visual processing on the edge.

"With our architecture, since we load weights very quickly, we get high performance even if batch size equals one, so we're good in the datacenter but we're excellent at the edge," Flex Logix CEO and cofounder Geoff Tate told VentureBeat in a phone interview.

The Nmax series made its debut today at the Linley Processor Conference in Santa Clara, California. It's in production now, and Tate says Nmax engines will be available in late 2019.

The Nmax engine is a departure from previous work by Flex Logix, which has focused primarily on specialized embedded field-programmable gate array (FPGA) chips built for specific tasks for customers like Harvard University, DARPA, and Boeing.

Nmax uses interconnect technology like the kind used in FPGA chips, but it is a general-purpose neural inferencing engine programmed with TensorFlow and designed to run any kind of neural network.

Flex Logix raised a $5 million funding round in May 2017 to explore ways to build more flexible chips.

In addition to being able to quickly process visual information, Tate says, the Nmax requires less DRAM bandwidth and achieves higher multiply-accumulate (MAC) efficiency than many of its competitors, leading to lower energy consumption.

Raw compute power may get a lot of attention, but energy efficiency is another vital part of what it takes to train AI systems, Tate said.

"Whatever people are doing now, in five years, these models are just going to keep getting bigger and more complicated, which means we're going to have to be more tera operations per second (TOPS), but the power constraint won't change. So there's going to be continuing pressure to lower the TOPS per watt in order for the market to expand. If people can't do it, the market won't expand. Of course there's also a cost component, but you need lower costs and lower power, both, to penetrate many of these applications," Tate said.

Flex Logix was founded in 2014 by Tate and cofounder Cheng Wang and is based in Mountain View, California.
