Kneron today announced its latest NPU, the KL730, which the company claims offers up to four times the energy efficiency of its prior models. The new chip is also purpose-built to accelerate GPT and other transformer-based AI models.
Kneron’s silicon is largely targeted at edge applications, such as autonomous vehicles, medical devices and industrial systems, although the company also sees potential for enterprise deployments. Kneron benefits from the backing of Qualcomm and Foxconn and has deployments with Quanta in edge servers.
“An NPU has more cores compared with a GPU,” Kneron founder and CEO Albert Liu told VentureBeat. “The cores are more efficient and they are more focused with nuanced connectivity.”
The technology inside Kneron’s NPUs
Liu argued that a GPU is not a purpose-built device for AI.
“GPU hardware was specifically designed for gaming, and right now it’s just Nvidia trying to brainwash all of us trying to say that only a GPU can do AI,” said Liu.
Nvidia’s GPU technology is, of course, market leading and is the basis on which modern large language models (LLMs) and generative AI are built. Liu doesn’t think it will always be that way, he said, and he’s hopeful his company will carve out an expanded market footprint as organizations increasingly look for ways to meet AI demands.
Kneron’s chips use a reconfigurable AI architecture to accelerate AI, a different approach from the architecture used in a GPU. With the KL730, that architecture has also been specifically optimized for GPT-style transformer-based AI models.
Kneron well-established in the NPU market
The KL730 isn’t Kneron’s first chip optimized for transformers — the company announced the KL530 silicon, which had that capability, two years ago. The original use case for the transformer model in Kneron’s silicon was to help autonomous vehicle manufacturers. Liu said that transformer models can be very helpful in real-time temporal correlation detection use cases.
What wasn’t clear at the time, at least to Liu, was that transformers would become widely used for enabling LLMs and generative AI. To help meet the needs of LLMs, Liu said, his company has made its AI chip larger for GPT-style applications.
“The reconfigurable AI architecture can dynamically change the structure inside the chip to support almost any kind of new model,” Liu said.
The cascading power of the KL730
With the new KL730, Kneron has made some dramatic performance improvements to its NPU silicon.
Liu said that the KL730 has better performance than prior generations and can also be clustered. As such, if a single chip isn’t enough for a specific use case, multiple KL730s can be clustered together in a larger deployment.
While Kneron’s silicon is largely used for inference use cases today, Liu is hopeful that the ability to combine multiple KL730s together will enable broader use of the technology for machine learning (ML) training as well.
“For server applications, Kneron already has customers like Naver, Chunghwa Telecom and Quanta,” said Liu. “Foxconn is one of our strategic investors and they are closely working with us for AI servers.”