Tenstorrent reveals Grayskull, an all-in-one system that accelerates AI model training

Tenstorrent, an AI and machine learning hardware startup based in Toronto, Canada, today emerged from stealth with over $34 million in funding and an all-in-one computer system dubbed Grayskull. Cofounder and CEO Ljubisa Bajic, a former Nvidia senior architect who previously served as director of integrated circuit design at AMD, claims that Grayskull's architecture eliminates unnecessary computation to deliver a performance improvement on today's most-used AI models, allowing data scientists to train sophisticated AI without having to pay through the nose for cloud-hosted resources.

Grayskull, which Bajic says is sampling with partners and is anticipated to be production-ready in fall 2020, enables what's called conditional execution, a technique that facilitates faster AI inference and training and supports the scaling of workloads from datacenters to edge devices. The system features 120 of Tenstorrent's proprietary Tensix cores, each of which comprises a high-utilization packet processor, a programmable single instruction multiple data (SIMD) processor, a dense math computational block, and five reduced instruction set computer (RISC) cores. They're stitched together with a custom torus interconnect, a switch-less network topology for efficiently connecting processing nodes in parallel.
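To illustrate what "switch-less" means in a torus, here is a minimal Python sketch of a two-dimensional torus in which every node links directly to its four neighbors and the edges wrap around. Tenstorrent has not described the layout here: the 10-by-12 grid (chosen only because it yields 120 nodes), the two-dimensional shape, and the function names are all assumptions made for illustration.

```python
# Hypothetical grid: 10 x 12 = 120 nodes. The real Grayskull layout
# is not given in the article; these dimensions are an assumption.
ROWS, COLS = 10, 12


def torus_neighbors(row, col, rows=ROWS, cols=COLS):
    """Return the four direct neighbors of (row, col) on a 2D torus.

    Modular arithmetic makes the edges wrap around, so every node has
    exactly four links and no central switch is needed.
    """
    return [
        ((row - 1) % rows, col),  # up (wraps from the top row to the bottom)
        ((row + 1) % rows, col),  # down
        (row, (col - 1) % cols),  # left (wraps from the first column to the last)
        (row, (col + 1) % cols),  # right
    ]


def torus_hops(a, b, rows=ROWS, cols=COLS):
    """Return the minimum number of link hops between nodes a and b."""
    d_row = abs(a[0] - b[0])
    d_col = abs(a[1] - b[1])
    # In each dimension, take the shorter way around the ring.
    return min(d_row, rows - d_row) + min(d_col, cols - d_col)


if __name__ == "__main__":
    print(torus_neighbors(0, 0))
    print(torus_hops((0, 0), (9, 11)))
```

Under these assumptions, opposite corners of the grid sit only two hops apart thanks to the wraparound links, which is the property that lets a torus connect many nodes without a central switch.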

Grayskull pairs the Tensix array with 120MB of local SRAM and eight channels of LPDDR4 supporting up to 16GB of external RAM, with 16 lanes of PCI-E Gen 4 for host connectivity. When it enters production later this year, it'll have rivals in Qualcomm, whose Cloud AI 100 edge computing card is said to deliver "far greater" than 100 trillion operations per second (TOPS); Tesla, which last April detailed a Samsung-manufactured chipset featuring over 144 TOPS; and Baidu, whose newest Kunlun AI accelerator delivers up to 260 TOPS.

But Grayskull might have a significant performance advantage. In preliminary experiments, Bajic claims the system hit 368 TOPS at the "chip thermal design power set point required for a 75W bus-powered PCIE card." And with conditional execution, Grayskull has been observed processing up to 23,345 sentences per second using Google's BERT-Base model on the SQuAD 1.1 data set, which Tenstorrent says gives it a 26-fold performance advantage over competing solutions.
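For a rough sense of scale, the claimed figures can be turned into back-of-the-envelope ratios. The short Python sketch below uses only numbers quoted in this article; the variable names are illustrative. Dividing by the full 75W card budget is an assumption, since the chip's own power draw is not stated, and the competitor figures are vendor claims that may not have been measured under comparable conditions.

```python
# Figures quoted in the article (vendor claims, not independent measurements).
GRAYSKULL_TOPS = 368
CARD_POWER_W = 75             # bus-powered PCIe card budget; chip power is not stated
BERT_SENTENCES_PER_S = 23_345
CLAIMED_SPEEDUP = 26

RIVAL_TOPS = {
    "Tesla chipset": 144,     # quoted as "over 144"
    "Baidu Kunlun": 260,      # quoted as "up to 260"
}

# Efficiency, assuming the whole 75W card budget is consumed.
tops_per_watt = GRAYSKULL_TOPS / CARD_POWER_W

# Throughput of the unnamed "competing solutions" implied by the 26x claim.
implied_baseline = BERT_SENTENCES_PER_S / CLAIMED_SPEEDUP

# Ratio of claimed peak TOPS against each rival figure.
tops_ratio = {name: GRAYSKULL_TOPS / tops for name, tops in RIVAL_TOPS.items()}

if __name__ == "__main__":
    print(f"{tops_per_watt:.2f} TOPS per watt at the card level")
    print(f"{implied_baseline:.0f} sentences per second implied baseline")
    for name, ratio in tops_ratio.items():
        print(f"{ratio:.2f}x the quoted peak of {name}")
```

On these numbers, the claim works out to a little under 5 TOPS per watt at the card level, and the 26-fold figure implies competing solutions handling roughly 900 sentences per second on the same benchmark.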

Tenstorrent plans to target datacenters, public and private cloud servers, on-premises servers, edge servers, and automotive and other markets. Bajic is scheduled to reveal more at a talk during this year's virtual Linley Spring Processor Conference.

Four-year-old Tenstorrent's other cofounders include Ivan Hamer, a former embedded engineer at AMD, and Milos Tajkovic, previously an AMD firmware design engineer. In addition to its Toronto headquarters, the company has offices in Austin, Texas, and Silicon Valley. It's backed by Eclipse Ventures and Real Ventures, among other investors, who contributed $12.5 million to its series A round in August 2017 and $20.7 million to its series B round in February.
