The technology provides tremendous flexibility as designers seek to “mix and match” intellectual property blocks with various memory and I/O elements in new device form factors. It will allow products to be broken up into smaller “chiplets,” where I/O, SRAM (a kind of fast memory), and power delivery circuits can be fabricated in a base die, and high-performance logic chiplets are stacked on top.
Intel expects to launch a range of products using Foveros, beginning in the second half of 2019. The first such product will combine a high-performance 10-nanometer compute-stacked chiplet with a low-power base. It will enable the combination of world-class performance and power efficiency in a small form factor, Intel said.
Foveros is the next leap forward, following Intel’s breakthrough Embedded Multi-die Interconnect Bridge (EMIB) 2D packaging technology that was introduced in 2018.
Sunny Cove CPU architecture
Intel also showed off Sunny Cove, its next-generation CPU microarchitecture designed to increase performance per clock and power efficiency for general-purpose computing tasks. Sunny Cove includes new features to accelerate special-purpose computing tasks, like AI and cryptography. Intel has previously talked about 14-nanometer chips, like Cascade Lake and Cooper Lake.
Sunny Cove will be the basis for Intel’s next-generation server (Intel Xeon) and client (Intel Core) processors later next year. Sunny Cove features include improved designs that allow it to execute more operations at the same time, in parallel. Ronak Singhal, an Intel fellow, said that Sunny Cove finds ways to make processing wider, deeper, and smarter, with more work done in parallel and larger caches to improve latency.
It will also have new algorithms to reduce latency, or interaction delays; increased size of key buffers and caches to optimize data-centric workloads; and architectural extensions for specific use cases and algorithms.
For example, Sunny Cove has performance-boosting instructions for cryptography, such as vector AES and SHA-NI, and other critical use cases, like compression and decompression.
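To make the cryptography use case concrete, here is a minimal Python sketch of the kind of hashing workload that SHA instructions are meant to speed up. It uses only the standard library's hashlib. Whether hardware instructions are actually used depends on the CPU and on how the underlying crypto library was built, which this code neither controls nor detects. The function name and chunk size are illustrative assumptions, not anything Intel described.

```python
import hashlib


def sha256_hex(data: bytes, chunk_size: int = 64 * 1024) -> str:
    """Hash data in fixed-size chunks and return the SHA-256 digest as hex.

    chunk_size is an illustrative value (an assumption); feeding the hasher
    in chunks mirrors how large files or streams are usually processed.
    """
    hasher = hashlib.sha256()
    for start in range(0, len(data), chunk_size):
        hasher.update(data[start:start + chunk_size])
    return hasher.hexdigest()


if __name__ == "__main__":
    print(sha256_hex(b"abc"))
```

The digest does not depend on the chunk size or on whether the work runs on accelerated hardware; only the time taken would change.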
Sunny Cove also reduces latency, raises throughput, and offers much greater parallelism, which is expected to improve experiences from gaming to media to data-centric applications. A follow-up chip, Willow Cove, will arrive in 2020 with more security features, and another one, Golden Cove, will arrive in 2021.
Meanwhile, Intel continues to work on its low-end processor architecture, Atom, for devices that need good battery life. One chip, Tremont, will debut in 2019. Another, Gracemont, is coming in 2021, and Nextmont should arrive in 2022 or 2023.
Koduri also said that Intel is continuing to pursue high-end AI chip designs and field programmable gate arrays (FPGAs).
Gen11 integrated graphics
Intel unveiled new Gen11 integrated graphics with 64 enhanced execution units, more than double the number in previous Intel Gen9 graphics (yes, for some reason Gen10 wasn’t a good comparison to make). The design is meant to break the teraflop barrier. The new integrated graphics will be delivered in 10-nanometer processors beginning in 2019.
The new integrated graphics architecture is expected to double the computing performance-per-clock compared to Intel Gen9 graphics. Integrated graphics are built into another chip, such as the Intel processor, while discrete graphics, like the kind Nvidia and AMD make, are separate chips. While discrete graphics processing units (GPUs) are sexy, Intel fellow and chief GPU architect David Blythe noted that a billion consumers use Intel integrated graphics today. Making games work better on those machines is a priority for Gen11, Blythe said.
With teraflop performance capability, this architecture is designed to increase game playability. At the event, Intel showed Gen11 graphics nearly doubling the performance of a popular photo recognition application when compared to Intel’s Gen9 graphics. Gen11 graphics is expected to also feature an advanced media encoder and decoder, supporting 4K video streams and 8K content creation in constrained power envelopes.
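A rough back-of-the-envelope check shows how 64 execution units could reach a teraflop. The Python sketch below multiplies execution units by floating-point operations per unit per clock and by clock speed. Two inputs are assumptions that do not come from Intel's presentation as reported here: 16 single-precision operations per execution unit per clock (two 4-wide units, with a fused multiply-add counted as two operations) and the sample clock speeds. The function names are hypothetical as well.

```python
def peak_fp32_gflops(execution_units: int,
                     flops_per_eu_per_clock: int,
                     clock_ghz: float) -> float:
    """Theoretical peak single-precision throughput in GFLOPS."""
    return execution_units * flops_per_eu_per_clock * clock_ghz


def clock_ghz_for_target(target_gflops: float,
                         execution_units: int,
                         flops_per_eu_per_clock: int) -> float:
    """Clock speed (GHz) needed to hit a target peak throughput."""
    return target_gflops / (execution_units * flops_per_eu_per_clock)


# Assumption, not from the article: 16 FP32 operations per EU per clock.
FLOPS_PER_EU_PER_CLOCK = 16

if __name__ == "__main__":
    # Illustrative clock speeds only; the article gives no frequencies.
    for clock in (0.9, 1.0, 1.1):
        gflops = peak_fp32_gflops(64, FLOPS_PER_EU_PER_CLOCK, clock)
        print(f"64 EUs at {clock:.1f} GHz: {gflops:.0f} GFLOPS")
    needed = clock_ghz_for_target(1000.0, 64, FLOPS_PER_EU_PER_CLOCK)
    print(f"Clock needed for 1 TFLOPS: {needed:.3f} GHz")
```

Under these assumptions, a 64-unit part crosses one teraflop at a little under 1 GHz. This is a theoretical peak figure, so it says little about real game performance.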
Gen11 will also feature Intel Adaptive Sync technology, enabling smooth frame rates for gaming. (Both AMD and Nvidia have had this for a while, but Intel is adding it to base systems that sell on the order of 200 million units a year.)
Intel also reaffirmed its plan to introduce a discrete graphics processor, dubbed Xe, by 2020. Koduri said it will be good but that Intel isn’t going to comment further yet. This, of course, is what we all want to hear about, as it will redefine competition with graphics rivals AMD and Nvidia. There will be two variants of Xe: one for the datacenter and one for client platforms, Koduri said.
“One API” software
Intel announced the “One API” project to simplify the programming of diverse computing engines across CPU, GPU, FPGA, AI, and other accelerators. The project includes a comprehensive and unified portfolio of developer tools for mapping software to the hardware that can best accelerate the code. A public project release is expected to be available in 2019.
Koduri said that Intel has more than 15,800 software programmers and that they are working on this technology so that “no transistor is left behind.” That sounds good, though I’m not sure exactly what he meant.
“We have gotten a huge religion around software in the last 12 months,” Koduri said.
Better memory and storage
Intel discussed updates to Intel Optane technology (a 3D XPoint memory) and the products based on it, which aim to close the gap between slow, permanent storage devices, like solid-state drives (SSDs) and hard drives, and faster, impermanent memory, such as dynamic random access memory (DRAM).
Intel Optane DC persistent memory is a new product that converges memory-like performance with the data persistence and large capacity of storage, said Frank Hady, chief architect for Optane products. The technology brings more data closer to the CPU for faster processing of bigger datasets, like those used in AI and large databases. Its large capacity and data persistence reduce the need to make time-consuming trips to storage, which can improve workload performance.
The company also showed how SSDs based on Intel’s 1 terabit QLC NAND die move more bulk data from hard disks to SSDs, allowing faster access to that data.
The combination of Intel Optane SSDs with QLC NAND SSDs will enable lower-latency access to data used most frequently. Taken together, these platform and memory advances complete the memory and storage hierarchy, providing the right set of choices for systems and applications.
Feeding the beast
This kind of new memory is what you need in order to keep feeding data to “the beast,” whether that is the main processor, FPGAs, or GPUs.
Recapping the day, Renduchintala said Intel’s product vision is to deliver a mixture of architectures — including scalar, vector, matrix, and spatial — “exquisite solutions, fed by disruptive memory hierarchies.” Intel will bring in the new technologies, like 3D packaging, when they are ready and will iterate on them until the whole system works better.
Renduchintala said that Intel isn’t just in the CPU business, that it’s more like “XPU,” where the “X” could refer to a CPU, a GPU, an FPGA, or something else.
“I summarize this as the Intel advantage,” he said.