Arm unveiled the performance numbers for its Arm Neoverse V1 and N2 server chip platforms, with processing boosts ranging from 40% to 50% over the previous generation.
The demands of datacenter workloads and internet traffic are growing exponentially, and new solutions are needed to keep up with these demands while reducing the current and anticipated growth of power consumption. But Arm said the variety of workloads and applications being run today means the one-size-fits-all approach to computing is no longer the answer. That’s a jab at dominant server vendors Intel and Advanced Micro Devices, which use the x86 architecture.
The Arm Neoverse V1 is a server chip microarchitecture that Arm’s customers (the big chipmakers of the world) can design chips around for servers in the big datacenters that power the internet. The V1 supports Scalable Vector Extension (SVE) and delivers more than 50% performance increases for high-performance computing and machine learning workloads.
“The time for Neoverse across all infrastructure is now,” Chris Bergey, senior VP for the infrastructure line of business at Arm, said in a press briefing.
And another chip microarchitecture, the Arm Neoverse N2 platform, uses the new Armv9 architecture that Cambridge, United Kingdom-based Arm recently announced. It can deliver 40% more performance for a variety of workloads.
“I think the N2 will pleasantly surprise people how performant designs will be in single-threaded designs,” said Patrick Moorhead, an analyst at Moor Insights & Strategy. “V1 looks to be a strong start in a nichey market, HPC. Overall, Arm is raising its game in the compute market.”
Bergey said the journey to producing competitive server chips began a decade ago. Chips based on the designs should be hitting the market either late this year or early next year.
Arm said the Arm Neoverse CMN-700 is the industry’s most advanced mesh interconnect, designed to unleash the performance and performance-per-watt benefits of the Neoverse V1 and N2 platforms. The interconnect is a key element for constructing high-performance Neoverse V1 and Neoverse N2-based systems-on-chip (SoCs). It enables higher core counts and larger cache memory sizes.
As Moore’s law comes to an end, solution providers are seeking specialized processing. Enabling specialized processing has been a focal point since the inception of the Neoverse line of platforms, and Arm expects these latest additions to accelerate this trend.
Back in September, Arm unveiled the new Neoverse N2 and Neoverse V1 platforms without talking about performance. Now the company is talking about the performance per watt, the total cost of ownership benefits, and partners adopting the designs.
“We believe Arm processors are coming to servers in a big way. We believe Arm is actually going to be everywhere, from the edge to the cloud,” Oracle senior VP Bev Crair said in a press briefing.
Among the customers:
- Marvell revealed its Octeon family of networking solutions based on Neoverse N2 will begin sampling by end of 2021, providing a 3 times performance uplift over previous generation Octeon chips.
- India’s Ministry of Electronics and Information Technology (MeitY) has announced it will join SiPearl and ETRI in licensing Neoverse V1 for its national exascale HPC project.
- Oracle plans to adopt Ampere Altra central processing units (CPUs) for Oracle Cloud Infrastructure, as the price/performance leader across a wide range of workloads.
- Arm-powered Amazon Web Services Graviton2 continues to rapidly expand its EC2 footprint with steady growth and regional expansion.
- Alibaba Cloud just tested the upcoming Alibaba Cloud ECS Arm instances, showing a 50% performance improvement for the DragonWell JDK on Arm.
- Tencent is making investments in hardware testing and software enablement that will allow it to adopt Neoverse technology for cloud applications. Bergey said the tests are showing great performance per watt for the Arm-based designs.
- Nvidia’s Grace is using an unannounced Arm processor, but Arm didn’t say if the new Neoverse designs are being used in Grace.
These partners are taking full advantage of what is under the hood of Neoverse platforms. This is just the tip of the iceberg for infrastructure workload benefits and how partners plan to implement and take Neoverse IP to market, Bergey said. Arm argues that innovators shouldn’t have to choose between performance and power efficiency.
The chips can target a range of cloud-to-edge uses.
“The Neoverse V1 and N2 are huge improvements for Arm,” Tirias Research analyst Kevin Krewell said in an email to VentureBeat. “The V1 with the Scalable Vector Extensions (SVE) [is] powerful enough to be the CPU core for supercomputers. Even though Arm didn’t provide performance numbers against AMD and Intel, it seems to be very competitive, based on Arm’s data. The N2 is not an insignificant improvement over the N1. It’s the core to use for designs with very high core count, trading off some performance and a narrower SVE implementation for a smaller core size and lower power. These improvements are in line with Nvidia’s goals for the Arm architecture in the datacenter, and one of these cores could well be the core used in Nvidia’s Project Grace CPU.”
Linley Gwennap, principal analyst at the Linley Group, said in an email that third-party test results are telling.
“AMD’s latest Epyc processor outperforms the fastest Neoverse N1 chip on almost every test, often by a wide margin, despite the Arm chip having more cores,” Gwennap said. “Even after adjusting for TDP, the two chips have about the same performance per watt. Arm’s superiority claims rely on synthetic benchmarks that scale ideally across 64 or more cores, which isn’t representative of the real workloads that Phoronix measures. I also estimate that AMD Zen 3 leads the N1 by 60% on single-thread applications.”
He added, “If you take this N1 comparison and project based on Arm’s data, the N2 will still be about 20% behind Zen 3 for single-thread (scale-up) workloads. According to Arm, N2 power rises by more than its performance, so power efficiency is actually worse for N2 (and much worse for V1). So if AMD matches N1 on performance per watt, N2 won’t give Arm the lead on that metric. In summary, until Arm achieves parity in single-thread performance, it will be limited to scale-out workloads. And unless it can demonstrate a sizable advantage in performance per watt on real applications, its main selling point is lower prices.”
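Gwennap's efficiency argument comes down to a ratio: if power grows faster than performance from one generation to the next, performance per watt falls. The sketch below illustrates that relationship only. The function name is hypothetical, the 1.40 performance ratio is the N2-over-N1 figure Arm quotes, and the 1.50 power ratio is an invented placeholder, not a number from Arm or the Linley Group.

```python
def relative_efficiency(perf_ratio: float, power_ratio: float) -> float:
    """Performance per watt of a new design relative to the old one.

    Both arguments are new/old ratios. A result of 1.0 means efficiency
    is unchanged; below 1.0 means the new design is less efficient.
    """
    if perf_ratio <= 0 or power_ratio <= 0:
        raise ValueError("ratios must be positive")
    return perf_ratio / power_ratio


# 1.40: the 40% N2-over-N1 performance gain quoted by Arm in this article.
# 1.50: an ASSUMED power increase, chosen only to show what happens when
#       power rises by more than performance. It is not a measured value.
n2_vs_n1 = relative_efficiency(perf_ratio=1.40, power_ratio=1.50)
print(f"N2 performance per watt relative to N1: {n2_vs_n1:.3f}")
```

Any power ratio above the performance ratio produces a result below 1.0, which is the situation Gwennap describes. If power rose by exactly the same 40%, the result would be 1.0, matching Arm's own claim of unchanged efficiency.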
The Neoverse V1
This chip design delivers a 50% performance uplift over N1, as well as a 1.8 times improvement for a range of vector workloads and a 4 times improvement for machine learning workloads.
Neoverse V1 is the first in a new performance-first computing tier for Arm. Neoverse V1 gives silicon partners the flexibility to build compute for applications more reliant on CPU performance and bandwidth while providing system-on-chip (SoC) design flexibility.
The performance-first design philosophy behind Neoverse V1 was to build the widest microarchitecture Arm has ever produced to accommodate more instructions in flight in support of markets like high performance and exascale computing. The wide and deep architecture — with the addition of scalable vector extensions (SVE) — gives Neoverse V1 the lead in per-core performance, as well as code longevity with SVE, and provides SoC designers implementation flexibility, Arm said.
You can see the benefits of some of these design elements in SiPearl and ETRI’s HPC SoCs, and Arm thinks this is the direction HPC compute is heading, Bergey said.
The Neoverse N2
The Neoverse N2 is aimed at cloud-to-edge performance. A few weeks ago, Arm introduced the Armv9 architecture to address global demand for ubiquitous specialized processing. The Neoverse N2 platform is the first based on the Armv9 architecture with improvements to security, power efficiency, and performance.
Neoverse N2 delivers 40% higher single-threaded performance than Neoverse N1 while retaining the same level of power and area efficiency. Its scalability extends to high-throughput computing, such as hyperscale cloud, where Arm sees a 1.3 times improvement on NGINX over N1.
The Neoverse N2 platform delivers superior performance per thread and industry-leading performance per watt, driving a reduced total cost of ownership for users. Neoverse N2 is the first platform to feature SVE2, an Armv9 feature that drives a significant uplift in cloud-to-edge performance efficiency.
For a broader set of use cases, like machine learning, digital signal processing, multimedia, and 5G systems, SVE2 brings performance and ease of programming, as well as the portability benefits of SVE.