https://www.servethehome.com/a-quick-look-huawei-hisilicon-kunpeng-920-arm-server-cpu/
 
 
Huawei HiSilicon Kunpeng 920 1

Recently, we had the opportunity to take a look at the Huawei TaiShan 200 (2280) server. The server we were able to get had two Huawei/HiSilicon Kunpeng 920 Arm server CPUs. Each had 48 Armv8 cores, although the family scales to 64 cores per processor. In that server, we saw a number of very interesting features of the CPU itself that we wanted to point out. After publishing that piece, several folks asked to see the actual chip, and that is what we have today.

A Quick Look at the Banned Kunpeng 920 Arm Server CPU

In January 2019, we covered the Huawei Kunpeng 920 64-Core Arm Server CPU with CCIX and PCIe Gen4 Launch. For some perspective, the newest server CPUs Intel was selling were still Skylake (1st Generation Intel Xeon Scalable). Cascade Lake was still almost two quarters from launch, but it was shipping. The AMD EPYC 7002 Series “Rome” was still almost three quarters away. To be clear, when the new chip was launched, it felt like an ark of new technology.

Huawei Kunpeng 920 Launch Cover

Huawei was using an Arm design with up to 64 cores per socket. Intel was at 28 cores/56 threads and AMD was at 32 cores/64 threads in its still new x86 re-entry. The Kunpeng 920 had 8 channels of DDR4 when the rest of the industry was at 6 channels (Intel) or 8 channels (AMD, but AMD “Naples” was very low volume). PCIe Gen4 did not arrive until Rome for AMD (2019) or Ice Lake for Intel (2021). Beyond that, Huawei had 100GbE RoCE onboard. That was like the original idea with Zen/Naples of having 10GbE onboard, but just with faster 25GbE links.

Huawei Kunpeng 920 Performance

Just for some sense, at SC18 in November 2018, Huawei’s booth had the HiSilicon Hi1616 on display in a number of servers. This was up to 32 cores and had only 1MB of L2 cache for 4 cores. It was also produced on 16nm. Frankly, this was a bit behind the Skylake Xeons of its day in terms of performance.

Huawei HiSilicon Hi1616 Placard At SC18

The Hi1616 was not just a placard. Instead, it was shown on the show floor in a number of different servers. We did not cover it at the time, but I looked through old photos, and I found a few shots.

Huawei HiSilicon Hi1616 At SC18

What we did not necessarily know was that the Hi1620 would come as a 7nm data center Arm processor before others. For some perspective, AWS Graviton2 with 64 vCPUs was launched over a year after we saw this at SC18 and almost a year after the Kunpeng 920 / Hi1620 launch.

Huawei HiSilicon Hi1620 Placard At SC18

Huawei planned to use the new server CPUs in its TaiShan servers and has entire lines, like the TaiShan 200 (2280) that we just showed. The other main use case for this was in Huawei’s cloud.

Huawei TaiShan Servers

We did not hear much else about the Kunpeng 920 until later in 2019 at Hot Chips when we saw that the Huawei Ascend 910 Provides a NVIDIA AI Training Alternative.

Huawei Ascend 910 310 And Kunpeng 920

Fast forward to our recent piece, and we now have a dual socket TaiShan 200 (2280) working. We can see the two sockets and 48 cores here. We did not have the 64 core models in our server.

Huawei Kunpeng 920 2x 48c Lscpu Output

Here is the topology output.

Huawei Kunpeng 920 2x 48c Topology
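
For readers who want to pull the same socket and core counts out of a system programmatically, here is a minimal sketch in Python. It assumes the usual `Key: value` text layout of `lscpu`; the `SAMPLE` text is a made-up stand-in built from the 2x 48 core figures quoted above, not the actual output from our TaiShan 200, and the helper names `parse_lscpu` and `total_cores` are hypothetical.

```python
def parse_lscpu(text: str) -> dict:
    """Turn lscpu-style 'Key: value' lines into a dictionary of strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no 'Key: value' separator
            fields[key.strip()] = value.strip()
    return fields


def total_cores(fields: dict) -> int:
    """Physical cores in the system: sockets times cores per socket."""
    return int(fields["Socket(s)"]) * int(fields["Core(s) per socket"])


# Illustrative input only (an assumption), based on the configuration
# described in the article: two sockets with 48 cores each.
SAMPLE = """\
Architecture: aarch64
Socket(s): 2
Core(s) per socket: 48
"""

if __name__ == "__main__":
    info = parse_lscpu(SAMPLE)
    print(info["Architecture"], total_cores(info), "cores")
```

On a live system, one could feed the function the text returned by running `lscpu` instead of `SAMPLE`. Field names can vary between `lscpu` versions, so treat the keys used here as an assumption to verify.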

Just as a quick note here, this is a bigger deal for Huawei than for companies like Dell EMC or HPE. Huawei both sells servers and operates its own cloud. As a result, years ago it was granted access to both the Intel OEM price list and the Intel cloud provider price list. For those that do not know, over a decade ago, Intel created two price lists, one for public clouds and one for server OEMs. The public cloud providers received enormous discounts on SKUs, and often special parts. Certain companies were effectively on the honor system to properly report if CPUs were being purchased and used in the cloud (at a steep discount). One can imagine how that could be troublesome without a solid process behind it. Even with access to both price lists, Huawei building its own server CPUs was a big deal. We often discuss AWS as the pioneer of Arm instances and of betraying Intel’s high discount/lower margin structure. One could argue that Huawei was there as well; it is just something we hear less of in the US.

Still, let us get to the hardware.

 

A Quick Look at the Huawei HiSilicon Kunpeng 920 Arm Server CPU

August 9, 2022

 

A Quick Look at the Kunpeng 920 Arm Server CPU

Here is a photo of the Kunpeng 920 powered Huawei TaiShan 2280. One can see that there are two CPUs under the heatsinks. Each CPU has 16 DIMM slots for 8 channel DDR4-2933 and 2 DIMM per channel operation (2DPC.)

Huawei TaiShan 200 2280 CPU And Memory Area 1
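
To put the memory configuration in perspective, here is a quick back-of-the-envelope calculation in Python. It is a sketch of theoretical peak numbers only, assuming the standard 64-bit (8-byte) data path per DDR4 channel and that the DIMMs actually run at DDR4-2933 with both slots per channel populated; real-world bandwidth will be lower. The function name `peak_ddr_bandwidth_gbs` is ours, not anything from Huawei.

```python
def peak_ddr_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                           bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes).

    channels * transfers per second * bytes moved per transfer.
    bus_bytes=8 assumes a 64-bit data bus per channel, ECC bits excluded.
    """
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9


CHANNELS = 8            # memory channels per Kunpeng 920 socket
DIMMS_PER_CHANNEL = 2   # 2DPC
SPEED_MTS = 2933        # DDR4-2933

dimm_slots = CHANNELS * DIMMS_PER_CHANNEL
per_socket = peak_ddr_bandwidth_gbs(CHANNELS, SPEED_MTS)

if __name__ == "__main__":
    print(f"DIMM slots per CPU: {dimm_slots}")
    print(f"Peak per socket:    {per_socket:.1f} GB/s")
    print(f"Peak dual socket:   {2 * per_socket:.1f} GB/s")
```

The slot count works out to the 16 DIMM slots per CPU visible in the photo, and the same function can be reused for a 6-channel platform by changing the `channels` argument.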

Removing the heatsink, we can see a quite large CPU. In this section, we wanted to show a lot more around the actual chips, and some size comparisons, so this is going to be heavier on pictures than words.

Huawei HiSilicon Kunpeng 920 Side With Thermal Paste Contact Area

Here is another angle. We can see that the chip is a BGA package. This was expected given the SC18 placard. It means that the CPU is affixed to the motherboard, so this is not a socketed server.

Huawei HiSilicon Kunpeng 920 Side With Thermal Paste Contact Area 3

Instead of a socket, around the CPU there is a brace structure. This is to provide a rigid mounting point for the heatsink assembly, but it is not holding the CPU in place.

Huawei HiSilicon Kunpeng 920 Side With Thermal Paste Contact Area 2

At this point, and deliberately, we have left the thermal paste in place. Taking a look at the heatsink, we can see a matching contact patch.

Huawei HiSilicon Kunpeng 920 TaiShan 200 Heatsink Contact Area

This smaller contact patch is raised. The entire heatsink is not designed to contact the CPU, just the middle section. This is different from what we see on AMD EPYC and Intel Xeon processors. For some frame of reference on the size of the contact patch, here is an Intel Xeon E5-2600 V4 chip next to the raised contact patch, both with thermal paste.

Huawei HiSilicon Kunpeng 920 Heatsink Contact Area Next To Intel Xeon E5 2600 V4 1

Here is the chip without the thermal paste, albeit with some residue. Apologies for that. The Kunpeng 920 has its HiSilicon marking and giant logo on it.

Huawei HiSilicon Kunpeng 920 2

Getting to some size comparisons, here is a 3rd Generation Intel Xeon Scalable “Ice Lake” CPU next to the Kunpeng 920, both with their thermal paste applied.

Huawei HiSilicon Kunpeng 920 Next To Intel Ice Lake Xeon

Here is an AMD EPYC 7003X processor next to the Kunpeng 920.

Huawei HiSilicon Kunpeng 920 Next To AMD EPYC Milan X

Here is another shot, but the EPYC is resting atop the DIMMs.

Huawei HiSilicon Kunpeng 920 Next To AMD EPYC Milan X 2

The Ampere Altra Max is much more similar in terms of scale. It is a large CPU package, just like the Kunpeng 920.

Huawei HiSilicon Kunpeng 920 Next To Ampere Altra Max M128 30 1

Some suggested that we photoshopped the yellow on that image. Actually, it was just lit like that. We had a blue light in the other direction but we used the yellow pictures more in our coverage.

Huawei HiSilicon Kunpeng 920 Next To Ampere Altra Max M128 30 2

One can see that the Kunpeng 920 is maybe not as big as the Ampere Altra (Max) but it is noticeably bigger than the AMD EPYC 7003 (sans carrier) and the Intel Ice Lake Xeon.

Huawei HiSilicon Kunpeng 920 Next To Ampere Altra Max M128 30 10

Of course, we were able to do this with the Ampere, AMD, and Intel chips because they are socketed while the Kunpeng 920 had to remain affixed to its motherboard because it is soldered.

Final Words

In 2019, when the Huawei HiSilicon Kunpeng 920 was first launched, it was a class-leading 7nm CPU. In Q3 2022, it is still a CPU being used in systems and clouds, but it is also showing its age. Not only has Ampere moved ahead, but AMD and Intel are also well beyond this generation of technology. Alibaba, with the Arm-based Alibaba Cloud T-Head Yitian 710, has perhaps the fastest single-socket CPU around for integer workloads that do not use accelerators. When we first saw this Kunpeng 920, it was a culmination of all kinds of new technology that made it really interesting. It just took us another 30 months to get them.

Huawei Kunpeng 920 2.6GHz Performance

With the new CPU, we have been working to get it ready for prime time, but it is challenging. The firmware seems to be better suited to Huawei workloads than a lot of what we are trying to run. We have started to get results, as we showed in our recent Solidigm D7-P5520 7.68TB PCIe Gen4 NVMe SSD review, but getting to this level took about a day of work just playing with the system and its settings.

Solidigm SSD D7 P5520 Multiple CPU Architecture Performance Testing

This is one project that we will continue to work on, but expect to see figures from this server pop into more pieces as we get more confident in the results. So far, we are tracking that, on integer workloads that do not use accelerators, these CPUs should perform at around the level of a contemporary 24-core Cascade Lake processor.

Hopefully, this was a great look into a CPU that rarely gets seen. Since it is seen so infrequently in the US, many make assumptions about the chip that do not match reality. We hope our TaiShan 2280 piece as well as this one help to shed light on technology that is often referenced but rarely seen in the US and Canada.
