[转帖]TaiShan v110 - Microarchitectures - HiSilicon
https://en.wikichip.org/wiki/hisilicon/microarchitectures/taishan_v110
| Edit Values | |
| TaiShan v110 µarch | |
| General Info | |
| Arch Type | CPU |
| Designer | HiSilicon |
| Manufacturer | TSMC |
| Introduction | 2019 |
| Process | 7 nm |
| Core Configs | 32, 48, 64 |
| Pipeline | |
| Type | Superscalar, Superpipeline |
| OoOE | Yes |
| Speculative | Yes |
| Reg Renaming | Yes |
| Decode | 4-way |
| Instructions | |
| ISA | ARMv8.2-A |
| Extensions | NEON |
| Cache | |
| L1I Cache | 64 KiB/core |
| L1D Cache | 64 KiB/core |
| L2 Cache | 512 KiB/core |
| L3 Cache | 1 MiB/core |
| Succession | |
|
|
|
TaiShan v110 is the successor to the TaiShan v100, a high-performance ARM server microarchitecture designed by HiSilicon for Huawei's own TaiShan servers.
contents
[hide]
Brands[edit]
TaiShan-based CPUs are branded as the Kunpeng 920 series.
Release Dates[edit]
Kunpeng 920 CPUs were officially launched in early 2019.
Architecture[edit]
Overview
Key changes from TaiShan v100[edit]
- TSMC 7 nm HPC process (from 16 nm)
- 2x core count (64, up from 32)
- Custom cores (from Cortex-A72)
- ASIMD
- double SP Vector throughput (2 inst/cycle, up from 1)
- ASIMD
- Custom cores (from Cortex-A72)
- Memory
- 2x memory channels (8, up from 4)
- I/O
- PCIe Gen 4 (from Gen 3)
This list is incomplete; you can help by expanding it.
Block Diagram[edit]
Entire Chip[edit]
Memory Hierarchy[edit]
- Cache
- L1I Cache
- 64 KiB/core, private
- 64-byte cache lines
- L1D Cache
- 64 KiB/core, private
- 64-byte cache lines
- L2 Cache
- 512 KiB/core, private
- L3 Cache
- 1 MiB/core
- Shared by all cores
- System DRAM
- 1 TiB Max Memory / socket
- 8 Channels
- DDR4, up to 2933 MT/s
- 1 DPC and 2 DPC support
- 8 B/cycle/channel (@ memory clock)
- ECC, SDDC, DDDC
- L1I Cache
Overview[edit]
Overview
Though HiSilicon has a history of designing Arm processors. The TaiShan v110 core is HiSilicons' first custom homegrown high-performance ARM core and SoC design. The chip, which incorporates multiple compute dies and an I/O is a multi-chip package, is fabricated on TSMC's 7-nanometers HPC process and integrates up to 64 cores and up to 64 MiB of last level cache.
The SoC also incorporates a number of hardware accelerators. There is a crypto engine that supports AES, DES/3DES, MD5, SHA1, SHA2, HMAC, CMAC with throughputs of up to 100 Gbit/s. Additionally, there is also a compression engine supporting GZIP, LZS, LZ4 with compression throughputs of up to 40 Gbit/s and decompression of up to 100 Gbit/s.
Marketed as the Kunpeng 920, this SoC supports up to 4-way multiprocessing support through HiSilicon's Hydra interface. In order to keep the cores fed, eight DDR4 memory channels are incorporated per socket. Additionally, designed to facilitate an easy accelerator platform, there are 40 PCIe Gen 4 lanes provided per socket with CCIX support, enabling cache coherency.
Core[edit]
Each core is a 4-way out-of-order superscalar that implements the ARMv8.2-A ISA. Huawei stated that the core supports almost all the ARMv8.4 features with a few exceptions, including dot product and the FP16 FML extension. It features private 64 KiB L1 instruction and data caches as well as 512 KiB of private L2. Though light on details, Huawei says that compared to Arm's Cortex cores, their core features an improved memory subsystem, a larger number of execution units, and a better branch predictor.
ASIMD[edit]
Each core features a single 128-bit NEON unit. It is capable of executing single double-precision FMA vector instruction per cycle or two single-precision vector instructions per cycle. Operating at 2 GHz, a 64-core chip will have a peak compute of 512 GigaFLOPS of double-precision floating point. It's worth noting that compared to the TaiShan v100, the throughput for single-precision vector has been doubled from 1 to 2 instructions per cycle.
MCP physical design[edit]
The SoC itself comprises 3 dies - two Super CPU Cluster (SCCL) compute dies and a Super IO Cluster (SICL). The SCCL compute dies contains 8 CPU Clusters (CCLs), memory controllers, and the L3 cache block. There are eight CCLs on each of the SICL dies for a total of 64 cores. The CCLs are TaiShan V110 quadplex along with the L3 cache tags partition. The Super IO Clusters include the various I/O peripherals including PCIe Gen 4, SAS, the network interface controllers, and the Hydra links.
Scalability[edit]
- See also: Hydra Interface
Each chip incorporates three Hydra interface ports. The Hydra interface facilitates the cache coherency between the dies on the chip. Every link supports 240 Gb/s (30 GB/s) of peak bandwidth for a total aggregated bandwidth of 720 Gb/s (90 GB/s) in a 2-way symmetric multiprocessing configuration.
With all three links, there is also support for 4-way SMP. In this configuration, one link from each socket is connected to another socket for an all-for-all connection.
Chipset[edit]
Along with the Hi1620 SoC, HiSilicon developed a number of integrated circuits as part of the chipset platform.
| Chip | Description |
|---|---|
| Hi1620 | CPU, Kunpeng 920 series Chip |
| Hi1503 | CPU interconnect chip, supports scaling-up to 32 sockets |
| Hi1812 | SSD storage controller, for read/write I/O acceleration |
| Hi1822 | Network controller chip, DC high-speed flexible interconnect |
| Hi1710 | BMC management chip + enhanced RAS features chip |
Die[edit]
- TSMC 7 nm HPC
- 20,000,000,000 transistors
- 3-4 dies
All TaiShan v110 Chips[edit]
Bibliography[edit]
- Huawei. Personal Communication. 2019
- Huawei Connect 2018. October 2018
- HiSilicon Event. January 7, 2019
- Huawei, Supercomputing 2018
| codename | TaiShan v110 + |
| core count | 32 +, 48 + and 64 + |
| designer | HiSilicon + |
| first launched | 2019 + |
| full page name | hisilicon/microarchitectures/taishan v110 + |
| instance of | microarchitecture + |
| instruction set architecture | ARMv8.2-A + |
| manufacturer | TSMC + |
| microarchitecture type | CPU + |
| name | TaiShan v110 + |
| process | 7 nm (0.007 μm, 7.0e-6 mm) + |
[转帖]TaiShan v110 - Microarchitectures - HiSilicon的更多相关文章
- 华为TaiShan 2280 ARM 服务器
华为TaiShan 2280 ARM 服务器 华为TaiShan 2280 ARM 服务器 https://e.huawei.com/cn/products/cloud-computing-dc/s ...
- nginx负载均衡基于ip_hash的session粘帖
nginx负载均衡基于ip_hash的session粘帖 nginx可以根据客户端IP进行负载均衡,在upstream里设置ip_hash,就可以针对同一个C类地址段中的客户端选择同一个后端服务器,除 ...
- [转帖]网络协议封封封之Panabit配置文档
原帖地址:http://myhat.blog.51cto.com/391263/322378
- [转帖]零投入用panabit享受万元流控设备——搭建篇
原帖地址:http://net.it168.com/a2009/0505/274/000000274918.shtml 你想合理高效的管理内网流量吗?你想针对各个非法网络应用与服务进行合理限制吗?你是 ...
- 3d数学总结帖
3d数学总结帖,以下是对3d学习过程中数学知识的简单总结 角度值和弧度制的互转 Deg2Rad 角度A1转弧度A2 => A2=A1*PI/180 Rad2Deg 弧度A2转换角度A1 => ...
- [转帖]The Lambda Calculus for Absolute Dummies (like myself)
Monday, May 7, 2012 The Lambda Calculus for Absolute Dummies (like myself) If there is one highly ...
- [转帖]FPGA开发工具汇总
原帖:http://blog.chinaaet.com/yocan/p/5100017074 ----------------------------------------------------- ...
- [Android分享] 【转帖】Android ListView的A-Z字母排序和过滤搜索功能
感谢eoe社区的分享 最近看关于Android实现ListView的功能问题,一直都是小伙伴们关心探讨的Android开发问题之一,今天看到有关ListView实现A-Z字母排序和过滤搜索功能 ...
- AxureRP7.0各类交互效果汇总帖(转)
了便于大家参考,我把这段时间发布分享的所有关于AxureRP7.0的原型做了整理. 以下资源均有对应的RP源文件可以下载. 当然 ,其中有部分是需要通过完成解密游戏[攻略]才能得到下载地址或者下载密码 ...
- 未能加载文件或程序集“Newtonsoft.Json, Version=4.0.0.0, Culture=neutral, PublicKeyToken=30a [问题点数:40分,结帖人u010259408]
未能加载文件或程序集“Newtonsoft.Json, Version=4.0.0.0, Culture=neutral, PublicKeyToken=30a [问题点数:40分,结帖人u01025 ...
随机推荐
- Serverless架构的前世今生
一.Serverless简介 云计算的不断发展,涌现出很多改变传统IT架构和运维方式的新技术,而以虚拟机.容器.微服务为代表的技术更是在各个层面不断提升云服务的技术能力,它们将应用和环境中很多通用能力 ...
- 适合新手的12个Mybatis-Plus常用注解
摘要:MyBatis-Plus(简称 MP)是一个 MyBatis的增强工具,在 MyBatis 的基础上只做增强不做改变,为简化开发.提高效率而生. 本文分享自华为云社区<那些年,我们一起学过 ...
- Python 绑定:从 Python 调用 C 或 C++
摘要:您是拥有想要从 Python 中使用的C或 C++ 库的 Python 开发人员吗?如果是这样,那么Python 绑定允许您调用函数并将数据从 Python 传递到C或C++,让您利用这两种语言 ...
- 云小课 | 网络知识一箩筐——NAT网关,让IP地址华丽变身,轻松实现内外网互通
阅识风云是华为云信息大咖,擅长将复杂信息多元化呈现,其出品的一张图(云图说).深入浅出的博文(云小课) 或短视频(云视厅)总有一款能让您快速上手华为云.更多精彩内容请单击此处. 摘要: 网络知识一箩筐 ...
- 云原生时代,领域驱动设计思想(DDD)如何落地?
摘要:随着数字化世界的持续演进,软件架构设计思想在碰撞中不断优化.云原生时代的到来,加速了行业对于领域驱动设计理念(Domain-Driven Design)的实践落地诉求. 本文分享自华为云社区&l ...
- iOS移动应用安全加固:保护您的App免受恶意攻击的重要步骤
目录 iOS移动应用安全加固:保护您的App免受恶意攻击的重要步骤 摘要 引言 一.APP加固的概念 二.APP加固方案的比较 三.保护iOS应用的安全 四.总结 参考资料 摘要 本文介绍了移动应 ...
- 最高提升10倍性能!揭秘火山引擎ByteHouse查询优化器实现方案
更多技术交流.求职机会,欢迎关注字节跳动数据平台微信公众号,回复[1]进入官方交流群 作为企业级数据库的核心组件之一,查询优化器的地位不可忽视.对于众多依赖数据分析的现代企业来说,一个强大且完善 ...
- 10亿数据、查询<10s,论基于OLAP搭建广告系统的正确姿势
更多技术交流.求职机会,欢迎关注字节跳动数据平台微信公众号,回复[1]进入官方交流群 由于流量红利逐渐消退,越来越多的广告企业和从业者开始探索精细化营销的新路径,取代以往的全流量.粗放式的广告轰炸 ...
- 基于 SpringBoot+vue的地方美食系统,可作为毕业设计
1 简介 这个项目是基于 SpringBoot和 Vue 开发的地方美食系统,包括系统功能模块,管理员功能模块,用户管理模块,功能齐全,可以作为毕业设计,课程设计等.源码下载下来,进行一些简单的部署, ...
- 云原生体系下 Serverless 弹性探索与实践
Serverless 时代的来临 Serverless 顾名思义,是一种"无服务器"架构,因为屏蔽了服务器的各种运维复杂度,让开发人员可以将更多精力用于业务逻辑设计与实现.在 Se ...