SMART Crossbar

The SMART crossbar is the primary building block in a SMART NoC that enables straight and turning paths within the network.

The idea is to insert a crossbar between the Rx and Tx components of each repeater.

Data sent on the link is first converted to full swing (Rx), traverses the full-swing crossbar, and is then converted back to low swing (Tx) before being forwarded to the next hop.

Router Microarchitecture

Each router is controlled by three primary signals:

Buffer Write enable (BW_ena): determines whether the input signal is latched or not

Bypass Mux select (BM_sel): chooses between the local buffered flit and the bypassing flit on the link

Crossbar select (XB_sel): selects which crossbar input is connected to which output port
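
To make the role of the three signals concrete, here is a minimal behavioural sketch of one input port over a single cycle. The class and function names (`PortControl`, `step_input_port`) and the use of a plain list as the buffer are assumptions made for illustration; they are not taken from the SMART design itself.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PortControl:
    """Hypothetical container for the three control signals of one input port."""
    bw_ena: bool            # Buffer Write enable: latch the flit arriving on the link
    bm_sel: str             # Bypass Mux select: "local" or "bypass"
    xb_sel: Optional[str]   # Crossbar select: output port name, or None if unused


def step_input_port(ctrl: PortControl,
                    link_flit: Optional[str],
                    buffer: List[str]) -> Optional[Tuple[str, str]]:
    """Return (output_port, flit) forwarded this cycle, or None.

    The outgoing flit is chosen before the incoming flit is latched, so a flit
    that is buffered in this cycle cannot also leave in this cycle.
    """
    out = None
    if ctrl.xb_sel is not None:
        if ctrl.bm_sel == "bypass":
            flit = link_flit                        # pass the link flit straight through
        else:
            flit = buffer.pop(0) if buffer else None  # send the oldest buffered flit
        if flit is not None:
            out = (ctrl.xb_sel, flit)
    if ctrl.bw_ena and link_flit is not None:
        buffer.append(link_flit)                    # the arriving flit stops here
    return out
```

With `bw_ena=False`, `bm_sel="bypass"` the arriving flit is forwarded without being stored; with `bw_ena=True` and no crossbar grant it is latched and waits for a later cycle.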

Routing

Since the routes are static, we adopt source routing and encode the route in 2 bits for each router.

At the source router, the 2 bits correspond to the East, South, West and North output ports, while at all other routers they correspond to Left, Right, Straight and Core.

The directions Left, Right and Straight are relative to the input port of the flit.
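
The encoding can be illustrated with a small decoder that turns a list of 2-bit codes into the absolute output port taken at each router. The notes do not give the actual bit assignment, so the code values below (00/01/10/11 in the order the directions are listed) are an assumption, as is the function name `decode_route`. The sketch tracks the flit's travel direction, which is simply the opposite of its input port.

```python
# Compass directions in clockwise order, used to resolve relative turns.
CLOCKWISE = ["North", "East", "South", "West"]

# Assumed bit assignments; the real encoding is not specified in these notes.
SOURCE_CODES = {0b00: "East", 0b01: "South", 0b10: "West", 0b11: "North"}
RELATIVE_CODES = {0b00: "Left", 0b01: "Right", 0b10: "Straight", 0b11: "Core"}


def decode_route(codes):
    """Convert per-router 2-bit codes into absolute output ports.

    codes[0] is interpreted at the source router (absolute direction);
    every later code is relative to the flit's current travel direction.
    """
    heading = SOURCE_CODES[codes[0]]
    ports = [heading]
    for code in codes[1:]:
        move = RELATIVE_CODES[code]
        if move == "Core":
            ports.append("Core")      # eject to the local core; the route ends
            break
        idx = CLOCKWISE.index(heading)
        if move == "Left":
            heading = CLOCKWISE[(idx - 1) % 4]
        elif move == "Right":
            heading = CLOCKWISE[(idx + 1) % 4]
        # "Straight" keeps the current heading
        ports.append(heading)
    return ports
```

For example, under the assumed encoding, `decode_route([0b00, 0b10, 0b01, 0b11])` leaves the source heading East, goes straight at the next router, turns right (South) at the third, and ejects to the core at the fourth.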

In this work, we avoid network deadlocks by enforcing a deadlock-free turn model across the routes for all flows.

Flow control

A router needs to keep track of free VCs at the endpoint of an arbitrary SMART route, though it does not know the SMART route till runtime.

We solve this problem by using a reverse credit mesh network, similar to the forward data mesh network that delivers flits.

The only overhead of the credit mesh network is a [log2(# VCs) + 1 (valid)]-bit SMART crossbar added at each router.

For example, if the number of VCs is 2, the overhead of the credit network is 2-bit wide crossbars. If a forward route is preset, the reverse credit route is preset as well.

A credit that traverses multiple hops does not enter the intermediate routers and goes directly to the SMART crossbar which redirects it along the correct direction.
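
The credit crossbar width quoted above can be checked with a one-line calculation. The function name is hypothetical, and rounding up for VC counts that are not powers of two is an assumption; the notes only give the power-of-two case.

```python
def credit_crossbar_width(num_vcs: int) -> int:
    """Bits per credit: enough to name one VC, plus one valid bit."""
    if num_vcs < 1:
        raise ValueError("num_vcs must be at least 1")
    vc_id_bits = (num_vcs - 1).bit_length()   # equals ceil(log2(num_vcs)) for num_vcs >= 1
    return vc_id_bits + 1                     # +1 for the valid bit
```

With 2 VCs this gives 2 bits, matching the example in the text; with 4 VCs it gives 3 bits.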

Low-swing signaling

In general, the low-swing technique can lower energy consumption and propagation delay at the cost of a reduced noise margin.

The heart of our SMART NoC is a novel low-swing clockless repeated link circuit (asynchronous repeaters, a pair of inverters) embedded within the router crossbars, that allows packets to potentially bypass all the way from source to destination core within a single clock cycle, without being latched at any intermediate router.

This is achieved by replacing the clocked link drivers with asynchronous repeaters at every hop.

HPC_max

The maximum number of bypass hops, or maximum hops-per-cycle (HPC_max), is a design-time parameter constrained by the clock period of the system, the tile size, and the wire delay of the data links between routers.
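
As a rough illustration of how these three constraints interact, the sketch below estimates HPC_max as the number of hops whose combined delay fits in one clock period. The delay model (a per-millimetre wire delay plus a fixed crossbar delay per hop), the function name, and every number used are assumptions for illustration only, not measurements from the SMART design.

```python
import math


def estimate_hpc_max(clock_period_ps: float,
                     tile_size_mm: float,
                     wire_delay_ps_per_mm: float,
                     crossbar_delay_ps: float) -> int:
    """Estimate how many hops a flit can cover in one cycle.

    Assumed model: each hop costs one tile-length of wire plus one
    crossbar traversal, and hops are simply chained within the cycle.
    """
    hop_delay_ps = tile_size_mm * wire_delay_ps_per_mm + crossbar_delay_ps
    return math.floor(clock_period_ps / hop_delay_ps)
```

Under this model, a slower clock or smaller tiles raise HPC_max, while slower wires lower it, which is the qualitative dependence described above.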

SMART router pipeline

SA-L (Switch Allocation Local): every start router chooses a winner for each output port from among its buffered (local) flits.

SSR: the SA-L winners broadcast a SMART-hop setup request (SSR) via dedicated repeated wires up to HPC_max hops away; the SSR carries the length (in hops) that the winning flit wishes to travel.

SSR = min(HPC_max, H_remaining)
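
Applying this formula repeatedly shows how a long route is split into SMART-hops. The sketch below assumes the best case, in which every SSR is granted in full and the flit is never stopped early by contention; the function name is hypothetical.

```python
def ssr_lengths(total_hops: int, hpc_max: int) -> list:
    """List the SSR length requested at each start router along a route,
    assuming every request is granted in full (no contention)."""
    if hpc_max < 1:
        raise ValueError("hpc_max must be at least 1")
    remaining = total_hops
    lengths = []
    while remaining > 0:
        ssr = min(hpc_max, remaining)   # SSR = min(HPC_max, H_remaining)
        lengths.append(ssr)
        remaining -= ssr
    return lengths
```

For a route of 11 hops with HPC_max = 8, the flit requests 8 hops from the source and then 3 hops from the router where it was buffered.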

SA-G: all intermediate routers arbitrate among the SSRs they receive to set the BW_ena, BM_sel and XB_sel signals.

Arbitration policies:

Prio=Local: Local flits have higher priority than bypass flits, i.e. Priority = 1/(hops_from_start_router), so the nearest requester wins.

Prio=Bypass: Bypass flits have higher priority than local flits, i.e. Priority = (hops_from_start_router), so the farthest requester wins.

Implementation of SA-G at W_in and E_out

The SA-G SSR priority arbiter arbitrates among the SSRs received along the W->E dimension and chooses the nearest SSR.

The SA-G output port checks whether there is a request from a local buffered flit. If not, the XB signal is asserted.

At the SA-G input port stage, if no packet is currently being transmitted, the bypass request is granted.
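
The arbitration described above can be sketched for a single router in one dimension (W_in to E_out). This is a simplified model under stated assumptions: each request is a `(hops_from_start_router, ssr_length)` pair, distance 0 stands for this router's own SA-L winner, `XB_sel` is reduced to a boolean meaning "the output is driven this cycle", and `BW_ena` defaults to latching unless a bypass is granted. The function name and the dictionary layout are hypothetical.

```python
def sa_g_output(requests, policy="Local"):
    """Arbitrate one output port among local and bypass requests.

    requests: list of (hops_from_start_router, ssr_length) pairs.
    Returns (winner, signals); winner is None if nobody wants the output.
    """
    # Only flits that want to continue past this router compete for the output.
    # An SSR whose length equals its distance ends here and must be latched.
    contenders = [(d, n) for d, n in requests if d < n]
    if not contenders:
        return None, {"BW_ena": True, "BM_sel": None, "XB_sel": False}
    # Prio=Local favours the nearest requester, Prio=Bypass the farthest.
    choose = min if policy == "Local" else max
    winner = choose(contenders, key=lambda r: r[0])
    bypass = winner[0] > 0
    signals = {
        "BW_ena": not bypass,                        # stop arriving flits unless bypassing
        "BM_sel": "bypass" if bypass else "local",   # source of the flit sent to the crossbar
        "XB_sel": True,                              # output port is driven this cycle
    }
    return winner, signals
```

With requests `[(0, 3), (1, 4), (2, 2)]`, the third SSR ends at this router and does not compete. Prio=Local then grants the local flit and latches anything arriving, while Prio=Bypass grants the flit that started one hop away and leaves the local flit waiting.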

ST+LT: SA-L winners that also won SA-G at their start routers traverse the crossbars and links for up to multiple hops until they are stopped by BW_ena at some router.
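
The stopping rule of ST+LT can be expressed as a short walk along the route: the flit keeps moving until it meets the first router whose BW_ena is set. The function name and the list-of-booleans representation are assumptions for illustration.

```python
def hops_travelled(bw_ena_along_route):
    """Return how many hops a flit covers in one cycle.

    bw_ena_along_route[i] is the BW_ena value at the router reached
    after i + 1 hops from the start router.
    """
    for hops, bw_ena in enumerate(bw_ena_along_route, start=1):
        if bw_ena:
            return hops      # latched at this router
    raise ValueError("no router on the route latches the flit")
```

For instance, `[False, False, True]` means the flit bypasses two routers and is buffered at the third.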

In summary, a SMART NoC works as follows:

  • Buffered flits at injection/start routers arbitrate locally to choose input/output port winners during SA-L.
  • SA-L winners broadcast SSRs along their chosen routes, and each router arbitrates among these SSRs during SA-G.
  • SA-G winners traverse multiple crossbars and links asynchronously within a cycle, till they are explicitly stopped and buffered at some router along their route.
