SMART Crossbar

The SMART crossbar is the primary building block in a SMART NoC that enables straight and turning paths within the network.

The idea is to insert a crossbar between the Rx and Tx components of each repeater.

Data arriving on the link is first converted to full swing (Rx), traverses the full-swing crossbar, and is then converted back to low swing (Tx) and forwarded to the next hop.

Router Microarchitecture

Each router has three primary components:

Buffer Write enable (BW_ena): determines whether the input signal is latched or not

Bypass Mux select (BM_sel): chooses between the local buffered flit and the bypassing flit on the link

Crossbar select (XB_sel): selects which input port the crossbar connects to each output port
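
As a rough illustration of how BW_ena and BM_sel interact at one input port, here is a minimal behavioural sketch. The function name, the "bypass"/"local" labels and the use of a queue for the buffer are assumptions made for this example, not details from the SMART design; XB_sel is left out because it only steers the chosen flit to an output port.

```python
from collections import deque

def input_port_step(incoming, buffer, bw_ena, bm_sel):
    """Model one cycle of a single input port (illustrative only).

    incoming: flit arriving on the link this cycle, or None
    buffer:   deque of flits already latched at this router
    bw_ena:   True if the incoming flit should be latched
    bm_sel:   "bypass" or "local" (assumed labels for the two mux inputs)
    Returns the flit presented to the crossbar, or None.
    """
    if bm_sel == "bypass":
        # The flit on the link goes straight to the crossbar.
        to_crossbar = incoming
    else:
        # The local flit is read before this cycle's write, so a flit
        # latched now cannot leave in the same cycle.
        to_crossbar = buffer.popleft() if buffer else None
    if bw_ena and incoming is not None:
        buffer.append(incoming)
    return to_crossbar
```

With bm_sel set to "bypass" and bw_ena off, the incoming flit passes through untouched; with bm_sel set to "local" and bw_ena on, the buffered flit leaves while the incoming one is stored.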

Routing

Since the routes are static, we adopt source routing and encode the route in 2 bits for each router.

At the source router, the 2 bits correspond to the East, South, West and North output ports, while at all other routers they correspond to Left, Right, Straight and Core.

The directions Left, Right and Straight are relative to the input port of the flit.
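
The following sketch shows one way such a source route could be decoded into absolute output ports. The notes do not say which 2-bit value maps to which direction, so the two code tables below are assumed encodings, and the function names are hypothetical.

```python
# Compass directions in clockwise order, used to resolve relative turns.
PORTS = ["N", "E", "S", "W"]

# Assumed 2-bit encodings; the source does not specify the actual values.
SOURCE_CODE = {0b00: "E", 0b01: "S", 0b10: "W", 0b11: "N"}
RELATIVE_CODE = {0b00: "Left", 0b01: "Right", 0b10: "Straight", 0b11: "Core"}

def heading_after(heading, turn):
    """Return the new travel direction after a relative turn."""
    i = PORTS.index(heading)
    if turn == "Straight":
        return heading
    if turn == "Right":
        return PORTS[(i + 1) % 4]
    if turn == "Left":
        return PORTS[(i - 1) % 4]
    raise ValueError(f"unknown turn: {turn}")

def decode_route(codes):
    """Map a list of 2-bit codes (one per router) to output ports."""
    heading = SOURCE_CODE[codes[0]]      # absolute port at the source router
    ports = [heading]
    for code in codes[1:]:
        turn = RELATIVE_CODE[code]
        if turn == "Core":               # eject to the local core and stop
            ports.append("Core")
            break
        heading = heading_after(heading, turn)
        ports.append(heading)
    return ports
```

For instance, `decode_route([0b00, 0b10, 0b01, 0b11])` leaves the source heading East, continues straight, turns right to head South, and then ejects at the core.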

In this work, we avoid network deadlocks by enforcing a deadlock-free turn model across the routes for all flows.

Flow control

A router needs to keep track of free VCs at the endpoint of an arbitrary SMART route, though it does not know the SMART route till runtime.

We solve this problem by using a reverse credit mesh network, similar to the forward data mesh network that delivers flits.

The only overhead of the credit mesh network is a [log2(# VCs) + 1 (valid)]-bit SMART crossbar added at each router.

For example, with 2 VCs the credit network needs a 2-bit-wide crossbar at each router (1 bit for the VC ID plus 1 valid bit). Whenever a forward route is preset, the corresponding reverse credit route is preset as well.
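
The width formula can be checked with a few lines of code. This is a small sketch; the function name is hypothetical, and rounding up for VC counts that are not a power of two is an assumption.

```python
def credit_crossbar_width(num_vcs):
    """Bits per credit: VC identifier bits plus one valid bit."""
    if num_vcs < 1:
        raise ValueError("need at least one VC")
    # Number of bits needed to encode VC IDs 0 .. num_vcs - 1
    # (assumed to round up when num_vcs is not a power of two).
    vc_id_bits = (num_vcs - 1).bit_length()
    return vc_id_bits + 1

print(credit_crossbar_width(2))  # 2, matching the example in the text
```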

A credit that traverses multiple hops does not enter the intermediate routers and goes directly to the SMART crossbar which redirects it along the correct direction.

Low-swing signaling

In general, the low-swing technique can lower energy consumption and propagation delay at the cost of a reduced noise margin.

The heart of our SMART NoC is a novel low-swing clockless repeated link circuit (asynchronous repeaters, a pair of inverters) embedded within the router crossbars, that allows packets to potentially bypass all the way from source to destination core within a single clock cycle, without being latched at any intermediate router.

This is achieved by replacing clocked link drivers with asynchronous repeaters at every hop.

HPC_max

The maximum number of bypass hops, or maximum hops-per-cycle (HPC_max), is a design-time parameter, constrained by the clock period of the system, the tile size, and the wire delay of the data links between routers.
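
To make the dependency concrete, here is a back-of-the-envelope sketch. It assumes a simple model in which every hop costs one crossbar delay plus the wire delay across one tile, and HPC_max is the number of such hops that fit in a clock period. The model, the function name and all numbers are illustrative assumptions, not measurements from the SMART work.

```python
def estimate_hpc_max(clock_period_ps, tile_size_mm,
                     wire_delay_ps_per_mm, crossbar_delay_ps):
    """Estimate how many hops fit in one cycle under a linear delay model."""
    # Assumed per-hop cost: one crossbar traversal plus one tile-length link.
    per_hop_ps = crossbar_delay_ps + tile_size_mm * wire_delay_ps_per_mm
    return int(clock_period_ps // per_hop_ps)

# Hypothetical numbers: 1 GHz clock, 1 mm tiles, 90 ps/mm wire, 35 ps crossbar.
print(estimate_hpc_max(1000, 1, 90, 35))  # 8
```

Under this model, a faster clock or larger tiles lowers HPC_max, which is the trade-off the paragraph above describes.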

SMART router pipeline

SA-L (Switch Allocation Local): every start router chooses a winner for each output port from among its buffered (local) flits.

SSR: the SA-L winners broadcast a SMART-hop setup request (SSR) via dedicated repeated wires to routers up to HPC_max hops away; the SSR carries the length (in hops) that the winning flit wishes to travel.

SSR = min(HPC_max, H_remaining)
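
As a small worked example of this formula, the sketch below lists the SSR lengths a flit would request along its route. It assumes the flit always wins arbitration and covers the full requested length each time; the function names are hypothetical.

```python
def ssr_length(hpc_max, hops_remaining):
    """Length carried by the SSR: min(HPC_max, H_remaining)."""
    return min(hpc_max, hops_remaining)

def ssr_schedule(hpc_max, total_hops):
    """Successive SSR lengths for a route, assuming no contention."""
    lengths = []
    remaining = total_hops
    while remaining > 0:
        hop = ssr_length(hpc_max, remaining)
        lengths.append(hop)
        remaining -= hop
    return lengths

print(ssr_schedule(8, 19))  # [8, 8, 3]
```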

SA-G (Switch Allocation Global): all intermediate routers arbitrate among the SSRs they receive to set the BW_ena, BM_sel and XB_sel signals.

Arbitration policies:

Prio=Local: local flits have higher priority than bypass flits, i.e. Priority = 1/(hops_from_start_router).

Prio=Bypass: bypass flits have higher priority than local flits, i.e. Priority = (hops_from_start_router).
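
The two policies can be sketched as a choice between the nearest and the farthest requester. In this sketch each SSR is represented only by its distance in hops from its start router, with 0 standing for a local flit at this router; that representation and the function name are assumptions for illustration.

```python
def sa_g_winner(ssr_distances, policy):
    """Return the distance of the winning SSR, or None if there are none.

    ssr_distances: hops from each requester's start router (0 = local flit)
    policy:        "local" for Prio=Local, "bypass" for Prio=Bypass
    """
    if not ssr_distances:
        return None
    if policy == "local":
        # Priority = 1/hops: the nearest requester wins, local flits first.
        return min(ssr_distances)
    if policy == "bypass":
        # Priority = hops: the requester that has travelled farthest wins.
        return max(ssr_distances)
    raise ValueError(f"unknown policy: {policy}")

print(sa_g_winner([0, 2, 3], "local"))   # 0
print(sa_g_winner([0, 2, 3], "bypass"))  # 3
```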

Implementation of SA-G at W_in and E_out

The SA-G SSR priority arbiter arbitrates among the SSRs received along the W->E dimension and chooses the nearest one.

The SA-G output port stage checks whether there is a request from a locally buffered flit. If not, the XB signal is asserted.

In the SA-G input port stage, if no packet is being transmitted, the bypass request is granted.

ST+LT: SA-L winners that also won SA-G at their start routers traverse the crossbar and links for up to multiple hops until they are stopped by BW_ena at some router.

In summary, a SMART NoC works as follows:

  • Buffered flits at injection/start routers arbitrate locally to choose input/output port winners during SA-L.
  • SA-L winners broadcast SSRs along their chosen routes, and each router arbitrates among these SSRs during SA-G.
  • SA-G winners traverse multiple crossbars and links asynchronously within a cycle, till they are explicitly stopped and buffered at some router along their route.
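
The last step of the summary can be sketched as a walk along the route that ends at the first router asserting BW_ena. Here `bw_ena` is a hypothetical list holding the outcome of SA-G at each downstream router, and the function is an illustrative model rather than the actual hardware behaviour.

```python
def traverse(requested_hops, bw_ena):
    """Return how many hops a flit covers in one multi-hop traversal.

    requested_hops: length carried in the flit's SSR
    bw_ena:         bw_ena[i] is True if the router i + 1 hops away
                    latches the incoming flit
    """
    for hop in range(1, requested_hops + 1):
        # The flit is buffered where it asked to stop, or earlier if a
        # router on the way asserts BW_ena (for example, it lost SA-G there).
        if hop == requested_hops or bw_ena[hop - 1]:
            return hop
    return 0  # nothing requested, the flit stays where it is

print(traverse(4, [False, True, False, False]))  # 2: stopped early
print(traverse(3, [False, False, False]))        # 3: full bypass
```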
