Optimizing Physical Implementation and Timing Closure

Planning Physical Implementation

When planning a design, consider the following elements of physical implementation:
• The number of unique clock domains and their relationships
• The amount of logic in each functional block
• The location and direction of data flow between blocks
• How data routes to the functional blocks between I/O interfaces

When adding register stages to pipeline control signals, turn off the Auto Shift Register Replacement
option (Assignments > Settings > Compiler Settings > Advanced Settings (Synthesis)) for these registers. By default, chains of registers can be converted to a RAM-based implementation based on
performance and resource estimates.

Planning FPGA Resources

Plan functional blocks with appropriate global, regional, and dual-regional network signals in mind.

When floorplanning a design, consider the balance of different types of device resources, such as memory, logic, and DSP blocks in the main functional blocks.

Optimizing Timing Closure

You can use physical synthesis optimizations for combinational logic, register retiming, and register duplication techniques to optimize your design for timing closure.

Click Assignments > Settings > Compiler Settings > Advanced Settings (Fitter) to turn on physical
synthesis options.

• Physical synthesis for combinational logic—When the Perform physical synthesis for combinational logic is turned on, the report panel identifies logic that physical synthesis can modify. You can use this information to modify the design so that the associated optimization can be turned off to save compile time.
• Register duplication—This technique is most useful where registers have high fan-out, or where the fan-out is in physically distant areas of the device. Review the netlist optimizations report and consider manually duplicating registers automatically added by physical synthesis. You can also locate the original and duplicate registers in the Chip Planner. Compare their locations, and if the fan-out is improved, modify the code and turn off register duplication to save compile time.
• Register retiming—This technique is particularly useful where some combinatorial paths between registers exceed the timing goal while other paths fall short. If a design is already heavily pipelined, register retiming is less likely to provide significant performance gains since there should not be significantly unbalanced levels of logic across pipeline stages.
The application of appropriate timing constraints is essential to timing closure. Use the following
general guidelines in applying timing constraints:
• Apply multicycle constraints in your design wherever single-cycle timing analysis is not required.
• Apply False Path constraints to all asynchronous clock domain crossings or resets in the design. This technique prevents overconstraining and the Fitter focuses only on critical paths to reduce compile time. However, over constraining timing critical clock domains can sometimes provide better timing results and lower compile times than physical synthesis.
• Overconstrain rather than using physical synthesis when the slack improvement from physical
synthesis is near zero. Overconstrain the frequency requirement on timing critical clock domains by using setup uncertainty.
• When evaluating the effect of constraint changes on performance and runtime, compile the design with at least three different seeds to determine the average performance and runtime effects. Different constraint combinations produce various results. Three samples or more establishes a performance trend. Modify your constraints based on performance improvement or decline.• Leave settings at the default value whenever possible. Increasing performance constraints can increase the compile time significantly. While those increases may be necessary to close timing on a design, using the default settings whenever possible minimizes compile time.

Optimizing Critical Timing Paths

Review the register placement and routing paths by clicking Tools > Chip Planner. Large timing failures
on high fan-out control signals can be caused by any of the following conditions:
• Sub-optimal use of global networks
• Signals that traverse the chip on local routing without pipelining
• Failure to correct high fan-out by register duplication

For high-speed and high-bandwidth designs, optimize speed by reducing bus width and wire usage. To reduce wire use, move the data as little as possible.

推荐 的FPGA设计经验(3) 物理实现和时间闭环优化的更多相关文章

  1. 推荐 的FPGA设计经验(4) 时钟和寄存器控制架构特性使用

    Use Clock and Register-Control Architectural Features FPGAs provide device-wide clocks and register ...

  2. 推荐 的FPGA设计经验(1)组合逻辑优化

    主要内容摘自Quartus prime Recommended Design Practices For optimal performance, reliability, and faster ti ...

  3. 推荐 的FPGA设计经验(2)-时钟策略优化

    Optimizing Clocking Schemes Avoid using internally generated clocks (other than PLLs) wherever possi ...

  4. 至芯FPGA培训中心-1天FPGA设计集训(赠送FPGA开发板)

    至芯FPGA培训中心-1天FPGA设计集训(赠送开发板) 开课时间2014年5月3日 课程介绍 FPGA设计初级培训班是针对于FPGA设计技术初学者的课程.课程不仅是对FPGA结构资源和设计流程的描述 ...

  5. FPGA设计思想与技巧(转载)

    题记:这个笔记不是特权同学自己整理的,特权同学只是对这个笔记做了一下完善,也忘了是从那DOWNLOAD来的,首先对整理者表示感谢.这些知识点确实都很实用,这些设计思想或者也可以说是经验吧,是很值得每一 ...

  6. 【设计经验】2、ISE中ChipScope使用教程

    一.软件与硬件平台 软件平台: 操作系统:Windows 8.1 开发套件:ISE14.7 硬件平台: FPGA型号:XC6SLX45-CSG324 二.ChipScope介绍 ChipScope是X ...

  7. 【转】 FPGA设计的四种常用思想与技巧

    本文讨论的四种常用FPGA/CPLD设计思想与技巧:乒乓操作.串并转换.流水线操作.数据接口同步化,都是FPGA/CPLD逻辑设计的内在规律的体现,合理地采用这些设计思想能在FPGA/CPLD设计工作 ...

  8. FPGA 设计流程,延迟,时间

    FPGA 设计流程,延迟,时间 流程:每个时钟周期可以传输的数据比特. 延迟:从输入到时钟周期的输出数据需要经验. 时间:两个元件之间的最大延迟,最高时钟速度. 1 採用流水线能够提高 流量: 比如计 ...

  9. FPGA设计方法检查表

    -----------------------摘自<FPGA软件测试与评价技术> 中国电子信息产业发展研究院 | 编著------------------------------- 文本格 ...

随机推荐

  1. Oracle练习详解

    --1.查询emp表,显示薪水大于2000,且工作类别是MANAGER的雇员信息 select * from emp where sal > 2000and job = 'MANAGER'; - ...

  2. RabbitMQ的事件总线

    RabbitMQ的事件总线 在上文中,我们讨论了事件处理器中对象生命周期的问题,在进入新的讨论之前,首先让我们总结一下,我们已经实现了哪些内容.下面的类图描述了我们已经实现的组件及其之间的关系,貌似系 ...

  3. hdu 6214 Smallest Minimum Cut[最大流]

    hdu 6214 Smallest Minimum Cut[最大流] 题意:求最小割中最少的边数. 题解:对边权乘个比边大点的数比如300,再加1 ,最后,最大流对300取余就是边数啦.. #incl ...

  4. Tomcat组件启动流程图

    看到一张关于Tomcat组件启动流程图,觉得还可以,收藏.

  5. 【[USACO15JAN]草鉴定Grass Cownoisseur】

    这大概是我写过的除了树剖以外最长的代码了吧 首先看到有向图和重复经过等敏感词应该能想到先tarjan后缩点了吧 首先有一个naive的想法,既然我们要求只能走一次返回原点,那我们就正着反着建两遍图,分 ...

  6. 2018 Multi-University Training Contest 3 Problem F. Grab The Tree 【YY+BFS】

    传送门:http://acm.hdu.edu.cn/showproblem.php?pid=6324 Problem F. Grab The Tree Time Limit: 2000/1000 MS ...

  7. redis安装和简介(2)

    承接上篇未完成的配置...此次使用的的 Redis-x64-3.2.100 版本 一.打开redis服务器 方式一:打开 redis-server.exe 显示如下图: 图中: 显示运行进程号.当前运 ...

  8. Unity3D Errors And Fix

    Author Error: Shader warning in 'Custom/ShowAnimation': Not enough temporary registers, needs 9 (com ...

  9. tomcat如何配置俩个版本

    Java-web除了JDK,还需配置服务器(tomcat); 如何配置俩个版本的tomcat; 1.将tomcat-bin目录下的startup.bat和catalina.bat里的%CATALINA ...

  10. #leetcode刷题之路12-整数转罗马数字

    罗马数字包含以下七种字符: I, V, X, L,C,D 和 M. 字符 数值I 1V 5X 10L 50C 100D 500M 1000 例如, 罗马数字 2 写做 II ,即为两个并列的 1.12 ...