PCI Verilog IP
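1 PCI IP Design
Although PCI is gradually being phased out, quite a few applications still need this interface for communication.
The goal of this design is to provide a source-code-based PCI IP, so that the hardware is not tied to a particular FPGA model and migration to an ASIC is easier. PCI signals use ordinary 3.3 V I/O levels and, unlike PCIe, need no high-speed transceivers or 8b/10b encoding, so a source-code-based PCI IP is entirely feasible, and the IP described here has been verified.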
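1.1 Functional Requirements
- Accept parameter settings from other FPGA modules; external parameters decide when PCI data transfers start and stop
- Compatible with the PCI 2.2 specification, 33 MHz clock, 32-bit address/data
- Operates in Target mode and accepts data reads/writes and register reads/writes
- Monitors the bus state and reports errors, choosing the follow-up action (for example retry, or stopping the current transfer) according to the error type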
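1.2 IP Features
- Standard 32-bit, 33 MHz PCI Target interface;
- Wishbone master interface with block/burst read and write support;
- All data is little-endian, which matches common practice and makes it easy to store on Windows;
- Standard PCI configuration register space, with modifiable parameters;
- BAR0 register occupying 32 MBytes of PCI memory-mapped space;
- The PCI commands that must be supported are: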
0110 Memory Read
0111 Memory Write
1010 Configuration Read
1011 Configuration Write
1100 Memory Read Multiple
1110 Memory Read Line
1111 Memory Write and Invalidate
- Retry of PCI reads and writes is supported; user logic can request a retry through the Wishbone master interface;
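To make the 32 MByte BAR0 window concrete: 32 MBytes is 2^25 bytes, so only address bits [31:25] of BAR0 are writable and compared. The following is a minimal sketch, not taken from the IP source. The module and function names are hypothetical, and it assumes a non-prefetchable 32-bit memory BAR whose low bits all read back as 0.

```systemverilog
// Hypothetical illustration of a 32 MByte memory BAR (not the IP's actual code).
module bar0_demo;
    localparam int          BAR0_BITS = 25;                       // log2(32 MBytes)
    localparam logic [31:0] BAR0_MASK = 32'hFFFF_FFFF << BAR0_BITS; // 32'hFE00_0000

    // Value stored when the host writes wdata to BAR0 (assumes low bits read as 0).
    function automatic logic [31:0] bar0_write(input logic [31:0] wdata);
        return wdata & BAR0_MASK;
    endfunction

    // True when a memory address falls inside the window programmed in bar0.
    function automatic logic bar0_hit(input logic [31:0] addr,
                                      input logic [31:0] bar0);
        return (addr & BAR0_MASK) == (bar0 & BAR0_MASK);
    endfunction

    initial begin
        // Host sizing probe: write all ones, read back the size mask.
        assert (bar0_write(32'hFFFF_FFFF) == 32'hFE00_0000);
        // 0x3000_0000 is 32 MByte aligned, so it is stored unchanged.
        assert (bar0_write(32'h3000_0000) == 32'h3000_0000);
        assert ( bar0_hit(32'h3000_0010, 32'h3000_0000));
        assert ( bar0_hit(32'h31FF_FFFC, 32'h3000_0000)); // last dword of the window
        assert (!bar0_hit(32'h3200_0000, 32'h3000_0000)); // first address past it
    end
endmodule
```

The base address 0x3000_0000 is the value the simulation in section 1.5 writes to BAR0.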
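1.3 Interface Definition / Pin Description

Figure 5-1 PCI Core interfaces and parameter list (as displayed in Vivado)
The interface consists of two parts:
- PCI target interface;
- Wishbone master interface;
Parameters:
- Bars: "1BARMEM" / "1BARIO", selecting memory mode or I/O mode; the default memory mode is recommended;
- Wb endian: whether the Wishbone bus is big-endian or little-endian; the default little-endian mode is recommended;
- Wb size: Wishbone bus width; the default of 32 is recommended;
- Class code ID: PCI class code;
- Device ID: PCI device ID of this device; can be chosen freely;
- Revision ID: PCI revision ID, i.e. the version of the current firmware; can be chosen freely;
- Subsystem ID: subsystem ID, usually the same as the Device ID but may differ;
- Subsystem Vendor ID: subsystem vendor ID, set by the user, usually the same as the Vendor ID;
- Vendor ID: PCI vendor ID, identifying the device manufacturer. An official Vendor ID has to be requested from the PCI-SIG, but in a dedicated system the user may pick one as long as it does not collide with an existing device, because the CPU relies on the Vendor ID and Device ID to tell different device types apart.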
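As a rough illustration of where these ID parameters end up, the sketch below packs them into the standard PCI configuration header dwords at offsets 0x00, 0x08 and 0x2C. The Vendor ID 0x4150 and Device ID 0x0001 are the values used in the simulation in section 1.5. The revision, class code and subsystem values, and all module and function names, are hypothetical and are not the IP's real parameter names.

```systemverilog
// Hypothetical model of the ID-related configuration dwords (not the IP source).
module cfg_id_demo;
    localparam logic [15:0] VENDOR_ID   = 16'h4150;    // value seen in the simulation
    localparam logic [15:0] DEVICE_ID   = 16'h0001;    // value seen in the simulation
    localparam logic [7:0]  REVISION_ID = 8'h01;       // assumed for illustration
    localparam logic [23:0] CLASS_CODE  = 24'hFF0000;  // assumed for illustration
    localparam logic [15:0] SUBSYS_VID  = VENDOR_ID;   // "usually the same"
    localparam logic [15:0] SUBSYS_ID   = DEVICE_ID;   // "usually the same"

    // Read-only ID dwords of the standard configuration header.
    function automatic logic [31:0] cfg_id_read(input logic [7:0] offset);
        case (offset)
            8'h00:   return {DEVICE_ID, VENDOR_ID};     // Vendor ID in the low 16 bits
            8'h08:   return {CLASS_CODE, REVISION_ID};  // Revision ID in the low 8 bits
            8'h2C:   return {SUBSYS_ID, SUBSYS_VID};    // Subsystem Vendor ID in the low 16 bits
            default: return 32'h0;                      // other registers not modeled here
        endcase
    endfunction

    initial begin
        assert (cfg_id_read(8'h00) == 32'h0001_4150);
        assert (cfg_id_read(8'h08) == 32'hFF00_0001);
        assert (cfg_id_read(8'h2C) == 32'h0001_4150);
    end
endmodule
```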
Interface type declarations
Pin directions:
in Standard input only.
out Standard output only.
t/s Tri-State is a bi-directional, tri-state input/output pin.
s/t/s Sustained Tri-State is an active low tri-state signal owned and driven by one and only one agent at a time. The agent that drives an s/t/s pin low must drive it high for at least one clock before letting it float. A new agent cannot start driving a s/t/s signal any sooner than one clock after the previous owner tri-states it. A pullup is required to sustain the inactive state until another agent drives it and must be provided by the central resource. Pay particular attention to the timing of these signals to avoid timing errors.
o/d Open Drain allows multiple devices to share as a wire-OR. A pull-up is required to sustain the inactive state until another agent drives it and must be provided by the central resource.
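The s/t/s rule above (drive low, then drive high for one clock, then float) can be sketched as follows. This is an illustrative model only. The module, port and signal names are hypothetical, and the combinational pad drive is a simplification of what a real PCI output stage would do.

```systemverilog
// Hypothetical sustained tri-state (s/t/s) driver sketch.
module sts_driver (
    input  logic clk,
    input  logic rst,
    input  logic active,   // 1 = this agent wants the signal asserted (low)
    inout  wire  pad_n     // shared s/t/s line with an external pull-up
);
    logic active_q;        // 'active' delayed by one clock
    always_ff @(posedge clk) active_q <= rst ? 1'b0 : active;

    // Drive low while active, drive high for the one clock after, then float.
    assign pad_n = (active || active_q) ? !active : 1'bz;
endmodule

module tb_sts_driver;
    logic clk = 0, rst = 1, active = 0;
    wire  pad_n;

    pullup (pad_n);        // stands in for the central-resource pull-up
    sts_driver dut (.clk(clk), .rst(rst), .active(active), .pad_n(pad_n));
    always #15 clk = ~clk; // roughly 33 MHz

    initial begin
        @(posedge clk); rst    <= 0;
        @(posedge clk); active <= 1;
        @(negedge clk); assert (pad_n === 1'b0);                    // driven low
        @(posedge clk); active <= 0;
        @(negedge clk); assert (pad_n === 1'b1 && dut.active_q);    // actively driven high
        @(negedge clk); assert (pad_n === 1'b1 && !dut.active_q);   // floating, held by pull-up
        $finish;
    end
endmodule
```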
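1.3.1 Internal User Interface
The internal interface is the user interface, a Wishbone master interface. This subsection covers the interface definition and its key timing.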
| Name | Direction | Group | Description | 
| wb_adr_o | out | Internal pins | Wishbone address | 
| wb_dat_i | in | Internal pins | Wishbone data in | 
| wb_dat_o | out | Internal pins | Wishbone data out | 
| wb_sel_o | out | Internal pins | Wishbone data byte select. The select output array [SEL_O()] indicates where valid data is expected on the [DAT_I()] signal array during READ cycles, and where it is placed on the [DAT_O()] signal array during WRITE cycles. Each bit marks the corresponding data byte as valid; during data cycles it is normally 0xF, meaning all 32 data bits are valid | 
| wb_we_o | out | Internal pins | Wishbone write enable. The write enable output [WE_O] indicates whether the current local bus cycle is a READ or WRITE cycle. The signal is negated during READ cycles, and is asserted during WRITE cycles. Indicates whether the current operation is a read or a write | 
| wb_stb_o | out | Internal pins | Wishbone data strobe. The strobe output [STB_O] indicates a valid data transfer cycle. It is used to qualify various other signals on the interface such as [SEL_O()]. The SLAVE asserts either the [ACK_I], [ERR_I] or [RTY_I] signals in response to every assertion of the [STB_O] signal. Data-valid flag | 
| wb_cyc_o | out | Internal pins | Wishbone cycle. The cycle output [CYC_O], when asserted, indicates that a valid bus cycle is in progress. The signal is asserted for the duration of all bus cycles. For example, during a BLOCK transfer cycle there can be multiple data transfers. The [CYC_O] signal is asserted during the first data transfer, and remains asserted until the last data transfer. Stays asserted for the whole data transfer | 
| wb_ack_i | in | Internal pins | Wishbone acknowledge. The acknowledge input [ACK_I], when asserted, indicates the normal termination of a bus cycle. During a block transfer it may stay asserted continuously | 
| wb_rty_i | in | Internal pins | Wishbone retry. The retry input [RTY_I] indicates that the interface is not ready to accept or send data, and that the cycle should be retried. The slave asks the master to retry | 
| wb_err_i | in | Internal pins | Wishbone error. The error input [ERR_I] indicates an abnormal cycle termination. | 
| wb_int_i | in | Internal pins | Wishbone interrupt. Interrupt pin; not actually used | 
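1.3.2 Wishbone Read/Write Timing

Figure 5-2 Single read from PCI through the Wishbone interface, usable for reading user registers. The read request is valid when wb_stb_o and wb_cyc_o are both 1; the slave responds in the next clock cycle and returns the data at the requested address.

Figure 5-3 Single write from PCI through the Wishbone interface, usable for writing user registers. The write request is valid when wb_stb_o and wb_cyc_o are both 1; the slave responds in the next clock cycle, indicating that the requested write has completed.

Figure 5-4 PCI block read, usable for DMA data reads. The PCI read is converted into Wishbone reads. A read request is valid when wb_stb_o and wb_cyc_o are both 1, valid data is returned in the next clock cycle, and the read request in the last cycle needs no response.

Figure 5-5 PCI block write, usable for DMA data writes. The PCI write is converted into Wishbone writes. A write request is valid when wb_stb_o and wb_cyc_o are both 1, the write-complete response is returned in the next clock cycle, and the write request in the last cycle needs no response.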
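For the single read and single write cases described above, a user-side slave might look like the sketch below: a small register file that acknowledges one clock after wb_stb_o and wb_cyc_o are both high. This is an assumption-laden illustration, not the slave used by the author. The module names, the 32-bit address width, the register count and the reset values (register n resets to n) are all hypothetical, and block transfers with a continuously asserted ack are not modeled.

```systemverilog
// Hypothetical user-side Wishbone slave: port names follow the PCI core's
// master-side signal names, so *_o ports are inputs here.
module wb_reg_slave #(
    parameter int DEPTH = 4                 // number of 32-bit registers (assumed)
) (
    input  logic        clk,
    input  logic        rst,
    input  logic [31:0] wb_adr_o,
    input  logic [31:0] wb_dat_o,
    input  logic [3:0]  wb_sel_o,
    input  logic        wb_we_o,
    input  logic        wb_stb_o,
    input  logic        wb_cyc_o,
    output logic [31:0] wb_dat_i,
    output logic        wb_ack_i
);
    logic [31:0] regs [DEPTH];
    wire                     req = wb_stb_o && wb_cyc_o;
    wire [$clog2(DEPTH)-1:0] idx = wb_adr_o[2 +: $clog2(DEPTH)]; // dword index

    always_ff @(posedge clk) begin
        if (rst) begin
            wb_ack_i <= 1'b0;
            wb_dat_i <= '0;
            for (int i = 0; i < DEPTH; i++) regs[i] <= i;   // assumed reset pattern
        end else begin
            wb_ack_i <= req && !wb_ack_i;                   // ack one clock after the request
            if (req && !wb_ack_i) begin
                if (wb_we_o) begin
                    for (int b = 0; b < 4; b++)             // honor the byte selects
                        if (wb_sel_o[b]) regs[idx][8*b +: 8] <= wb_dat_o[8*b +: 8];
                end else begin
                    wb_dat_i <= regs[idx];
                end
            end
        end
    end
endmodule

module tb_wb_reg_slave;
    logic        clk = 0, rst = 1;
    logic [31:0] adr = '0, dat_w = '0, dat_r;
    logic [3:0]  sel = 4'hF;
    logic        we = 0, stb = 0, cyc = 0, ack;

    wb_reg_slave dut (.clk(clk), .rst(rst), .wb_adr_o(adr), .wb_dat_o(dat_w),
                      .wb_sel_o(sel), .wb_we_o(we), .wb_stb_o(stb), .wb_cyc_o(cyc),
                      .wb_dat_i(dat_r), .wb_ack_i(ack));
    always #15 clk = ~clk;

    // One single-cycle access: raise stb/cyc, wait for ack, release.
    task automatic access(input logic [31:0] a, input logic w, input logic [31:0] d);
        @(posedge clk); adr <= a; we <= w; dat_w <= d; stb <= 1; cyc <= 1;
        do @(posedge clk); while (!ack);
        stb <= 0; cyc <= 0;
    endtask

    initial begin
        repeat (2) @(posedge clk);
        rst <= 0;
        access(32'h4, 1'b0, '0);       assert (dat_r == 32'd1);    // reset value of register 1
        access(32'h0, 1'b1, 32'd100);                              // write register 0
        access(32'h0, 1'b0, '0);       assert (dat_r == 32'd100);  // read it back
        $finish;
    end
endmodule
```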
1.3.3 PCI Physical Interface
The descriptions are taken from the PCI Specification v2.2.
Table 5-3 PCI interface
| Name | Direction | Group | Description | 
| clk33 | in | System pins | Clock provides timing for all transactions on PCI and is an input to every PCI device. All other PCI signals, except RST#, INTA#, INTB#, INTC#, and INTD#, are sampled on the rising edge of CLK and all other timing parameters are defined with respect to this edge. This signal should come from the on-board oscillator | 
| rst# | in | System pins | Reset is used to bring PCI-specific registers, sequencers, and signals to a consistent state. Anytime RST# is asserted, all PCI output signals must be driven to their benign state. In general, this means they must be asynchronously tri-stated. REQ# and GNT# must both be tri-stated (they cannot be driven low or high during reset). To prevent AD, C/BE#, and PAR signals from floating during reset, the central resource may drive these lines during reset (bus parking) but only to a logic low level; they may not be driven high. This signal comes from the on-board reset | 
| ad [31:0] | t/s | Address and Data Pins | Address and Data are multiplexed on the same PCI pins. A bus transaction consists of an address phase followed by one or more data phases. PCI supports both read and write bursts. | 
| cbe [3:0]# | t/s | Address and Data Pins | Bus Command and Byte Enables are multiplexed on the same PCI pins. During the address phase of a transaction, C/BE[3::0]# define the bus command (refer to Section 3.1. for bus command definitions). During the data phase, C/BE[3::0]# are used as Byte Enables | 
| par | t/s | Address and Data Pins | Parity is even parity across AD[31::00] and C/BE[3::0]#. Parity generation is required by all PCI agents. Parity generation is not optional, it must be done by all PCI-compliant devices. Even parity is a mandatory feature | 
| frame# | s/t/s | Interface Control Pins | Cycle Frame is driven by the current master to indicate the beginning and duration of an access. FRAME# is asserted to indicate a bus transaction is beginning. While FRAME# is asserted, data transfers continue. When FRAME# is deasserted, the transaction is in the final data phase or has completed. The read/write timing diagrams show how a transfer ends | 
| irdy# | s/t/s | Interface Control Pins | Initiator Ready indicates the initiating agent's (bus master's) ability to complete the current data phase of the transaction. IRDY# is used in conjunction with TRDY#. A data phase is completed on any clock both IRDY# and TRDY# are asserted. During a write, IRDY# indicates that valid data is present on AD[31::00]. During a read, it indicates the master is prepared to accept data. Wait cycles are inserted until both IRDY# and TRDY# are asserted together. | 
| trdy# | s/t/s | Interface Control Pins | Target Ready indicates the target agent's (selected device's) ability to complete the current data phase of the transaction. TRDY# is used in conjunction with IRDY#. A data phase is completed on any clock both TRDY# and IRDY# are asserted. During a read, TRDY# indicates that valid data is present on AD[31::00]. During a write, it indicates the target is prepared to accept data. Wait cycles are inserted until both IRDY# and TRDY# are asserted together. | 
| stop# | s/t/s | Interface Control Pins | Stop indicates the current target is requesting the master to stop the current transaction. | 
| devsel# | s/t/s | Interface Control Pins | Device Select, when actively driven, indicates the driving device has decoded its address as the target of the current access. As an input, DEVSEL# indicates whether any device on the bus has been selected. | 
| idsel | in | Interface Control Pins | Initialization Device Select is used as a chip select during configuration read and write transactions. | 
| perr# | s/t/s | Error Reporting Pins | Parity Error is only for the reporting of data parity errors during all PCI transactions except a Special Cycle. The PERR# pin is sustained tri-state and must be driven active by the agent receiving data (when enabled) two clocks following the data when a data parity error is detected. The minimum duration of PERR# is one clock for each data phase that a data parity error is detected. | 
| serr# | o/d | Error Reporting Pins | System Error is for reporting address parity errors, data parity errors on the Special Cycle command, or any other system error where the result will be catastrophic. | 
| intc# | o/d | Interrupt Pins | Interrupt C is used to request an interrupt and only has meaning on a multi-function device. This project does not use this pin | 
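The even-parity rule in the par row reduces to a single XOR reduction over the 36 bits of AD and C/BE#. The sketch below shows it as a function with a few self-checks. The module and function names are hypothetical, and the one-clock delay with which PAR follows the corresponding address or data phase on the bus is not modeled.

```systemverilog
// Hypothetical parity helper (illustration only).
module pci_parity_demo;
    // PAR makes the number of 1s across AD[31:0], C/BE[3:0]# and PAR even,
    // so PAR is simply the XOR of all 36 AD and C/BE# bits.
    function automatic logic pci_par(input logic [31:0] ad,
                                     input logic [3:0]  cbe_n);
        return ^{ad, cbe_n};
    endfunction

    initial begin
        assert (pci_par(32'h0000_0000, 4'b0000) == 1'b0);  // zero ones: already even
        assert (pci_par(32'h0000_0001, 4'b0000) == 1'b1);  // one 1: PAR makes it two
        assert (pci_par(32'h3000_0000, 4'b0111) == 1'b1);  // 2 + 3 ones: odd, so PAR = 1
    end
endmodule
```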
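The list below uses the signal names provided by the customer. REQ and GNT are used by bus masters only, so this design does not need them.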
- STOP
- SERR
- PERR
- RST
- FRAME
- TRDY
- DEVSEL
- PAR
- CLK
- INTC
- REQ[3:0] Bus masters only, so not needed
- GNT[3:0] Bus masters only, so not needed
- CBE[3:0]
- AD[31:0]
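1.4 State Machine
This part is adapted from the reference state machine in the PCI Specification v2.2.
The specification describes the state machine as follows: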
“Caution needs to be taken when an agent is both a master and a target. Each must have its own state machine that can operate independently of the other to avoid deadlocks. This means that the target state machine cannot be affected by the master state machine. Although they have similar states, they cannot be built into a single machine.”

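Figure 5-6 PCI Core state machine diagram; the implementation has 2 additional states
To convert PCI operations into Wishbone reads and writes, the implementation adds the states S_DATA2 and TURN_ARE, whose main purpose is to meet the signal timing requirements.
State machine states: PCIIDLE, B_BUSY, S_DATA1, S_DATA2, BACKOFF, TURN_ARL, TURN_ARE. The detailed transitions are best understood from the code; this document does not describe them in detail for now.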
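1.5 Key PCI Timing
This section uses timing diagrams to show, for each PCI operation, the timing required by the PCI specification and the timing actually observed when simulating the PCI Core. In the simulated waveforms the IRDY signal does not strictly meet the requirements (the signal comes from the PCI master; this document does not need to supply a PCI master, only a simulation model), but this does not affect the results.

Figure 5-7 Reading a configuration register

Figure 5-8 Reading device ID 0x0001 and vendor ID 0x4150, 16 bits each, with the vendor ID in the low 16 bits. In this waveform IRDY is asserted for one clock cycle longer than in theory; this is only an artifact of the simulated PCI master logic and does not affect the correctness of the transfer, so it can be ignored

Figure 5-9 Writing a PCI configuration register: the base address register at address 0x10, data 0x3000_0000

Figure 5-10 PCI configuration register list

Figure 5-11 Reading a user register from the user address space

Figure 5-12 Writing a register in the user address space

Figure 5-13 Reading data from the user address space in block mode

Figure 5-14 Writing data to the user address space in block mode
The simulation code includes tests of register reads and writes.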
// Testbench stimulus fragment (runs inside an initial block)
start_read = 0;
addr       = 0;
length     = 1;
rst        = 1;
#200;
rst        = 0;

// Vendor ID / Device ID
pci_configuration_read(0, data_value);
assert (data_value == 'h14150);
// Command / Status
pci_configuration_read(4, data_value);
assert (data_value == 'h02000002);

// Program BAR0, re-read the ID register, then read BAR0 back
pci_configuration_write('h10, 'h3000_0000);
pci_configuration_read(0, data_value);
assert (data_value == 'h14150);
pci_configuration_read('h10, data_value);
assert (data_value == 'h3000_0000);

// Single memory reads; the address must advance by WBSIZE/8 (4 bytes) each time
pci_memory_read('h3000_0000, data_value);
assert (data_value == 0);
pci_memory_read('h3000_0004, data_value);
assert (data_value == 1);
pci_memory_read('h3000_0008, data_value);
assert (data_value == 2);

// Single memory write followed by read-back
pci_memory_write('h3000_0000, 100);
pci_memory_read('h3000_0000, data_value);
assert (data_value == 100);
pci_memory_write('h3000_0000, 0);                  // write 0 back

// Block (line) reads and writes
pci_memory_read_line('h3000_0000, 4, data_value);  // read line
pci_memory_write_line('h3000_0000, 4, 111);        // write line
pci_memory_read_line('h3000_0000, 4, data_value);  // read line
pci_memory_read_line('h3000_0000, 5, data_value);  // read line
pci_memory_read_line('h3000_0000, 6, data_value);  // read line
pci_memory_read_line('h3000_0010, 10, data_value); // read line
1.6 Supported PCI Commands
- Supported PCI commands
0110 Memory Read
0111 Memory Write
1010 Configuration Read
1011 Configuration Write
1100 Memory Read Multiple
1110 Memory Read Line
1111 Memory Write and Invalidate
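A target samples the command from C/BE[3:0]# during the address phase. The sketch below decodes the seven commands listed above; the module, constant and function names are hypothetical and do not come from the IP source. For the commands in this list, bit 0 of the command distinguishes writes from reads.

```systemverilog
// Hypothetical command decoder for the commands listed above.
module pci_cmd_demo;
    localparam logic [3:0] CMD_MEM_READ      = 4'b0110,
                           CMD_MEM_WRITE     = 4'b0111,
                           CMD_CFG_READ      = 4'b1010,
                           CMD_CFG_WRITE     = 4'b1011,
                           CMD_MEM_READ_MULT = 4'b1100,
                           CMD_MEM_READ_LINE = 4'b1110,
                           CMD_MEM_WRITE_INV = 4'b1111;

    // 1 when the command sampled in the address phase is one the core handles.
    function automatic logic cmd_supported(input logic [3:0] cmd);
        case (cmd)
            CMD_MEM_READ, CMD_MEM_WRITE, CMD_CFG_READ, CMD_CFG_WRITE,
            CMD_MEM_READ_MULT, CMD_MEM_READ_LINE, CMD_MEM_WRITE_INV:
                     return 1'b1;
            default: return 1'b0;
        endcase
    endfunction

    // For the supported commands, bit 0 is set for writes and clear for reads.
    function automatic logic cmd_is_write(input logic [3:0] cmd);
        return cmd[0];
    endfunction

    initial begin
        assert ( cmd_supported(CMD_MEM_READ));
        assert (!cmd_supported(4'b0010));            // I/O Read is not in the list
        assert ( cmd_is_write(CMD_MEM_WRITE_INV));
        assert (!cmd_is_write(CMD_MEM_READ_MULT));
    end
endmodule
```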
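The following are excerpts from the specification about these commands. Users do not need to pay special attention to them; they are kept as developer notes.

As the passage above shows, a plain Memory Read can only read 4 bytes at a time, which is very inefficient.
For the Memory Read Line / Memory Read Multiple commands, implementing the Cacheline Size register is recommended. The following passage shows what the Cacheline Size register does and the constraints on it.

"It is highly recommended that the Cacheline Size register be implemented to ensure correct use of the read commands." This applies only to master devices; a target device is only responsible for accepting requests.

A bridge reads more data than the master asked for in order to achieve high-performance transfers.
The bridge is therefore responsible for the prefetched data that the master never sees. The simplest approach is to treat that data as invalid and discard it, but note that without careful handling this can lose data.

From the description of the Cacheline Size register in Chapter 6, a target-type slave device does not need to implement the Cacheline Size register unless it has to support cacheline wrap addressing mode; the transfer size is determined by the master device.


Figure 5-15 block read

The end of a data transfer is determined by the master's FRAME# signal: FRAME# going high means the transfer is about to end, and the next data item is the last one.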
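The termination rule can be expressed with the three handshake signals from Table 5-3: a data phase completes on any clock where IRDY# and TRDY# are both asserted, and it is the last one when FRAME# is already deasserted. The monitor below is a sketch under those rules only. The module, port and counter names are hypothetical, and the testbench stimulus is a simplified burst, not a full PCI transaction.

```systemverilog
// Hypothetical burst monitor: flags completed and final data phases.
module pci_burst_monitor (
    input  logic       clk,
    input  logic       rst,
    input  logic       frame_n,
    input  logic       irdy_n,
    input  logic       trdy_n,
    output logic       xfer,       // a data phase completes on this clock
    output logic       last_xfer,  // and it is the final one of the transaction
    output logic [7:0] count       // data phases completed since reset
);
    assign xfer      = !irdy_n && !trdy_n;
    assign last_xfer = xfer && frame_n;   // FRAME# high with IRDY# low: last data phase

    always_ff @(posedge clk)
        if (rst)       count <= '0;
        else if (xfer) count <= count + 1'b1;
endmodule

module tb_pci_burst_monitor;
    logic       clk = 0, rst = 1;
    logic       frame_n = 1, irdy_n = 1, trdy_n = 1;
    logic       xfer, last_xfer;
    logic [7:0] count;

    pci_burst_monitor dut (.clk(clk), .rst(rst), .frame_n(frame_n), .irdy_n(irdy_n),
                           .trdy_n(trdy_n), .xfer(xfer), .last_xfer(last_xfer),
                           .count(count));
    always #15 clk = ~clk;

    initial begin
        @(posedge clk); rst <= 0;
        @(posedge clk); frame_n <= 0;                 // address phase
        @(posedge clk); irdy_n <= 0; trdy_n <= 0;     // data phase 1 begins
        @(posedge clk);                               // data phase 2 begins
        @(posedge clk); frame_n <= 1;                 // data phase 3 begins (last)
        @(negedge clk); assert (last_xfer);
        @(posedge clk); irdy_n <= 1; trdy_n <= 1;     // bus returns to idle
        @(negedge clk); assert (count == 8'd3 && !xfer);
        $finish;
    end
endmodule
```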