CDC: Clock Domain Crossing Handling
Advanced Verification White Paper
Five Steps to Quality CDC Verification
Ping Yeung, Ph.D., Mentor Graphics

CDC synchronizers are used to reduce the probability of metastable signals. By taking unpredictable metastable signals and creating predictable behavior, they prevent metastable values from reaching the receiving clock domain.

Figure 2: A two-register CDC synchronizer.

Metastability Effects
Even when proper CDC synchronizers are used for all clock-domain crossings and all CDC protocols are correctly implemented, metastability inevitably leads to unpredictable cycle-level timing [4, 5]. Traditional RTL simulation does not model metastability; therefore, it cannot be used to find functional problems that may arise when metastability manifests in hardware. The two scenarios below show how the cycle-level timing of RTL simulation can differ from the cycle-level timing of the actual hardware in the presence of metastability.

In Figure 3, the incoming CDC signal, cdc_d, violates the register setup time. Although it is sampled correctly in RTL simulation, the register is metastable and the output settles to 0. As a result, the hardware transition is delayed by one cycle.

Figure 3: Setup time violation: hardware transition is delayed by one cycle.

In Figure 4, the incoming CDC signal, cdc_d, violates the register hold time. In RTL simulation, it is not sampled until the next cycle. However, the register is metastable and the output settles to 1. As a result, the hardware transitions one cycle before simulation.

Figure 4: Hold time violation: hardware transition is advanced by one cycle.

CDC Verification
Many designers know that metastability can be controlled using synchronizers on CDC signals. The most common solution is to use synchronizers made up of two or more D flip-flops (2-DFF) in sequence to dramatically increase the mean time between failures (MTBF) of the crossings.
It is important to point out that simply ensuring the presence of synchronizers on the appropriate signals, while necessary, is not nearly sufficient. There are three different aspects of CDC verification that must be carefully addressed:
• Structural verification. Each synchronizer must have the correct structure for the type of signal being sent across clock domains. For example, a 2-DFF synchronizer is usually the best solution for single-bit signals but should not be used for multi-bit signals unless they are gray-coded to ensure that only one bit changes at a time [1, 2]. Multi-bit signals may be synchronized across domains using a separate control signal, an asynchronous FIFO, or other methods. Also, there should be no combinational logic inside or before a synchronizer.
• Protocol verification. Each synchronizer must follow a set of rules, called a transfer protocol, to ensure that the CDC signal is properly transferred across clock domains. For example, even the simplest 2-DFF synchronizer requires that the transmitting signal be held stable long enough to guarantee that it is captured in the receiving domain. This may not occur if the transmitting clock is faster than the receiving clock. Synchronization structures for multi-bit signals require more complex protocol checks [2, 3]. When CDC transfer protocols are violated, an error may not occur in simulation but will eventually occur in real hardware.
• Metastability verification. Problems associated with the reconvergence of CDC signals must be avoided. Reconvergence occurs when multiple signals are synchronized separately from one clock domain to another and then used by the same logic in the receiving domain (Figure 5). If that logic assumes a timing relationship between the signals, the design is not tolerant of metastability.

Figure 5: Reconvergence of CDC signals.

Running CDC verification at the chip level is not ideal, as there are too many behavioral models.
For this reason, designs with an RTL representation are best suited for CDC verification. All multiple-bit registers and buses can be identified explicitly. As CDC requirements for data and control signals are different, they can be analyzed separately. Conversely, with a gate-level representation, all buses have been synthesized into single-bit wires. It is impossible to distinguish data from control signals. Thus, only a subset of the CDC scheme is applicable.

In addition, knowledge of the internal structure of the design is essential for successful CDC verification. Performing CDC analysis on third-party IP blocks or legacy designs is not very beneficial. Without design knowledge, you may not be able to confirm the bugs or filter violations. You need access to the original designer; knowledge of the design is essential to understand the results.

Document CDC requirements
This step is important. It has the biggest impact on the quality of the result. You need to ensure that your CDC verification is aware of all relevant design characteristics, clock relationships, and the external environment. Although a set of CDC rules is already defined in the tool, this additional information activates extra design rules and filters, so the results will be more accurate and thorough. You should define the operational modes of the design, outline the clock structures, document the domains for interface signals, and determine the acceptable or unacceptable synchronization rules.
• Define operational modes. Some users may be interested in verifying that all registers are driven by only the test clock when the design is in scan mode. However, in general, we want to perform CDC verification when the design is in normal operation. You should disable the non-functional modes (e.g., test, BIST, JTAG) and define the meaningful operational modes and configurations for the tool (Figure 8).
In many cases, some parts of a design can be put into sleep or powered-down mode. The combinations can be countless. Hence, it is important to focus CDC verification on the important combinations.
• Outline clock structures. For a multiple-clock design, the primary clocks, the clock distribution structure, and the internal clock generators, dividers, and clock gating schemes are important and should be documented. The 0-In CDC tool [7] can understand these structures and extract the clock tree information automatically. It is useful to verify the original design intent against the extracted clock structures. Only crossings between asynchronous clocks should be analyzed, and there may be clock gating conditions that need to be set up properly. As the number of CDC paths is proportional to the number of clock domains, it is much more efficient to limit analysis to the relevant clock domains only.

Figure 7: Blocks suitable for running CDC verification are A, C, and D.
Figure 8: Define meaningful modes for CDC verification.

• Document interface signals. CDC verification will not check an interface signal if it does not know which clock domain it is coming from or fanning out to. You should group signals for each functional interface and clock domain and explicitly identify any input signals that are asynchronous to all clock domains. The tool will then determine whether input signals require synchronization before use. Asynchronous and synchronous reset signals should be labeled separately. The synchronization schemes used for reset signals are different from those for normal interface signals.
• Define synchronization rules. Some companies require all CDC signals to be synchronized with a particular CDC scheme (for instance, 3-level DFFs) or with a custom synchronization cell. The 0-In CDC tool [7] recognizes many CDC schemes. It is important to review those schemes first.
Then you can categorize the legal or illegal synchronization schemes, turn off the unused schemes, and capture the characteristics of the synchronization cells or modules.

Some detail-oriented designers will specify the clock domain information and capture the CDC paths in the design document. This is a practice we encourage. With this information, you can pay special attention to the specified CDC paths to ensure they are well covered.

Formalize known exceptions
It is not unusual for a simple CDC error to generate a lot of violations, especially when the CDC signal fans out to a large number of asynchronous domains. Hence, we want to identify upfront all registers that can be considered stable and do not require synchronization. This is especially true for configuration registers, status registers, or control registers that are programmed by software before the design enters normal operation. Usually this information is in the design specification. Often these control registers all reside within a single module or follow a common naming scheme. It is useful to leverage this information for CDC verification.

In addition to registers, some of the interface or internal signals are known to be stable during normal operation; for example, the internally generated reset signals, the power-down signals, the chip select, and functional enable and disable signals. These are all useful in improving the efficacy of CDC verification. Finally, in previous revisions of the chip, there may be known CDC bugs and issues found using manual or gate-level methodologies. These are painful lessons which should be captured in the verification plan.
During CDC verification, you should ensure that those CDC paths are analyzed and that all problems were fixed correctly.

Define coverage goals
Based on our experience, CDC paths are poorly verified by traditional functional verification because, in a functional simulation environment, the clock periods are defined to be constants, the clock skews and relationships are fixed, and the clock signals are well-behaved without any buffer or propagation delay. To ensure metastability effects on CDC paths are better verified, companies use various ad hoc approaches. Some randomly reduce or prolong the clock periods. Some randomly change the skews and relationships among multiple asynchronous clocks.

Regardless of the technique used, when verifying CDC transfer protocols in simulation, it is important to monitor protocol coverage to make sure the CDC paths are adequately verified. This can be done by using a checker from an assertion library [9, 10, 11] to capture the semantics of the protocol. Checkers collect coverage information to ensure that each CDC protocol is fully exercised. Insufficient coverage means that the design may contain undiscovered bugs.

CDC Handshake:
- Assertion:
  i. multiple requests violation
  ii. acknowledge without request violation
  iii. request drop violation
  iv. acknowledge timeout violation
- Coverage:
  i. # request asserted
  ii. # acknowledge asserted

CDC FIFO:
- Assertion:
  i. FIFO overflow violation
  ii. FIFO underflow violation
  iii. simultaneous push and pop violation
- Coverage:
  i. # push asserted
  ii. # pop asserted
  iii. maximum FIFO entry

You start by itemizing the protocols used in the CDC paths, ordering them from most important to least, and identifying the protocols that require coverage. For each CDC protocol, you will determine the assertion and coverage items, as in the code shown above.
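The handshake assertion and coverage items listed above can be sketched as a small cycle-based monitor. This is a minimal Python model written for illustration, not a checker from any of the assertion libraries cited in [9, 10, 11]; the class name `HandshakeMonitor`, the `ack_timeout` parameter, and the exact interpretation of each violation are assumptions.

```python
class HandshakeMonitor:
    """Cycle-based checker for a req/ack handshake (illustrative only)."""

    def __init__(self, ack_timeout=8):
        self.ack_timeout = ack_timeout  # hypothetical limit, in sampled cycles
        self.req_count = 0              # coverage: number of request assertions
        self.ack_count = 0              # coverage: number of acknowledge assertions
        self.violations = []
        self._prev_req = False
        self._prev_ack = False
        self._pending = False           # a request is still waiting for its ack
        self._wait = 0

    def sample(self, req, ack):
        """Call once per clock cycle with the current req and ack levels."""
        req_rise = req and not self._prev_req
        req_fall = self._prev_req and not req
        ack_rise = ack and not self._prev_ack

        if req_rise:
            self.req_count += 1
            if self._pending:
                self.violations.append("multiple requests")
            self._pending = True
            self._wait = 0

        if ack_rise:
            self.ack_count += 1
            if not self._pending:
                self.violations.append("acknowledge without request")
            self._pending = False
        elif self._pending:
            if req_fall:
                self.violations.append("request drop")
            self._wait += 1
            if self._wait > self.ack_timeout:
                self.violations.append("acknowledge timeout")
                self._pending = False

        self._prev_req, self._prev_ack = req, ack


# A well-formed transfer: req rises, ack answers, both return to 0.
monitor = HandshakeMonitor()
for req, ack in [(1, 0), (1, 0), (1, 1), (0, 1), (0, 0)]:
    monitor.sample(bool(req), bool(ack))
print(monitor.violations, monitor.req_count, monitor.ack_count)
```

In a real flow the same checks would normally live in an assertion-library checker bound to the RTL; the model is only meant to make the four violation types and the two coverage counters concrete.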
To measure how well your simulation is verifying the metastability effects, you should also monitor the alignments of the RX and the TX clocks.

Select a verification strategy
You may choose to perform exhaustive verification on each individual block or use a hierarchical approach on the entire top level of a chip. The top level represents the highest functional hierarchy of the design, excluding the pad ring, the test logic, the power controls, etc. You may be looking for a known bug or hunting for a problem that has appeared in the lab. The strategies you choose will determine how to run your CDC analysis tools. Based on our experience, there are four common strategies:
• Block-level verification. During block-level design, static analysis of the CDC structures should be run before checking in the RTL code. This ensures strict compliance with CDC schemes. If issues are discovered, they can be identified and debugged quickly at this level. The generated CDC protocol monitors can also be used with block-level functional verification methodologies, such as formal verification [8] and simulation. These ensure that CDC protocols are followed without any data loss.
• Top-level verification. During top-level integration, static analysis of the CDC structures should be re-run to check the new CDC signals created when multiple blocks are integrated together. The number of CDC signals goes up exponentially at the top level. Hence, it is important to follow the five-step planning process described above. For large and complex designs, especially ones with a lot of IP modules, a hierarchical verification approach can be used. The generated CDC protocol monitors should be included in the regression for system-level functional verification.
• Bug hunting and triage. This is performed to identify known or suspected CDC issues in the design. It is important to distinguish timing issues from CDC issues. Timing issues tend to cause the chip to fail consistently.
By changing the frequencies of the clocks or of some of the signals, the problems may disappear permanently. On the other hand, as metastability is unpredictable, CDC problems will cause the chip to fail randomly. Based on our experience, it is extremely difficult to look for CDC issues at the system or chip level. There are just too many potential candidates. Hence, the first task is to narrow down the problem to the block or subsystem level.
You should focus on blocks with asynchronous clock domains. Blocks with CDC reconvergence are especially suspicious. The random delays from different CDC paths may cause the data to be sampled incorrectly or corrupted completely. Once the candidate blocks are identified, the block-level verification strategy can be applied.
• Targeting coverage. This can be an extension of the block-level or top-level verification strategy. The goal is to ensure that all CDC protocols have been fully exercised and that the metastability effects are fully verified on all CDC paths. The coverage on the CDC protocol monitors should be examined and additional tests created to fill any coverage holes. Once the bug rate of system-level regression has stabilized, metastability can be injected into simulation. Metastability effects will change the timing of the CDC paths. If the block is not designed to handle the random delays of the CDC paths, it may fail functionally. To make debugging easier, it is better to start with a representative subset of the regression.

In the next two sections, we will elaborate on the two most commonly used strategies: block-level and top-level verification.

• Correctly implemented synchronizers
• Missing and incorrectly implemented synchronizers
• Complex synchronizers that require protocol verification
• Potential reconvergence problems

At this level, since the design description contains a lot of modules with a lot of asynchronous clock domains, most of the CDC paths fall into the complex and reconvergence categories.
The GUI environment is particularly useful in this case, as the CDC path may span multiple levels of hierarchy and several modules.

2. Run protocol verification with simulation. With the testbench environment already available for functional verification, we can run the CDC protocol monitors within the simulation environment. These monitors ensure that the CDC signal is stable when going from the TX to the RX domain, and that multiple-bit CDC data is either gray-coded or stable when it is sampled. Any assertion failure caught by the monitors means that the CDC protocol is violated and should be fixed. Functional simulation gives generous margins for CDC signals: they tend to be sufficiently stable when going through the clock domains, so this approach does a poor job of stressing the timing of the CDC paths. Hence, when running CDC protocol monitors with functional simulation, it is important to examine the corner-case coverage of the protocol monitors. Monitors which pass should have adequate coverage when the results from the whole regression suite are merged together. In order to stress the timing of the CDC paths explicitly, we can add some directed tests to the regression suite. With a constrained-random verification environment, we can tighten the timing of the stimulus generator.

3. Run metastability verification with simulation and effect injection. If your design contains potential reconvergence violations that cannot be easily waived after manual inspection, the 0-In CDC tool [7] can be used to identify reconvergence problems by injecting metastability effects dynamically. In simulation, metastability effect injectors will change the delays through synchronizers. As a result, any logic that assumes a fixed timing relationship between the outputs of the synchronizers is likely to fail. Running simulation with the effect injectors is the most cost-effective way to imitate CDC metastability effects during functional verification.
It enables any potential problems to be discovered quickly, before any hardware is built.

Debugging CDC Violations
In this section, we will describe a few common CDC violations and the techniques you can use to determine whether they are real design issues or not.

Missing synchronizer violation
An unsynchronized CDC signal is the most common violation reported during structural verification. For instance, in Figure 11, depending on the clock frequency, the RX register may not be able to sample its input reliably. This is a real problem. In other cases, violations are due to the following exceptions.

References
1. Clifford E. Cummings, "Synthesis and Scripting Techniques for Designing Multi-Asynchronous Clock Designs," SNUG 2001. Downloadable from /papers
2. Tai Ly, "The Need for an Automated Clock Domain Crossing Verification Solution," White Paper. Downloadable from /fv
3. Chris Kwok, et al., "Using Assertion-Based Verification to Verify Clock Domain Crossing Signals," DVCon 2003.
4. Tai Ly, et al., "Formally Verifying Clock Domain Crossing Jitter Using Assertion-Based Verification," DVCon 2004.
5. Tai Ly, et al., "A Methodology for Verifying Sequential Reconvergence of Clock Domain Crossing Signals," DVCon 2005.
6. Harry Foster, et al., "Assertion-Based Design," Kluwer Academic Publishers, 2003.
7. Mentor Graphics, "CDC Compiler User Guide V2.5," Feb 2007.
8. Mentor Graphics, "Formal Verification User Guide V2.5," Feb 2007.
9. Accellera, "Accellera Standard OVL Library Reference Manual," July 2006.
10. Mentor Graphics, "Questa™ Verification Library Checkers Data Book V6.2f," Jan 2007.
11. Mentor Graphics, "CheckerWare Data Book Assertion Library V2.5," Feb 2007.

Copyright © 2007 Mentor Graphics Corporation.
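The reconvergence problem and the effect of metastability injection described in the white paper can be illustrated with a small behavioral model. The sketch below is not how the 0-In tool works internally; it simply assumes that each 2-DFF synchronizer has a nominal latency of two receive-clock cycles and that a metastable first stage shifts a transition by one cycle earlier or later (the hold and setup cases of Figures 3 and 4). The names `sync_2dff` and `observed_pairs` are hypothetical.

```python
from itertools import product

NOMINAL_LATENCY = 2  # receive-clock cycles through a two-register synchronizer


def sync_2dff(tx, skew=0):
    """Receive-side view of one bit; skew of -1/0/+1 models metastability effects."""
    latency = NOMINAL_LATENCY + skew
    return [tx[t - latency] if t >= latency else 0 for t in range(len(tx))]


def observed_pairs(tx_a, tx_b, skew_a, skew_b):
    """Set of (a, b) values the receiving logic sees over the whole trace."""
    return set(zip(sync_2dff(tx_a, skew_a), sync_2dff(tx_b, skew_b)))


# Both bits rise in the same transmit cycle, so the sender only ever
# produces the states (0, 0) and (1, 1).
tx_a = tx_b = [0, 0, 1, 1, 1, 1, 1, 1]

for skew_a, skew_b in product((-1, 0, 1), repeat=2):
    unexpected = observed_pairs(tx_a, tx_b, skew_a, skew_b) - {(0, 0), (1, 1)}
    print(skew_a, skew_b, sorted(unexpected))
```

Whenever the two synchronizers are given different skews, the receiver sees a transient (1, 0) or (0, 1) state that was never sent. Logic that assumes the two signals always change together is exactly the kind of logic that metastability injection is meant to expose.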
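Cross-Clock-Domain Processing Method and Verification for High-Speed Data
HOU Honglu (侯宏录); QI Jingjing (齐晶晶)

[Abstract] To resolve the mismatch between the data acquisition rate and the processing rate of a high-speed camera, the internal memory resources of a field-programmable gate array (FPGA) are used. The operating principle of a high-speed, large-capacity asynchronous FIFO is studied, the metastability and empty/full flag problems in asynchronous FIFO operation are identified, and the asynchronous FIFO is implemented in two ways: sequential logic written in Verilog HDL, and a macro module customized with the QuartusII tools. The results show that with a write clock of 82 MHz the asynchronous FIFO supports a read clock of 50 MHz, achieving cross-clock-domain processing for the high-speed data acquisition and transmission system.
[Journal] Journal of Xi'an Technological University (《西安工业大学学报》)
[Year (volume), issue] 2015(000)006
[Pages] 7 (P434-440)
[Keywords] asynchronous FIFO; field-programmable gate array; clock domain crossing; data transmission
[Authors] HOU Honglu; QI Jingjing
[Affiliation] School of Optoelectronic Engineering, Xi'an Technological University, Xi'an 710021
[Language of the text] Chinese
[CLC number] TM615

With the development of microelectronics and image sensor technology, high-speed video acquisition has been widely applied in aerospace, medical image analysis, modern industrial automation, road traffic, and scientific research. A high-speed video acquisition system can record processes that the naked eye cannot resolve and present them clearly during later playback, providing a basis for digital image processing, analysis, and target recognition. A high-speed image acquisition and processing system contains multiple clocks, and the period and phase of data transfers in different clock domains are completely independent. The acquired data must therefore undergo cross-clock-domain processing to guarantee transfer without data loss [1].

For data transfer between different clock domains, reference [2] proposed a model verification method for an asynchronous first-in first-out (First Input First Output, FIFO) queue based on the symbolic model checking tool SMV, using SMV to verify the system model and system properties and effectively addressing the metastability caused by cross-domain signal transfer. Reference [3] proposed a method for implementing a high-speed asynchronous FIFO inside an FPGA. Reference [4] used an asynchronous FIFO for data communication between a field-programmable gate array (FPGA) and a digital signal processor (DSP); the scheme is fast, stable, reliable, and easy to implement. Reference [5] studied data synchronization for networks-on-chip with multiple clock domains and analyzed the metastability problem in cross-domain data transfer on such networks. FPGAs have rich internal resources; the internal M9K memory blocks can be configured as needed as single-port, simple dual-port, or true dual-port random access memory (RAM), as FIFO buffers, or as read-only memory (ROM) [6]. An asynchronous FIFO built from on-chip FPGA memory is a fast and effective solution, and an internal FIFO improves system stability more than an external FIFO chip does.

To achieve cross-clock-domain processing of data in a high-speed video acquisition and transmission system, this paper proposes an FPGA-based asynchronous FIFO design. The design is implemented in two ways: (1) writing the asynchronous FIFO in the Verilog hardware description language (HDL); (2) instantiating a macro module in the QuartusII tools. Experiments in a high-speed video acquisition and transmission system verified the correctness of both methods.

1 Operating principle of the asynchronous FIFO
An asynchronous FIFO is a first-in first-out memory: data that enters first is read out first, and the read clock and write clock are independent of each other [7]. An asynchronous FIFO has two sets of data lines, so a write can take place at one end while a read takes place at the other, buffering data while preserving its order.

1.1 Structure of the asynchronous FIFO
An asynchronous FIFO comprises a write clock domain and a read clock domain, and its core is a storage array built from dual-port RAM. Accessing the FIFO requires no address lines, only data lines and read/write control lines. In the write clock domain, the write port carries the write data and write control signals, and the written data is stored in the dual-port RAM. In the read clock domain, the read port carries the read data and control signals, and data is read from the dual-port RAM and passed to the next stage [8]. The most important control signals of the asynchronous FIFO, Full, Empty, Almost Full, and Almost Empty, are generated by comparing the write address with the read address. The internal structure of the asynchronous FIFO memory is shown in Fig. 1. Determining the empty/full state requires comparing the binary read and write pointers. The pointer values increment as reads and writes proceed; when a counter reaches its maximum it wraps around and continues from 0.

Fig. 1 The internal structure of asynchronous FIFO memory

1.2 Handling metastability
Metastability means that a flip-flop cannot reach a defined state within the specified time. When a flip-flop becomes metastable, neither its output level nor the time at which the output settles to a correct level can be predicted. During this period the flip-flop outputs an intermediate level or oscillates, and this level can propagate through the cascaded flip-flops along the signal path.

In digital circuits, to guarantee that each register captures its input correctly, all signals must meet certain timing requirements. The input of a register must be stable from the setup time (Tsu) before the clock edge until the hold time (Th) after it, and the signal takes a certain delay (Tco) to travel from the register input to its output. Under these conditions every register in the system has a stable state of 1 or 0, and its output voltage lies within the noise margin of the downstream gates [9]. If the signal changes inside the window defined by Tsu and Th, the register output may sit between logic high and logic low and take longer than Tco to reach a stable high or low state; while the output voltage is outside the noise margin of the downstream gates, the register output is metastable.

Metastability usually arises in systems that pass data across clock domains: because the data signal can arrive at the destination register of the asynchronous domain at any time, the Tsu and Th requirements cannot be guaranteed. Fig. 2 shows the asynchronous clocks in a cross-domain transfer; the upstream register is clocked by clka and the downstream register by clkb. When the input data of the upstream stage changes, the downstream stage may become metastable as it samples the data. Metastability cannot be eliminated completely; only its probability can be lowered.

When a binary counter is synchronized from one clock domain into another, several bits may toggle at once, so the value after the synchronizer can take many different results. A Gray-code counter toggles only one bit per increment, so no other indeterminate values appear during the domain crossing, the read and write pointers can be compared correctly, and the empty and full signals are generated accurately. Gray coding therefore effectively limits the impact of metastability on the pointers. In the Verilog HDL implementation of the asynchronous FIFO in this paper, Gray code is used for this purpose.

Fig. 2 Asynchronous clock

1.3 Empty/full flags
The most important control signals of the asynchronous FIFO are generated by comparing the read and write pointers. To ensure that data is written to and read from the FIFO correctly, and to avoid writing when full or reading when empty, generating the empty/full flags is the key to asynchronous FIFO design. To solve this problem, the pointers are compared with an additional bit. One extra bit is added above the most significant bit of the read and write pointers. When the read pointer has read through all storage locations in the memory, it carries into the extra bit and all other bits return to zero; the write pointer behaves the same way. If the most significant bits of the read and write pointers differ while the remaining bits are equal, the write address has wrapped around one more time than the read address, so the FIFO is full. If all bits of the read and write pointers are equal, the two pointers have wrapped the same number of times, so the FIFO is empty. In summary, each pointer has n bits, of which the lower n-1 bits hold the FIFO read or write address.

2 Implementation of the asynchronous FIFO
FPGAs have rich internal resources, and to simplify the system structure the internal memory resources can be configured as needed. In this design, two methods are used to implement the asynchronous FIFO with FPGA internal block RAM, to handle data transfer between the image acquisition and image processing modules, which lie in different clock domains. The asynchronous FIFO buffers the image data sent by the acquisition module so that the downstream FPGA controller can read it in the required transfer format, and at the same time performs the clock domain crossing for the pixel data. The acquired image data is written into the asynchronous FIFO under the 82 MHz clock of the acquisition system and, after buffering, is read out of the asynchronous FIFO by the image processing module.

2.1 Asynchronous FIFO in Verilog HDL
In this design the acquisition resolution is 1 280×1 024, each pixel contains 8 bits each of R, G, and B, and one line consists of 1 280 image words of 24 bits. The interface signals of the asynchronous FIFO are described in Tables 1 and 2.

Tab. 1 The input signal of asynchronous FIFO implemented by Verilog HDL
Signal       Direction  Width (bits)  Description
wr_clk       input      1             Write clock; data acquisition clock, 82 MHz
wr_rst       input      1             Write reset, active high
wr_en        input      1             Write enable, active high; output of data acquisition
data[23:0]   input      24            Write data; output of data acquisition
rd_clk       input      1             Read clock, 50 MHz
rd_rst       input      1             Read reset, active high

Tab. 2 The output signal of asynchronous FIFO implemented by Verilog HDL
Signal       Direction  Width (bits)  Description
rd_en        input      1             Read enable, active high
dout[23:0]   output     24            Read data; data output from the FIFO
aempty       output     1             Almost-empty flag
empty        output     1             Empty flag
afull        output     1             Almost-full flag
full         output     1             Full flag

Based on the analysis of the key issues above, the original paper gives the key Verilog HDL code for binary-to-Gray conversion and for generating the full signal; the empty-flag generation module can be written in a similar way.

2.2 Asynchronous FIFO generated by QuartusII
In EDA design, the macro modules built into the QuartusII software can be instantiated directly to simplify the design process. This paper uses the MegaWizard macro module wizard in Quartus II to customize the FIFO. The FPGA device is a CycloneIV, and the FIFO data width, depth, type, and other parameters are set during customization. The FIFO type is asynchronous, the data width is 24 bits, and the depth is 1 024; that is, a buffer 24 bits wide with a storage depth of 1 024 is customized for the 1 280 words of each line. The signals of the asynchronous FIFO generated by the macro module are listed in Tables 3 and 4.

Tab. 3 The input signal of asynchronous FIFO implemented by QuartusII
Signal       Direction  Width (bits)  Description
wrclk        input      1             Write clock; data acquisition clock, 82 MHz
wrreq        input      1             Write request, active high
data[23:0]   input      24            Write data; output of data acquisition
rdclk        input      1             Read clock, 50 MHz
rdreq        input      1             Read request, active high
wrfull       output     1             FIFO full signal, active high

Tab. 4 The output signal of asynchronous FIFO implemented by QuartusII
Signal        Direction  Width (bits)  Description
wrempty       output     1             FIFO empty signal, active high
wrusedw[9:0]  output     10            Number of words stored in the FIFO, synchronous to the write clock
q[23:0]       output     24            Read data; data output from the FIFO
rdfull        output     1             FIFO full signal, active high
rdempty       output     1             FIFO empty signal, active high
rdusedw[9:0]  output     10            Number of words stored in the FIFO, synchronous to the read clock

When the write request wrreq is high, on the rising edge of the write clock wrclk the 24-bit pixel data data[23:0] is written, at the 82 MHz clock rate, into the location addressed by the FIFO write pointer. The write pointer is incremented at the same time and points to the next location to be written. Writing stops when wrreq goes low or wrfull goes high, and the written data is stored line by line. When the read request rdreq is high, on the rising edge of the read clock rdclk the 24-bit pixel data q[23:0] is read, at the 50 MHz clock rate, from the location addressed by the read pointer. The read pointer is incremented at the same time and points to the next location. Reading stops when rdreq goes low or rdempty goes high.

3 Performance test results and analysis
The structure of the high-speed video acquisition system is shown in Fig. 3. To test whether the asynchronous FIFO designed in this paper meets the requirements, the design was first verified by simulation and then applied in the high-speed video acquisition system; the displayed results were observed to judge whether the asynchronous FIFO suits the system.

Fig. 3 The system structure diagram

3.1 Simulation of the logic code
Test scripts and test cases were written in Verilog HDL, and the third-party tool ModelSim was invoked for simulation, as shown in Fig. 4, to obtain the output waveforms of the FIFO signals. The figure shows that the write clock wr_clk of the asynchronous FIFO is faster than the read clock rd_clk. When the write enable wr_en is high, data is written into the RAM on the rising edge of the write clock. When the read enable rd_en is high, data is read from the RAM on the rising edge of the read clock rd_clk. When the almost-full flag afull is asserted, writing stops on the next rising clock edge and the full flag is asserted. When the almost-empty flag aempty is asserted, reading stops on the next rising clock edge and the empty flag is asserted.

Fig. 4 The simulation and verification of asynchronous FIFO

3.2 System performance test
The design is applied in a high-speed video acquisition and transmission system. The acquisition system outputs 8 channels of video data, which pass through the designed asynchronous FIFO to the FPGA control board; under FPGA control the image data is finally displayed on a liquid crystal display at a refresh rate of 60 Hz. The acquisition system uses an 82 MHz clock, so the FIFO write clock should be 82 MHz. Tracing the clock with an oscilloscope while data was written into the FIFO gave the waveform in Fig. 5, which shows 82 MHz.

Fig. 5 The write clock of asynchronous FIFO

The clock of the FPGA control board is 50 MHz, so the FIFO read clock should be 50 MHz. Tracing the clock with an oscilloscope while data was read from the FIFO gave the waveform in Fig. 6, which shows 50 MHz.

Fig. 6 The read clock of asynchronous FIFO

After simulation, the asynchronous FIFOs implemented by the two methods (Verilog HDL sequential code and the customized QuartusII macro module) were verified experimentally on an Altera Cyclone IV FPGA, the EP4CGX150DF31. With the technical specifications of the high-speed video acquisition system met, a static target was acquired in real time, and the acquired images were buffered by the asynchronous FIFO and observed on the LCD. The system was then used to film a car driving outdoors; the acquired images were passed through the asynchronous FIFO for clock domain crossing and displayed on the LCD, giving the 4 consecutive frames shown in Fig. 7. Fig. 7 shows that the image data acquired by the high-speed camera, after buffering in the asynchronous FIFO, is scanned onto the display from left to right and top to bottom, consistent with the first-in first-out principle of the asynchronous FIFO. This indicates that the asynchronous FIFO designed in this paper is practical.

Fig. 7 The image processing of the cross clock domain

3.3 Analysis of the test results
From the pictures shown on the LCD after cross-clock-domain processing of the acquired high-speed data, the following can be concluded. The performance of the asynchronous FIFO met the technical specifications, but the picture jittered during debugging; analysis showed that the synchronization signals were not strictly aligned while data was written into the asynchronous FIFO. When the acquired high-speed data was passed through the asynchronous FIFO, the exposure was not adjusted; the exposure time was too short and the displayed image was too dark. The asynchronous FIFO designed in this paper has empty/full flags, and the resulting signal path delay constrains the operating frequency of the whole system; to avoid this, the "full" signal can be omitted from the asynchronous FIFO design and only the "empty" signal generation module retained.

4 Conclusions
1) Based on a high-speed video acquisition and transmission system, this paper proposes an asynchronous FIFO memory as the solution for transferring data across clock domains. Building on an analysis of the operating principle of the asynchronous FIFO and using FPGA internal memory resources, the asynchronous FIFO is implemented in two ways: logic written in Verilog HDL, and a FIFO customized with the MegaWizard macro module wizard in Quartus II.
2) The asynchronous FIFO written in Verilog HDL converts the binary pointers to Gray code, which effectively limits metastability problems during data buffering, and it avoids writing when full and reading when empty while generating the empty/full signals. The designed asynchronous FIFO is general and portable. The FIFO customized with the Quartus II MegaWizard is quick and effective and can greatly simplify the structure of relatively complex systems.
3) Experimental verification shows that the designed asynchronous FIFO resolves the mismatch between upstream and downstream data rates in the high-speed data acquisition and transmission system and meets the intended design goals, laying a foundation for follow-up research.

[References]
[1] LU Bo, WANG Jun. Application of Asynchronous FIFO in DSP Image Acquisition System[J]. Microcontrollers and Embedded Systems, 2015, 32(1): 57. (in Chinese)
[2] LIU Bin. Design and Formal Verification of Asynchronous FIFO[D]. Changsha: National University of Defense Technology, 2011. (in Chinese)
[3] HUANG Zhong-chao, ZHAO Yu-qian. Implementation Method of High Speed Asynchronous FIFO Using FPGA[J]. Computer Engineering and Applications, 2010, 46(3): 13. (in Chinese)
[4] HU Bo, LI Peng. Application of Asynchronous FIFO in Communication Between FPGA and DSP[J]. Electronic Science and Technology, 2011, 24(3): 53. (in Chinese)
[5] ZHAO Wen-han. Research on Synchronous Multi-Clock-Domain Network-on-Chip[D]. Chengdu: University of Electronic Science and Technology of China, 2013. (in Chinese)
[6] SHI Hua-jun. Design and Implementation of Efficient Asynchronous FIFO[D]. Changsha: Hunan University, 2013. (in Chinese)
[7] XIANG Hou-zhen, ZHANG Zhi-jie, WANG Peng. FIFO Cache Technology in Video and Image Processing System Based on FPGA[J]. Video Engineering, 2012, 36(9): 41. (in Chinese)
[8] YU Zhi-heng, YE Jun-ming, DENG Di-wen. A Design of High Speed and Deep Asynchronous FIFO Based on FPGA and DDR2 SDRAM[J]. Microcomputer & Its Applications, 2011, 30(4): 34. (in Chinese)
[9] SI Lan-shan. Design of High Speed and Large Capacity Asynchronous FIFO Memory[D]. Wuxi: Jiangnan University, 2013. (in Chinese)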
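The Gray-code pointer conversion of Section 1.2 and the extra-bit empty/full comparison of Section 1.3 of the paper above can be modeled in a few lines. This is a behavioral Python sketch for illustration, not the authors' Verilog HDL; the function names are hypothetical, and `ADDR_BITS = 10` is assumed from the 1 024-deep FIFO described in Section 2.2.

```python
ADDR_BITS = 10                          # assumed: 1 024 storage locations
PTR_MASK = (1 << (ADDR_BITS + 1)) - 1   # pointer = address bits + one wrap bit


def bin_to_gray(b):
    """Binary to Gray code: adjacent values differ in exactly one bit."""
    return b ^ (b >> 1)


def gray_to_bin(g):
    """Inverse conversion, used after the pointer has crossed domains."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def is_empty(wr_ptr, rd_ptr):
    """All bits equal: both pointers have wrapped the same number of times."""
    return wr_ptr == rd_ptr


def is_full(wr_ptr, rd_ptr):
    """Wrap bit differs, address bits equal: the writer is one lap ahead."""
    return (wr_ptr ^ rd_ptr) == (1 << ADDR_BITS)


# Write 1 024 words into an empty FIFO, passing the write pointer through
# Gray code as it would be when crossing into the read clock domain.
wr_ptr = rd_ptr = 0
for _ in range(1 << ADDR_BITS):
    wr_ptr = (wr_ptr + 1) & PTR_MASK
wr_seen_by_reader = gray_to_bin(bin_to_gray(wr_ptr))
print(is_full(wr_seen_by_reader, rd_ptr), is_empty(wr_seen_by_reader, rd_ptr))
```

The comparison here is done on binary pointers after converting back from Gray code. Many hardware designs instead compare the Gray-coded pointers directly, which needs a slightly different full condition, so the model should be read as a statement of the principle rather than as a description of any particular circuit.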
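IC Design Basics Series, CDC Part 2: Clock Domain Crossing (CDC)
Generally speaking, if a design contains multiple clock domains, it will inevitably contain timing paths that cross clock domains.
If these cross-domain timing paths are handled improperly, they easily lead to metastability, glitches, multi-fanout, reconvergence, and similar problems, so that the design works unreliably or does not work at all.
1. Metastability
In sequential logic, the input signal of a DFF must remain stable for a period before and after the DFF's clock edge for the DFF to latch the correct value.
These periods are the setup time and hold time: the time the signal must be stable before the clock edge is the setup time, and the time it must remain stable after the clock edge is the hold time.
Normally, if the DFF input meets the setup time and hold time requirements, the DFF output reaches a valid logic value (high or low) within tCO (the clock-to-output delay).
Otherwise, the DFF output needs much longer than tCO to reach a valid logic value. During this time the DFF output is unstable; this is called the unstable state, or metastability.
In the figure below, if CLK B samples DA while DA is changing, DB will become metastable.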
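The setup/hold condition described above can be written as a simple window check. The sketch below is illustrative only: the function names and all timing numbers (a 7 000 ps source period, a 10 000 ps sampling period, 200 ps setup, 100 ps hold) are assumptions chosen so that the arithmetic stays in integers.

```python
def violates_setup_hold(data_change, clock_edge, t_setup, t_hold):
    """True if the data changes inside the window around the sampling edge."""
    return clock_edge - t_setup < data_change < clock_edge + t_hold


def first_violation(period_a, period_b, t_setup, t_hold, horizon):
    """First time DA (changing every period_a) violates CLK B's window, or None.

    Only the nearest CLK B edge is checked, which is sufficient while the
    window is narrower than half of period_b.
    """
    t = period_a
    while t <= horizon:
        nearest_edge = ((t + period_b // 2) // period_b) * period_b
        if violates_setup_hold(t, nearest_edge, t_setup, t_hold):
            return t
        t += period_a
    return None


# All times in picoseconds; the values are hypothetical.
print(first_violation(7_000, 10_000, 200, 100, 1_000_000))
```

With unrelated clock periods the data transitions drift across the sampling edges, so a violation eventually occurs however the initial phases are chosen. This is why the problem cannot be removed by static timing analysis alone.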
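For signals within the same clock domain, in both ASIC and FPGA design, STA makes it easy to ensure that the signals meet the setup/hold time requirements, so metastability does not occur.
For asynchronous signals, however, the phase relationship is completely uncontrolled and changes over time. Metastability is therefore bound to occur, and STA tools cannot analyze timing paths between different clock domains.
In other words, metastability between asynchronous signals cannot be avoided completely, but special circuits can be added on the cross-domain signals to reduce its negative effect on circuit function.
2. Glitch
As mentioned above, STA tools do not perform STA on signals that cross clock domains.
Cross-domain signals easily produce glitches that ultimately affect circuit function.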
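Solving Clock Domain Crossing Problems
February 18
Clock Domain Crossing: Dealing with Two Clock Domains!
Introduction: Designers sometimes need to connect systems that sit in two different clock domains. Because the interface is asynchronous (which produces setup time and hold time violations, metastability, and unreliable data transfer), it is trickier to handle than synchronous logic and needs special treatment in the interface design.
Any two systems are asynchronous to each other if either of the following holds:
(1) they run at different clock frequencies;
(2) they run at the same frequency but with different phases.
There are two schemes for handling data transfer across clock domains:
(1) interacting through handshake signals;
(2) using an asynchronous FIFO.
1.1 Interacting through handshake signals
Suppose system A transfers data to system B in this way, with handshake signals req and ack.
Handshake protocol:
Transmitter asserts the req (request) signal, asking the receiver to accept the data on the data bus.
Receiver asserts the ack (acknowledge) signal, asserting that it has accepted the data.
This way of crossing clock domains is straightforward, but it is also the most prone to metastability: the req signal sent by system A must be sampled by the clock of system B, and the ack signal sent by system B must be sampled by the clock of system A, so both sides face setup time and hold time violations.
To avoid the metastability caused by setup time and hold time violations, signals exchanged between asynchronous clock domains are usually registered through two or even three stages clocked by the local system clock, which greatly reduces the chance that a metastable value reaches the downstream logic.
The figure below uses the req signal sent from system A to system B to illustrate this method.
Of course, this approach comes at the cost of transfer rate: with two or three register stages synchronizing the signals from the other clock domain, many clock cycles are spent on the "handshake" itself.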
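The latency cost of multi-stage synchronization mentioned above can be made concrete with a small model. The sketch below assumes a four-phase req/ack handshake in which every signal change must pass through an N-stage synchronizer in the other domain, and it ignores the cycles each side spends reacting, so the result is only a rough worst-case estimate. The class `Synchronizer`, the function `handshake_round_trip_ns`, and the clock periods are hypothetical.

```python
class Synchronizer:
    """Chain of registers clocked by the local (destination) clock."""

    def __init__(self, stages=2):
        self.regs = [0] * stages

    def tick(self, d):
        """One destination clock edge: shift d in, return the last stage."""
        self.regs = [d] + self.regs[:-1]
        return self.regs[-1]


def handshake_round_trip_ns(t_a_ns, t_b_ns, stages=2):
    """Rough worst-case time for req up, ack up, req down, ack down."""
    into_b = stages * t_b_ns   # req crossing from A into B
    into_a = stages * t_a_ns   # ack crossing from B into A
    return 2 * (into_b + into_a)


# A change on the input becomes visible after `stages` local clock edges.
sync = Synchronizer(stages=2)
print([sync.tick(1) for _ in range(3)])

# Hypothetical clocks: A at 100 MHz (10 ns), B at 50 MHz (20 ns).
print(handshake_round_trip_ns(10, 20, stages=2))
print(handshake_round_trip_ns(10, 20, stages=3))
```

Under these assumptions one data item costs on the order of a hundred nanoseconds, which is why a handshake suits occasional transfers while continuous data streams are usually moved through an asynchronous FIFO instead.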
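Clock Domain Crossing Handling Methods
Cross-clock-domain processing (Cross-Clock Domain Processing), also called cross-clock-domain communication (CCDC), is a method for communicating between chips or components that run on different clocks.
It can help companies shorten manufacturing time, lower cost, improve performance and flexibility, and bring new products to market faster.
Cross-clock-domain techniques can reduce power consumption and provide efficient solutions for more applications and processing tasks.
They can also reduce the likelihood of bit errors and failed data transfers, improving system reliability.
Cross-clock-domain processing can be used to support different chips, such as processors, memories, transceivers, sensors, and controllers, as well as the interaction among them.
In some cases it can also be used to combine external firmware with the main CPU and internal chips, so that data can be transferred between the main CPU chip and external chips, forming a more complex integrated system solution.
Cross-clock-domain processing can use port techniques so that chips clocked at different frequencies work together correctly.
Each chip has a dedicated port that can produce and accept data without being restricted by the clock frequency of the other chip.
For example, if one chip uses a 200 MHz clock and another uses a 2 GHz clock, the ports allow the two chips to operate correctly without the latter disturbing the clock of the former.
Another commonly used cross-clock-domain technique is the serial bus.
It allows multiple components to communicate over a shared serial bus regardless of the clock frequencies and time offsets between them.
This technique makes it easier to access and control the signals of each chip without having to consider clock delay.
Finally, note that an implementation of cross-clock-domain processing must guarantee accuracy and reliability.
This may require suitable compensation and dedicated control logic to ensure that all chips in the system work correctly and remain stable, achieving the best performance and reliability.
In short, cross-clock-domain processing can effectively reduce cost and improve system performance and reliability.
It can also help bring new products to market faster and improve competitiveness.
Companies should use cross-clock-domain techniques to solve problems more effectively and achieve more of their goals.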
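1 Introduction
In the design of a system on a programmable chip (SOPC), data often has to be transferred between different modules or systems such as disk controllers, CD/DVD-ROM controllers, modems, and network processors.
Different functional modules often use different clock frequencies, and the transfer speeds on their control signals and data buses differ. The resulting mismatch of data rates at the module interfaces causes errors or loss in the transfer of control information or data and lowers the reliability of data transfer.
How to interconnect modules effectively and guarantee reliable data transfer in an SOPC whose asynchronous circuits use different clocks, that is, across multiple clock domains, is therefore a problem that SOPC design must address.
A basic principle generally followed in SOPC design is that each module uses a single clock internally, i.e., a synchronous design strategy, and the modules are interconnected through interfaces.
As long as the interfaces meet the timing specifications required by an asynchronous system, data can be transferred reliably across clock domains. This is the premise of multi-clock-domain SOPC design.
2 Main problems in transferring data across multiple clock domains
2.1 How metastability arises in multi-clock-domain data transfer
The flip-flop is the basic sequential element of SOPC design, and its basic parameters are the data setup time T_set and hold time T_hold.
For a rising-edge-triggered flip-flop, T_set is the minimum time the data at the flip-flop's data input must be stable before the rising clock edge arrives.
T_hold is the minimum time the data at the data input must remain stable after the rising clock edge.
If the data changes within the small "window" defined by T_set and T_hold, the captured value is indeterminate and the flip-flop operates in an indeterminate state, which is called metastability [1].
For the flip-flop to work correctly, the data at its data input should remain unchanged within this "window" before and after the rising clock edge; otherwise the flip-flop enters metastability, as shown in Figure 1.
In single-clock-domain (synchronous) design, there is no skew or misalignment between clocks, so the T_set and T_hold constraints are easy to meet.
In multiple clock domains with non-synchronous modules, if the frequency and phase difference between two clocks keep changing, i.e., the delay between the clocks is indeterminate, the T_set and T_hold constraints are hard to guarantee when data is passed directly between the clocks, and metastability results.
Data corruption caused by metastability may also trigger a chain reaction that makes the whole SOPC system malfunction.
SOPC design must avoid metastability as far as possible.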
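Solutions for Synchronizing Signals Across Clock Domains
To ensure that a system-on-chip (SoC) with multiple asynchronous clock domains runs reliably, designers must synchronize the clock and data signals that cross those domains.
Although this is not a new requirement, it has taken on new importance as multiple clock domains become more common and more complex.
Large-scale integration, strict performance requirements, and frequency scaling together lead to many clock domain crossings at many different frequencies, like a digital "perfect storm".
Clock domain crossing (CDC) problems appear in many forms and are quite difficult to evaluate.
Fortunately, the Synopsys DesignWare library provides many CDC solutions that are easy to apply; designers only need to know when and where to apply them.
This article explains the many types of synchronization problems that occur when clock and data signals cross from one clock domain to another.
In every case, the problems covered in this article involve clock domains that are asynchronous to each other.
As each problem is presented, the article outlines one or more DesignWare solutions.
The topics and solutions include:
● Basic synchronization: DW_sync
● Temporal event synchronization: DW_pulse_sync, DW_pulseack_sync
● Simple data transfer synchronization: DW_data_sync, DW_data_sync_na, DW_data_sync_1c
● Data stream synchronization: DW_fifo_s2_sf, DW_fifo_2c_df, DW_stream_sync
● Reset sequencing: DW_reset_sync
● Related-clock system data synchronization: DW_data_qsync_hl, DW_data_qsync_lh
1 The basic synchronization problem
When a signal from one clock system is used as an input to another clock system that is not synchronous with it, the signal must be synchronized.
Without synchronization, timing closure cannot be achieved.
Figure 1 shows an asynchronous input synchronized to the destination clock domain with a single register.
One problem with this approach is that metastability may occur if the flip-flop is clocked while its data input is in transition between logic 0 and logic 1.
Metastability may also occur when the setup time or hold time of the flip-flop is violated.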
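Why a single register is usually not enough can be quantified with the widely used synchronizer MTBF estimate, MTBF = exp(t_r / tau) / (T0 * f_clk * f_data), where t_r is the time available for a metastable output to resolve. The formula is a standard approximation that the article above does not itself give, and every numeric value below (tau, T0, the frequencies, and the available resolution times) is a hypothetical placeholder rather than data for any real device.

```python
import math


def mtbf_seconds(t_resolve_s, tau_s, t0_s, f_clk_hz, f_data_hz):
    """Estimated mean time between synchronization failures, in seconds."""
    return math.exp(t_resolve_s / tau_s) / (t0_s * f_clk_hz * f_data_hz)


# Hypothetical process and operating parameters.
TAU = 50e-12      # metastability resolution time constant
T0 = 100e-12      # metastability window parameter
F_CLK = 100e6     # destination clock frequency
F_DATA = 10e6     # rate of asynchronous data transitions

one_stage = mtbf_seconds(2e-9, TAU, T0, F_CLK, F_DATA)
# A second register gives the first stage one more clock period (10 ns) to resolve.
two_stage = mtbf_seconds(2e-9 + 1 / F_CLK, TAU, T0, F_CLK, F_DATA)
print(f"one stage: {one_stage:.3e} s, two stages: {two_stage:.3e} s")
```

Because the resolution time appears in the exponent, the extra clock period provided by a second register multiplies the MTBF by exp(T_clk / tau), which is why two-register synchronizers are the usual baseline.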
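Synchronizing Interrupts Across Clock Domains
Interrupt synchronization across clock domains refers to the way interrupt signals are synchronized when data is transferred or processed between multiple clock domains.
In a processor chip, data sometimes has to be transferred, or interrupts handled, between different clock domains, but because the clocks are not synchronous, data misalignment or other errors may occur.
Some form of synchronization is therefore needed to ensure the correctness of the data and the stability of the system.
Interrupt synchronization across clock domains can be implemented with hardware synchronizers, FIFO buffers, synchronization locks, and similar mechanisms, which buffer or lock the data to guarantee that it is synchronized and correct.
The timing relationship between the clock domains and the difference in their clock periods must also be considered to ensure that data is transferred with correct timing.
In practical system design, interrupt synchronization across clock domains is a very important technique that bears on system stability and performance.
Different system architectures and application scenarios therefore need to be studied in depth to find the most suitable synchronization method.
Attention must also be paid to factors such as the rate of interrupt signals and the interrupt response time to ensure that the system is real-time and reliable.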
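One common form of the "hardware synchronizer" mentioned above is a toggle-based pulse synchronizer, sketched here as a cycle-level Python model. The text does not say which circuit is meant, so this is an assumed example; the class name `TogglePulseSync` and the 4:1 clock ratio are hypothetical. The source domain flips a toggle bit for each interrupt pulse, the destination domain passes that bit through two registers, and a third register plus XOR turns each toggle change into a one-cycle pulse.

```python
class TogglePulseSync:
    """Toggle-based pulse synchronizer (behavioral model)."""

    def __init__(self):
        self.toggle = 0         # source-domain toggle flip-flop
        self.sync = [0, 0, 0]   # destination: 2 sync stages + 1 edge-detect stage

    def src_tick(self, pulse):
        """Source clock edge: each interrupt pulse flips the toggle."""
        if pulse:
            self.toggle ^= 1

    def dst_tick(self):
        """Destination clock edge: returns 1 for one cycle per toggle change."""
        self.sync = [self.toggle] + self.sync[:-1]
        return self.sync[1] ^ self.sync[2]


def run(pulse_times, ratio=4, steps=16):
    """Source clock is `ratio` times faster than the destination clock."""
    ps = TogglePulseSync()
    dst_pulses, direct_samples = [], []
    for n in range(steps):
        pulse = n in pulse_times
        ps.src_tick(pulse)
        if n % ratio == ratio - 1:              # a destination clock edge
            direct_samples.append(int(pulse))   # naive direct sampling
            dst_pulses.append(ps.dst_tick())
    return dst_pulses, direct_samples


dst_pulses, direct_samples = run({1})
print(dst_pulses, direct_samples)
```

In this run the one-cycle interrupt in the fast domain is never seen by direct sampling in the slow domain, while the toggle scheme delivers exactly one pulse. The scheme still requires successive interrupts to be spaced far enough apart for each toggle change to be captured, which is the rate constraint the text refers to.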