A synthesizable IP Core for DVB-S2 LDPC Code Decoding
- 格式:pdf
- 大小:93.17 KB
- 文档页数:6
基于FPGA的DVB-S2中IRA码编码器设计实现摘要:介绍了一种用FPGA实现DVB-S2中IRA码编码器的设计方法。
设计采用RAM组和FIFO组配合使用的方法,有效解决了校验矩阵储存和校验位生成等难点问题。
用Verilog语言实现了DVB-S2的编码器,得到的FPGA综合报告表明,在占用硬件资源不大的条件下,编码器符合DVB-S2标准的要求,能够被标准所运用。
关键字:DVB-S2;IRA码;FPGA;编码器1、引言DVB-S2[1]是欧洲数字视频广播(DVB)组织在2004年制定的第二代卫星广播标准。
DVB-S2标准采用BCH码与LDPC码级联的前向纠错(FEC)系统,其性能接近理论极限。
DVB-S2提供了11种纠错编码比率,以适应不同的调制方式和系统需求,并引入了64800和16200两种LDPC码长,码长极长是其性能优异(距香农限仅0.7dB,比DVB-S标准提高了3dB)的原因之一。
LDPC 码的编码通常非常复杂,其复杂度与码长的平方成正比。
DVB-S2的编码实现对于整个标准的开发运用和推广起着十分重要的作用,本文基于这一原因,对DVB-S2标准中的LDPC码的编码规则进行了深入研究,针对其特点用FPGA实现其编码系统,在占用较少硬件资源的条件下,核心的编码处理频率为63.725MHz,可以满足DVB-S2标准应用的需要。
2、DVB-S2中的LDPC码编码方法根据DVB-S2标准,其LDPC码的编码任务是由个信息位得到个奇偶校验位,最后得到码字。
具体过程概括为:①初始化校验位:。
②计算信息位对奇偶校验位的贡献,计算公式为,(1)其中,是第个校验位,是第个信息位,是奇偶校验位的个数。
表示奇偶校验位的地址取DVB-S2标准附录B和C 提供的相应地址列表的第行的数据。
是由码率R决定的常量,计算公式为:DVB-S2标准中给出了长码和短码对应的不同码率的值。
从这一步可以看出,DVB-S2中的码有周期为360的循环结构,极大程度降低了编译码复杂度,且有利于硬件实现。
Industry Observation ........................产业观察2网管系统发展过程网管系统经过多年才发展到现在,它从功能单 一到功能全面,网络结构随着播出设备的变化,经 历了由简单到复杂、再到简洁的过程。
总局第62号 令发布后,给我台卫星上行系统和网管系统又提出 了许多新要求,也为网管系统部分改造提供了动力 和方向,下面介绍网管建设的三个重要时期。
2.1初上网管当时是我台第一次安装网管系统称数字卫星监 控系统,它;;2004年4月数字化设备安装运行后,于2005年10月在地球站各种硬件的基础上开发的。
当时我台是数字、模拟信号同时并发,网管也 只能对数字上行系统进行管理,并且,由于基带部 分(编码器、复用器)都是哈雷公司的产品,具有独 立网管不共享数据,因此,没有基带部分。
主要功能 是监视设备当前状态,读取各个设备的各种数据信 息;通过软件可以更改设备参数;日常管理可进行日 常报表查询、保存和用户权限管理等操作。
2.2第一次网管重大改造主要有以下原因:由于模拟上行系统的停发和 总局对安全播出的要求,新增加了一套上行设备,网络管理内容增加彳多;网管经过几年的运行,存在 一些问题,我们也希望增加一些功能,比如波形监 视、电源的监控、空调监控等;按总局要求地球站要 上传上行系统的实时运行数据,因此要把基带网管 也拿入地球站网管(只能读取数据,不能控制它);希望网管界面升级,使它更美观、更好操作。
于2009年元月到4月我台与北京某公司合作对 地球站网管系统进行了升级改造,历时3个月,达到预期的效果。
2.3第二次重大改造2009年12月,总局下发了国家广播电影电视总 局《广播电视安全播出管理规定》第62号令,按照 总局的要求,我台卫星上行系统设备配置和设计必 须达标的要求,我台根据我们的实际情况主要完成 两个上行系纟分离,实现独立工作彼此互不影响,网管也必须进行全面改造升级。
图上标注的为7075k V A,按照2/3的容量以及励磁涌流为额定电 流6倍的情况,不难求出,重新合阐的瞬间,励磁电流的最高值可 以达到1634A。
下面,还可以通过励磁涌流衰减过程的计算方法,就是励磁 涌流幅值与间断角随时间变化的计算公式,来求出励磁涌流随着 时间的变化,其本身会发生什么样的变化。
在正式列出公式前,可 以先来设定几个值:k为励磁周期数,k=l,2,3…;Ik为第k个 周期时,励磁涌流的幅值;I。
为第一个周期中励磁涌流的幅值;0 j.k为间断角,那么就可以将公式列为[2]:It«= [(1+cosC 0 j.k/2)) /(1+cosC 0 j.〇/2)) ]*I 由于励磁涌流摔跤需要明显的实践过程,所以可以将第一个 周期中的励磁涌流衰减近似成0,之后还可以用变压器衰减励磁 涌流的相关计算方式来对每一个工频周期里面的间断角进行推 算,在第一个工频周期中,其间断角为121. 86° ;第五个中的间 断角为217. 01° ;第十个为262. 48°。
这样一来,就可以计算出每一个贡品中期中馈线开关533 到开关B之间的励磁涌流幅值,比如,第五个工频周期幅值是 751A,第十个则是375A。
所有,这样再结合图一就能看出,在第五 个工频周期中励磁涌流达到了 751A,已经远远超过了开关A的安全定值600A/0. Is。
所以才造成了开关A跳闸现象的发生。
3各级开关配合问题在变电站中,10k V线路产生故障时之所以会造成多级别开 关共同跳闸现象的发生,主要是因为线路上的开关存在着一定的 动作特性,从而在整条线路的末端出现短路现象发生时,就会造 成馈线开关与短路开关之前的线路开关一同发生跳闸现象。
励磁 涌流的变化与时间有关,所以通过这一点就可以大致估算出在线 路合闸后每个不同的工频周期中励磁涌流的幅值,这样就可以有 效的判断出跳闸开关的定值与动作时限的安排是否合理。
DVB-S2和DVB-S究竟有什么区别摘要DVB-S.2作为新一代数字卫星广播标准即将出台,草案已正式发布,新标准在提升原有信道传输容量的同时,还将大大拓展业务范围,得到了广电、电信、计算机等领域的广泛关注。
在与以往标准相比较的基础上,本文阐述了新标准技术上的主要优势,并简要介绍了标准的研发背景、目前的进展及未来应用前景。
一数字卫星广播标准的发展沿革与DVB-S.2数字卫星广播标准发展始于1990年代初,应用较多的制式主要有两种,即欧洲的DVB-S 标准和美国GI公司开发的Digicipher标准,两种方式互不兼容,其差别主要在于数字信号的传输方式即信道编码,而信源编码部分都采用了MPEG-2。
从欧洲电信标准协会(ETSI)的ETS 300 421算起,DVB-S作为当今广播电视领域的主流卫星传输标准,问世已逾十年,在世界范围内得到广泛应用。
1995年中央电视台通过卫星播出数字压缩加扰电视节目时,我国尚未公布将DVB-S作为试行标准,当时采用了美国GI的Digicipher系统,随着近年来国内模拟卫星传输方式淡出市场,国内各上星频道普遍采用了DVB-S技术。
十年的使用期同时意味着DVB-S的核心技术与当今相关领域的前沿技术水平渐行渐远,因此,基于当前硬件支持能力和编码算法的最新成果,开发更适应当前乃至未来中长期业务发展需求的技术标准就成为当务之急,DVB-S.2也因此呼之欲出。
DVB-S.2由JTC(联合技术委员会)制定,JTC最初于1990年由EBU(欧广联)、ETSI 联合组建,负责制定广播电视及相关领域的技术标准,1995年该组织吸纳CENELEC(欧洲电工标准化委员会)加入,后者负责广播电视接收机方面的标准化工作。
DVB-S.2的制定采用ETSI的“两步式” 程序,2004年6月,公开发布DVB-S.2草案(即Draft ETSI EN 302 307 V1.1.1),目前进入公开的意见征询阶段(2004.6.2~2004.10.1)。
DVB-S2 调制器FPGA设计⽅案1. 功能⽬标根据EN302307v121标准,实现DVB-S2发送的基带处理部分。
该产品也可以作为⼀个独⽴的DVB-S2调制器FPGA IP core 使⽤。
2. 系统框图整个系统框图如下所⽰,⽬前实现CCM模式下的⼴播⽅式发送(QPSK/8PSK)。
3. 系统实现系统使⽤统⼀的时钟Fsys =Fsymbol*N,其中Fsymbol为符号率,N为倍频数,根据符号率所处的频段各不相同,M为每个符号进⾏星座映射的⽐特,对QPSK为2,8PSK为3,保持数据输⼊的速率⼤于发送的数据速率。
并通过填充程度,控制数据输⼊的节奏。
BCHcoder使能且数据有效后两个时钟,开始输出有效数据。
ldpc_codec第⼀次使能后12个时钟后请求输⼊数据,输⼊数据有效后两个时钟,开始输出有效数据。
4. 各模块⼯作频率Band width DDS_out =FsymbolDDS_clk DAC内部倍频Fsys(gclk)=Fsymol*NCLK2(fclk)内插系数最⼩字节读取时钟024 =< ~< 48Fsymbol*24Fsymbol*4Fsymbol*22012 =< ~< 24Fsymbol*24Fsymbol*4Fsymbol*22161 6 =< ~< 12Fsymbol*48Fsymbol*8Fsymbol*442 3 =< ~< 6Fsymbol*88Fsymbol*16Fsymbol*8831.5 =< ~< 3Fsymbol*168Fsymbol*32Fsymbol*16165. 基带成帧对于CCM模式下的⼴播⽅式,它主要包含对TS包的CRC-8的计算和插⼊,以及包的分⽚(slicer),以及基带帧头(80 bit)的插⼊,最后进⾏并/串转换。
6. ⽐特交织现在流的⼆代卫星传输标准DVB-S2(ETSI EN 302 307)中,对经过编码后的数据流⽐特交织要求如下:⽐特交织器结构调制⽅式⾏数(长帧)⾏数(短帧)列数8PSK216005400316APSK162004050432APSK12960324057. 物理成帧主要包含对输⼊数据及导频的扰码,以及物理帧头(90 symbol)的插⼊。
A synthesizable IP Core for DVB-S2LDPC Code DecodingFrank Kienle,Torben Brack,Norbert WehnMicroelectronic System Design Research GroupUniversity of KaiserslauternErwin-Schr¨o dinger-Straße67663Kaiserslautern,Germany{kienle,brack,wehn}@eit.uni-kl.deAbstractThe new standard for digital video broadcast DVB-S2 features Low-Density Parity-Check(LDPC)codes as their channel coding scheme.The codes are defined for various code rates with a block size of64800which allows a trans-mission close to the theoretical limits.The decoding of LDPC is an iterative process.For DVB-S2about300000messages are processed and reordered in each of the30iterations.These huge data processing and storage requirements are a real challenge for the de-coder hardware realization,which has to fulfill the specified throughput of255MBit/s for base station applications.In this paper we will show,to the best of our knowledge, thefirst published IP LDPC decoder core for the DVB-S2 standard.We present a synthesizable IP block based on ST Microelectronics0.13µm CMOS technology.1IntroductionThe new DVB-S2standard[2]features a powerful for-ward error correction(FEC)system which enables trans-mission close to the theoretical limit(Shannon limit).This is enabled by using Low-Density Parity-Check(LDPC) codes[3]one of the most powerful codes known today which can even outperform Turbo-Codes[4].To provide flexibility11different code rates ranging from(R=1/4up to9/10)are specified with a codeword length up to64800 bits.This huge maximum codeword length is the reason for the outstanding communications performance(∼0.7dB to Shannon)of this DVB-S2LDPC code proposal,so in this paper we only focus on the codeword length of64800 bits.To yield this performance,the decoder has to iterate30 times.At each iteration up to300000data are scrambled and calculated.This huge data processing,storage and net-work/interconnect requirements is a real challenge for the decoder realization.A LDPC code can be represented by a bipartite graph. For the DVB-S2code64800so called variable nodes(VN) and64800∗(1−R)check nodes(CN)exist.The connec-tivity of these two type of nodes is specified in the standard [2].For decoding the LDPC code messages are exchanged iteratively between this two type of nodes,while the node processing is of low complexity.Within one iterationfirst the variable nodes are procesed,then the check nodes.For a fully parallel hardware realization each node is instantiated and the connections between them are hard-wired.This was shown in[5]for a1024bit LDPC code. But even for this relatively short block length severe rout-ing congestion problems exist.Therefore a partly paral-lel architecture becomes mandatory for larger block length, where only a subset of nodes are instantiated.A network has to provide the required connectivity between VN and CN nodes.But realizing any permutation pattern is very costly in terms of area,delay and power.To avoid this problem a decoderfirst design approach was presented in[6].First an architecture is specified and afterwards a code is designed whichfits this architecture. This approach is only suitable for regular LDPC code where each VN has the same number of incident edges,the CN respectively.But for an improved communications perfor-mance so called irregular LDPC codes are mandatory[7], where the VN nodes are of varying degrees.This is the case for the DVB-S2code.In[8]we have presented a design method for irregular LDPC codes which can be efficiently processed by the decoder hardware.We used so called ir-regular repeat accumulate(IRA)codes[9]which are within the class of LDPC codes with the advantage of a very sim-ple(linear)encoding complexity.In general,LDPC code encoder are very difficult to implement due to the inherent complex encoding scheme.The LDPC codes as defined in the DVB-S2standard are IRA codes,thus the encoder realization is straight for-ward.Furthermore,the DVB-S2code shows regularities which can be exploited for an efficient hardware realization.Figure1.Tanner graph for the DVB-S2LDPCcodeThese regularities are also the base for our methodology in-troduced in[8].In this paper we show how to exploit these regulari-ties and present an efficient mapping of VN and CN nodes to hardware instances.Memory area and access conflicts are most critical in this mapping process.Thus we used simulated annealing to minimize memory requirements and avoidance of RAM access conflicts.We present to the best of our knowledge thefirst IP core capable to process all specified code rates in the DVB-S2 standard.We show synthesis results using a0.13µm tech-nology.The paper is structured as follows:the DVB-S2LDPC codes and the decoding algorithm are presented in Sec-tion2.In Section3the mapping of nodes to hardware instances is explained.The overall decoder architecture is shown in Section4.Section5gives synthesis results and Section6concludes the paper.2DVB-S2LDPC CodesLDPC codes are linear block codes defined by a sparse binary matrix(parity check matrix)H.The set of valid codewords x∈C have to satisfyHx T=0,∀x∈C.(1)A column in H is associated to a bit of the codeword and each row corresponds to a parity check.A nonzero ele-ment in a row means that the corresponding bit contributes to this parity check.The code can best be described by a Tanner graph[7],a graphical representation of the as-sociations between code bits and parity checks.Code bits are shown as variable nodes(circles),and parity checks as check nodes(squares),with edges connecting them.The number of edges on each node is called the node degree.If the node degree is identical for all variable nodes,the cor-responding parity check matrix is called regular,otherwise it’s irregular.By carefully inspection of the construction rules,the DVB-S2parity check matrix consists of two distinctiveRate j f j f3k N K1/412540010800449600162001/312720014400543200216002/512864017280638880259201/281296019440732400324003/51212960259201125920388802/3134320388801021600432003/4125400432001416200486004/5116480453601812960518405/6135400486002210800540008/94720050400277200576009/10464805184030648058320Table1.Parameters describing the DVB-S2LDPC Tanner graph for different coderates parts:a random part dedicated to the systematic informa-tion,and afixed part that belongs to the parity information.The Tanner graph for DVB-S2is shown in Figure1.There exist two types of variable nodes,the information(IN)and parity nodes(PN),corresponding to the systematic and par-ity bits respectively.The permutationΠrepresents the ran-dom matrix part of the connectivity between IN and CN nodes,while the PN nodes are all of degree two and are connected in afixed zigzag pattern to the CN nodes.The N check nodes have a constant degree k.The K information nodes consist of two subsets f j and f3,with f the number of IN nodes of degree j and3.Table1summarizes the code rate dependent parameters as defined in the standard[2].The connectivity of the IN and CN nodes is defined by the DVB-S2encoding rulep j=p j⊕i m,j=(x+q(m mod360))mod N.(2) p j is the j th parity bit,i m the m th information code bit,and x,q,and N are code rate dependent parameters specified by the DVB-S2standard.Equation2determines the entries of the parity check matrix.The m th column has nonzero elements in each row j,thus the permutationΠgenerates one edge between every CN m and IN j.Thefixed zigzag connectivity of the PN and CN nodes is defined by the encoding scheme:p j=p j⊕p j−1,j=1,2,...,N−1.(3) This is a simple accumulator.The corresponding part of the parity check matrix has two nonzero elements in each column,forming a square banded matrix.This type of LDPC codes with this simple encoding procedure are also called irregular repeat accumulate(IRA)codes[9].2.1Decoding AlgorithmLDPC codes can be decoded using the message pass-ing algorithm[3].It exchanges soft-information iteratively 2a)b)parity node update (parallel)check node update (parallel)forward update (sequential)backward update (parallel)Figure 2.a)conventional message up-date scheme b)optimized message update schemebetween the variable and check nodes.The update of the nodes can be done with a canonical scheduling [3]:In the first step all variable nodes must be updated,in the second step all check nodes respectively.The processing of indi-vidual nodes within one step is independent,and can thus be parallelized.The exchanged messages are assumed to be log-likelihood ratios (LLR).Each variable node of degree i cal-culates an update of message k according to:λk =λch +i −1∑l =0,l =kλl ,(4)with λch the corresponding channel LLR of the VN and λi the LLRs of the incident edges.The check node LLR up-dates are calculated according totanh (λk /2)=i −1∏l =0,l =ktanh (λl /2).(5)For fixed-point implementations it was shown in [10]that the total quantization loss is ≤0.1db when using a 6bit message quantization compared to infinite precision.For a 5bit message quantization the loss is 0.1-0.2dB [7].2.2Optimized update of degree 2Parity NodesThe DVB standard supports LDPC codes ranging from code rate R =1/4to R =9/10.Each code has one common property:the connectivity of the check nodes caused by the accumulator of the 0is alway connected to CN 1by a variable node of degree 2and so on for all CN nodes.A variable node of degree 2has the property that the input of the first incident edge is the output of the second incident edge (plus the received channel value)and vice versa.For a sequential processing of the check nodes (e.g.from left to right in Figure 1)an already updated message can directly passed to the next check node due to the simple zigzag con-nectivity.This immediate message update changes the con-Rate qE PN E IN Addr 1/413597199972002701/3120863991296003602/5108777591555204321/290647991620004503/572518392332806482/360431991728004803/445323991944005404/536259192073605765/630215992160006008/920143991800005009/101812959181440504Table 2.Code rate dependent parameters,with E the number of incident edges of IN and PN nodes and Addr the number of values re-quired to store the code stuctureventional update scheme between CN an VN nodes (Equa-tion 4).The difference in the update scheme is presented in Fig-ure 2.Only the connectivity between check nodes and par-ity nodes is depicted,the incident edges from the informa-tion nodes are omitted.Figure 2a)shows the standard belief propagation with the two phase update.In the first phase all messages from the PN to CN nodes are updated,in the sec-ond phase the messages from CN to PN nodes respectively.The message update within one phase is commutative and can be fully parallized.Figure 2b)shows our new message update scheme in which the new CN message is directly passed to the next CN node.This data flow is denoted as a forward update and corresponds to a sequential message update.The backwards update from the PN to CN nodes is again a parallel update.Note that a sequential backwards update would result in a maximum a posteriori (MAP)al-gorithm.This new update scheme improves the communications performance.For the same communications performance 10iterations can be saved i.e.30iterations instead of 40have to be used.Furthermore we need to store only one message instead of two messages for the next iteration,which is explained in more detail in Section 4.3Hardware mappingAs already mentioned only partly parallel architectures are feasible.Hence only a subset P of the nodes are instan-tiated.The variable and check nodes have to be mapped on these P functional units.All messages have to be stored dur-ing the iterative process,while taking care of RAM access conflicts.Furthermore we need a permutation networks which provides the connectivity of the Tanner graph.3shuffling networkVN functionalunits message mappingCheck Nodes message mappingCN functionalunitsFigure 3.Message and functional unit map-ping for R =1/2We can split the set of edges E connecting the check nodes in two subsets E IN and E PN ,indicating the connec-tions between CN/IN nodes and CN/PN nodes respectively.The subsets are shown in Table 2for each code rate.Fur-thermore the q factor of Equation 2is listed.The implemen-tation of E IN is the challenging part,since this connectivity (Π)changes for each code rate.The realization of E PN is straightforward,thus we focus on the mapping of the IN and CN nodes.Due to the varying node degrees the functional nodes process all incoming messages in a serial manner.Thus a functional node can except one message per clock cycle and produces at most one updated message per clock cycle.A careful analysis of Equation 2shows that the connec-tivity of 360edges of distinct information nodes are deter-mined by just one value x ,while q is a code rate dependent constant,see Table 2.These 360edges can be processed simultaneously by P =360functional units.Within a half iteration a func-tional unit has to process q ∗(k −2)edges.(k −2)is the number of edges between one check node and information nodes.For each code rate q was chosen to satisfy the con-straintE IN /360=q ∗(k −2).(6)It guarantees that each functional unit has to process thesame amount of nodes which simplifies the node mapping.Figure 3shows the mapping of the IN and CN nodes for the LDPC code of rate R =1/2.Always 360consecutive VN nodes are mapped to 360functional units.To each func-tional unit a RAM is associated to hold the corresponding messages (edges).Please note that for each IN of degree 8,8storage places are allocated to this VN,because each incident edge has to be stored.The check nodes mapping depends on the rate depen-dent factor q .For R =1/2the first q =90CN nodes are mapped to the first functional unit.The next 90CN nodes are mapped to the next producer and so on.Again the CN number corresponds to CN degree storage locations.This node mapping is the key for an efficient hardware realization,since it enables to use a simple shuffling net-work to provide the connectivity of the Tanner graph.The shuffling network ensures that at each cycle 360input mes-sages are shuffled to 360distinct target memories.Thus we have to store E IN /390=450shuffling and addressing infor-mation for the R =1/2code,see Table 2for the other code rates.The shuffling offsets and addresses can be extracted from the x tables provided by [2].4Decoder ArchitectureBased on the message mapping described in the previ-ous chapter,the basic architecture of the DVB-S2LDPC decoder is shown in Figure 4.It consists of functional units which can process the functionality of variable and check nodes.This is possible,since only one type of the node are processed during one half iteration.The IN message memo-ries banks hold the messages which are exchanged between information and check nodes.Furthermore we have memo-ries for storing the exchanged messages for the parity nodes (PN message memories),which are all of degree two.The address and shuffling RAM together with the shuffling net-work provides the connectivity of the Tanner graph.As mentioned the decoder processes 360nodes in par-allel so 360messages have to be provided per cycle.All 360messages are read from the same address from the IN message memory bank.Though,for the information node processing we just increment the reading address.The func-tional unit can accept each clock cycle new data,while a control flag just labels the last message belonging to a node and starts the output processing.The newly produced 360messages are then written back to the same address loca-tion but with a cyclic shift,caused by the shuffling network.To process the check nodes we have to read from dedicated addresses,provided by the address RAM.These addresses were extracted from node mapping as described in the pre-vious chapter.Again 360messages are read per clock cycle and written back to the same address after the processing via the shuffling network.This ensures that the messages are shuffled back to their original position.The processing of the parity nodes can be done con-currently during the check node processing,by using the update scheme described in Section 2.2.Each functional4Figure4.Basic architecture of our LDPC decoderunit processes consecutive check nodes(Figure3).The message which is passed during the forward update of the check nodes is kept in the functional unit.Only the mes-sages of the backward update has to be stored which re-duces the memory requirements for the zigzag connectivity to E PN/2messages.The PN message memories are only read and written during the check node phase,while the channel(CH)RAMs delivers the corresponding received channel value.We use single port SRAMs due to area and power ef-ficiency.Hence we have to take care of read/write con-flicts during the iterative process.Read/write conflicts oc-cur,since data are continously read from the360RAMs and provided to the functional units,while new processed mes-sages have to be written back.The check node processing is the most critical part. We have to read from dedicated addresses extracted dur-ing the mapping process.Therefore,the IN message mem-ory block is partitioned in4RAMs which is shown in Fig-ure5.Even if the commutativity of the message processing within a check node is exploited all write conflicts can not be avoided.Therefore a buffer is required to hold a mes-sage if writing is not possible due to a conflict.We use simulated annealing tofind the best addressing scheme to reduce RAM access conflicts and hence to minimize the buffer overhead.This optimization step ensures that only one buffer is required which holds for all code rates.Per clock cycle we read data from one RAM,and write at most 2data back to two distinct RAMs,coming from the buffers or the shuffling network.The two least significant bits of the addresses determines the assignment to a partition.Thisallows a simple controlflow,which just has to compare the reading and the writing addresses of the current clock cycle.The resulting decoder throughput T isT=I#cyc·f cyc,(7) with I the number of information bits to be decoded and #cyc the number of cycles to decode one block including the input/output(I/O)processing.The number of cycles is calculated as CP IO+It· 2·E IN P .Thus Equation7yields:T=ICP IO+It· 2·(E IN P+T latency) ·f cycle.(8)The part CP IOis the number of cycles for input/output (I/O)processing.The decoder is capable to receive10chan-nel values per clock cycle.Reading a new codeword of sizeC and writing the result of the prior processed block can bedone in parallel with reading/writing P IO data concurrently.The latency T latency for each iteration depends on the pro-cessing units and the shuffling network.5ResultsThe LDPC decoder is implemented as a synthesizable VHDL model.Results are obtained with the Synopsis Design Compiler based on a ST Microelectronics0.13µm CMOS technology.The maximum clock frequency is270 MHz under worst case conditions.The decoder is capable to process all specified code rates of the DVB standard with5Figure5.Hierarchical RAM structurethe required throughput of255Mbit/s.30iterations are as-sumed.Table3shows the synthesis results for a6bit message quantization of the channel values and the exchanged mes-sages.The overall area is22.74mm2.The area is deter-mined by different code rates.R=1/4has the largest set of parity nodes and defines the size of the PN message memories.While the rate R= 3/5has the most edges to the informtion nodes and hence determines the size of the IN message memory banks.The size of a functional node depends on the maximum IN and PN degree,respectivly(R=2/3and R=9/10).The area is splitted in three parts:RAMs,logic and the shuffling network.Storing the messages yields the major part of the RAM area with9.12mm2.It is important to note, that only an area of0.075mm2is required to store the con-nectivity of the Tanner graph.This shows the efficiency of our architectural approach.The logic area of the func-tional nodes with10.8mm2is a major part of the overall area.This is due to the requiredflexibility of the different code rates.We also placed and routed the shuffling network to test routing congestions.Due to its regularity no conges-tions resulted,its area is dominated by the logic cells.6ConclusionLow-Density Parity-Check codes are part of the new DVB-S2standard.In this paper we presented to the best of our knowledge thefirst published IP core for DVB-S2 LDPC decoding.We explained how to explore the code structure for an efficient hardware mapping and presented a decoder architecture which can process all specified code rates ranging from R=1/4to R=9/10.0.13µm technologie AREA[mm2]channel LLRs 1.997RAMs Messages9.117Address/Shuffling0.075Logic Functional Nodes10.8control logic0.2Shuffling Network0.55Total Area[mm2]22.739Table3.Synthesis Results for the DVB-S2LDPC code decoder7AcknowledgmentsThe work presented in this paper was supported by the European IST project4More4G MC-CDMA multiple an-tenna system On chip for Radio Enhancements[1].Our special thanks goes to Friedbert Berens from the Advanced System Technology Group of STM,Geneva, Switzerland,for many valuable discussions.References[1].[2]European Telecommunications Standards Institude(ETSI).Digital Video Broadcasting(DVB)Second generation fram-ing structure for broadband satellite applications;EN302.[3]R.G.Gallager.Low-Density Parity-Check Codes.M.I.T.Press,Cambridge,Massachusetts,1963.[4] C.Berrou.The Ten-Year-Old Turbo Codes are Enteringinto Service.IEEE Communications Magazine,41:110–116,Aug.2003.[5] A.Blanksby and C.J.Howland.A690-mW1-Gb/s,Rate-1/2Low-Density Parity-Check Code Decoder.IEEE Journalof Solid-State Circuits,37(3):404–412,Mar.2002.[6] E.Boutillon,J.Castura,and F.Kschischang.Decoder-first code design.In Proc.2nd International Symposiumon Turbo Codes&Related Topics,pages459–462,Brest,France,Sept.2000.[7]T.Richardson and R.Urbanke.The Renaissance of Gal-lager’s Low-Density Pariy-Check Codes.IEEE Communi-cations Magazine,41:126–131,Aug.2003.[8] F.Kienle and N.Wehn.Design Methodology for IRACodes.In Proc.2004Asia South Pacific Design AutomationConference(ASP-DAC’04),Yokohama,Japan,Jan.2004.[9]H.Jin,A.Khandekar,and R.McEliece.Irregular Repeat-Accumulate Codes.In Proc.2nd International Symposiumon Turbo Codes&Related Topics,pages1–8,Brest,France,Sept.2000.[10]T.Zhang,Z.Wang,and K.Parhi.Onfinite precision im-plementation of low-density parity-check codes decoder.InProc.International Symposium on Circuits and Systems(IS-CAS’01),Antwerp,Belgium,May2001.6。