2012 - Hunan Agricultural University Ganoderma lucidum (Lingzhi) genome
Herbgenomics-based estimation of the Panax ginseng genome size using flow cytometry and high-throughput sequencing
Zhang Xiaoyan (张小燕); Liu Zhixiang (刘志香); Liao Baosheng (廖保生); Xiao Shuiming (肖水明); Xu Jiang (徐江); Sheng Wei (盛玮)
【Abstract】Ginseng is the dried root and rhizome of Panax ginseng C. A. Mey. (Araliaceae). The lack of genomic data has restricted both basic research on ginseng and the development of the ginseng industry. Using rice (Oryza sativa ssp. Nipponbare) and soybean (Glycine max (L.) Merrill) as internal references, flow cytometry estimated the genome size of P. ginseng at about 3.42 Gb. In parallel, shotgun libraries with insert sizes of 250 bp and 500 bp were constructed and sequenced as paired-end 150 bp (PE 150) reads on the Illumina HiSeq X Ten platform. After filtering the raw data, 183.82 Gb of high-quality data were obtained. K-mer analysis estimated the genome size at 3.35 Gb, with a sequencing depth of 54.87×. Flow cytometry combined with K-mer analysis was thus used to determine the genome size of ginseng, providing basic data for whole-genome sequencing of ginseng and for herbgenomics research.
【Journal】《世界科学技术-中医药现代化》
【Year (volume), issue】2017(019)010
【Pages】5 (pp. 1724-1728)
【Keywords】ginseng; genome size; flow cytometry; high-throughput sequencing; herbgenomics
【Affiliations】School of Life Sciences, Huaibei Normal University, Huaibei 235000; Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing 100700
【Language of full text】Chinese
【CLC classification】R331
Herbgenomics uses omics technologies to study the genetic information and regulatory networks of the source species of Chinese medicines and to elucidate the molecular mechanisms by which these medicines prevent and treat human disease. It is a frontier discipline that studies Chinese medicines and their effects on the human body at the genome level [1,2].
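The relationship between the reported data volume, genome size and depth can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the K-mer formula shown (total K-mer count divided by the depth at the main peak) is the usual textbook form, which the abstract does not spell out.

```python
def sequencing_depth(clean_data_gb: float, genome_size_gb: float) -> float:
    """Average depth = total filtered bases / estimated genome size."""
    if genome_size_gb <= 0:
        raise ValueError("genome size must be positive")
    return clean_data_gb / genome_size_gb


def genome_size_from_kmers(total_kmers: float, peak_depth: float) -> float:
    """Textbook K-mer estimate: number of K-mers / K-mer depth at the main peak.

    Assumption: the abstract does not state the exact formula or K used.
    """
    if peak_depth <= 0:
        raise ValueError("peak depth must be positive")
    return total_kmers / peak_depth


# Values reported in the abstract.
CLEAN_DATA_GB = 183.82   # high-quality data after filtering
KMER_GENOME_GB = 3.35    # K-mer estimate of genome size
FLOW_GENOME_GB = 3.42    # flow cytometry estimate

depth = sequencing_depth(CLEAN_DATA_GB, KMER_GENOME_GB)
print(f"depth = {depth:.2f}x")

# Relative difference between the two genome size estimates.
rel_diff = (FLOW_GENOME_GB - KMER_GENOME_GB) / FLOW_GENOME_GB
print(f"relative difference = {rel_diff:.1%}")
```

Dividing the filtered data by the K-mer estimate reproduces the reported depth, which suggests the 54.87× figure was derived this way.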
Fine-scale map of the complete Ganoderma lucidum genome released
Author: staff correspondent
Source: 《中国中医药信息》, 2012, issue 12
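According to the Sub-health Intervention Technology Laboratory of the State Administration of Traditional Chinese Medicine, Chinese scientists published a fine-scale map of the complete Ganoderma lucidum (Lingzhi) genome in the US journal PLoS ONE in May 2012. It is also, to date, the world's first published whole genome of a facultatively parasitic medicinal and edible fungus.
Completing the genome map made it possible, for the first time in Chinese medicine research, to search for new metabolic genes and regulatory elements at genome scale. It lays a foundation for reconstructing metabolic networks and for conserving and improving Ganoderma germplasm, is expected to make G. lucidum the first "model medicinal fungus", and provides a systematic toolkit for research on fungal medicines. It is a further innovation in the modernization of Chinese medicine.
The study was led by the Sub-health Intervention Technology Laboratory of the State Administration of Traditional Chinese Medicine, in collaboration with the State Key Laboratory Cultivation Base for Crop Germplasm Innovation and Resource Utilization (Hunan Agricultural University), Hunan Wanyuan Bio-tech Co., Ltd., and BGI.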
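G. lucidum is a heterothallic basidiomycete. To study its heredity, variation and mating traits, the laboratory worked from single spores, first isolating, culturing and germinating individual G. lucidum spores.
High-purity, high-concentration genomic DNA was then extracted, libraries with different insert sizes were constructed, and the whole genome was sequenced.
Sequencing depth reached 31 times the size of the G. lucidum genome, and the sequencing results covered 93.9% of the whole genome. The results indicate a genome size of 39.9 Mb encoding about 12,000 proteins. On the basis of the genome map, a complete G. lucidum triterpene biosynthesis pathway was proposed for the first time.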
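Chen Shilin's group in China completes a chromosome-level fine-scale map of the Ganoderma lucidum genome
Yu Wenjing (于文静)
【Journal】《食药用菌》
【Year (volume), issue】2012(000)004
【Abstract】A Xinhuanet reporter learned on the 28th from the Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, that Chen Shilin's group has completed a chromosome-level fine-scale map of the G. lucidum genome. Establishing G. lucidum as a model species will help integrate Chinese materia medica research with modern biology, and signals that Chinese medicine research is moving toward the frontier of the life sciences.
The study shows that the G. lucidum genome is about 43.3 Mb, consists of 13 chromosomes, and encodes 16,113
【Pages】1 (p. 225)
【Author】Yu Wenjing (于文静)
【Affiliation】
【Language of full text】Chinese
【CLC classification】S56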
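Mining and analysis of key enzyme genes in the biosynthetic pathways of Ganoderma triterpenes and honeysuckle chlorogenic acid
Elucidating the secondary metabolic pathways of the bioactive constituents of Chinese medicines, and how those pathways are regulated, is an important part of modern Chinese medicine research.
Secondary metabolic engineering has great potential for raising the content of target products in medicinal plants. Because of the limits of traditional methods and techniques, however, our understanding of these pathways and their regulation is still coarse, so obtaining target secondary metabolites in quantity through metabolic engineering remains difficult.
Functional genomics builds on structural genomics. It uses the information that structural genomics provides, together with the large data sets produced by high-throughput sequencing, to study gene expression, regulation and function comprehensively at the genome or transcriptome level, and to explore the relationships among genes, between genes and proteins, and between genes, their products, and growth and development.
Systematically analysing the biosynthetic pathways of secondary metabolites through whole-genome or transcriptome studies, and identifying the genes involved, gives a fuller picture of the genetic background of Chinese medicines and makes it feasible to produce important active constituents by secondary metabolic engineering.
G. lucidum and honeysuckle (Jinyinhua) are traditional Chinese medicines of very high medicinal and economic value. Ganoderma triterpenes and chlorogenic acid are among their main active constituents, respectively, yet the biosynthetic pathways of these compounds have not been studied systematically.
This project applies functional genomics to the whole-genome sequencing data of the medicinal fungus Ganoderma lucidum (Chizhi) in order to identify key enzyme genes that may take part in triterpene biosynthesis. It also uses high-throughput sequencing to study the transcriptome of the medicinal plant honeysuckle, in order to clarify the biosynthetic pathway of its major constituent, chlorogenic acid, and to clone the key genes involved.
The work lays a foundation for research on the biosynthesis of Ganoderma triterpenes and chlorogenic acid.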
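Lingzhi (GANODERMA) is the dried fruiting body of Chizhi (Ganoderma lucidum) or Zizhi (Ganoderma sinense) (Chinese Pharmacopoeia, 2010 edition) and is one of China's best-known medicinal fungi.
Modern pharmacological studies indicate antitumor, antihypertensive, antiviral and immune-enhancing effects, with Ganoderma triterpenes and Ganoderma polysaccharides as the main active substances.
The whole genome of G. lucidum has been sequenced by our group. The genome is 43.3 Mb in size and contains 16,113 predicted protein-coding genes.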
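Genetic map analysis of Ganoderma ITS sequences
Ganoderma is a traditional fungus used as both medicine and food.
Modern medical research indicates that Ganoderma contains many physiologically active substances that regulate and strengthen human immunity, with reported therapeutic effects on neurasthenia, rheumatic arthritis, hypertension, hyperlipidemia, hepatitis, diabetes and tumors. With the development of molecular biology and genetic engineering, attention has turned to explaining the effects of Ganoderma at the level of genes. This experiment extracts and analyses the ITS region from the genomes of Ganoderma strains, with the aim of identifying, at the molecular level, sequences related to its physiological functions.
In bacterial rDNA, the spacer between the 16S rDNA and 23S rDNA genes is called the internal transcribed spacer (ITS). Its length and sequence vary considerably, so RFLP or sequence analysis of its amplification products can be used to classify and identify bacterial biotypes, strains, species and genera, and even to distinguish very closely related species.
In eukaryotes, ribosomal DNA consists of the ribosomal RNA genes and the adjacent spacer regions. From 5' to 3' the order is: external transcribed spacer (ETS), 18S gene, internal transcribed spacer 1 (ITS1), 5.8S gene, internal transcribed spacer 2 (ITS2), 28S gene, and intergenic spacer (IGS).
The 18S, 5.8S and 28S gene sequences are conserved in most organisms and vary little between species. The internal transcribed spacers ITS1 and ITS2, being non-coding, are under weaker selective pressure and vary more, and they provide the heritable characters needed for detailed phylogenetic analysis.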
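The 5' to 3' layout of the eukaryotic rDNA unit described above can be written down as a small data structure. This is only a sketch: the region lengths in the example are hypothetical placeholders, not measurements from Ganoderma, and the helper names are made up for illustration.

```python
# Order of the eukaryotic rDNA repeat unit, 5' -> 3', as listed in the text.
RDNA_ORDER = ["ETS", "18S", "ITS1", "5.8S", "ITS2", "28S", "IGS"]

# Conserved genes versus the more variable spacers.
CONSERVED = {"18S", "5.8S", "28S"}


def region_coordinates(lengths):
    """Map each region to 0-based, half-open (start, end) coordinates."""
    coords = {}
    pos = 0
    for name in RDNA_ORDER:
        coords[name] = (pos, pos + lengths[name])
        pos += lengths[name]
    return coords


def its_span(lengths):
    """Span from the start of ITS1 to the end of ITS2 (includes the 5.8S gene)."""
    coords = region_coordinates(lengths)
    return coords["ITS1"][0], coords["ITS2"][1]


# Hypothetical lengths in bp, for illustration only.
EXAMPLE_LENGTHS = {
    "ETS": 700, "18S": 1800, "ITS1": 200, "5.8S": 160,
    "ITS2": 200, "28S": 3400, "IGS": 2500,
}

start, end = its_span(EXAMPLE_LENGTHS)
print(f"ITS1-5.8S-ITS2 spans {start}..{end} ({end - start} bp)")
```

Because ITS1 and ITS2 are flanked by the conserved 18S and 28S genes, primers are usually placed in those conserved genes so that the variable spacers in between are amplified.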
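1 Materials and methods
1.1 Materials
Ganoderma enriched in different metal elements (iron, selenium)
1.2 Methods
1.2.1 DNA extraction
1. Main reagents and equipment
(1) CTAB high-salt extraction buffer: 50 mmol/L Tris-HCl (pH 8.0), 10 mmol/L EDTA, 0.17 mol/L NaCl, 1% (v/v) β-mercaptoethanol (added just before use), 100 g/L CTAB (cetyltrimethylammonium bromide).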
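Decoding the Ganoderma genome to establish the first medicinal model fungus
Author:
Source: 《科技创新与品牌》, 2013, issue 03
With the completion of the Human Genome Project and the arrival of the post-genomic era, model-organism research strategies have drawn growing attention from biologists worldwide: whoever first obtains the genome information of a species gains a head start in independent innovation.
The reporter learned that in the field of medicinal plants and Chinese medicine resources, no medicinal model species had been established internationally, which to some extent held back research on Chinese and natural medicines in China.
Last year, however, a paper in the Nature journal Nature Communications, on how genome analysis has advanced G. lucidum as a medicinal model fungus, filled this gap.
According to the paper, a research team at the Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, working with research institutions in the United States, France and other countries, decoded the G. lucidum genome. This made G. lucidum the first medicinal model fungus for studying the biosynthesis of active constituents of Chinese medicines, and the first medicinal model species with a whole-genome map. The Nature website selected the paper as one of the best research highlights from China.
To learn more about the study and about G. lucidum as a medicinal model fungus, this magazine recently interviewed Dr. Chen Shilin, who led the research group.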
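Dr. Chen Shilin is currently director and chief researcher of the Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences. He previously served as director of the Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, and has extensive experience in research on Chinese medicine resources.
As leader of the Ministry of Education Changjiang Scholars and Innovative Research Team in "Chinese medicine resources" and director of the National Engineering Laboratory for "breeding of endangered medicinal materials", he was the first internationally to propose and validate the ITS2 sequence as a universal barcode for identifying medicinal plants. He completed DNA barcode identification of 1,100 commonly used medicinal species, giving Chinese medicinal materials a universal "molecular ID".
The new ITS2-centred DNA barcoding system for Chinese medicinal materials has been approved for inclusion in the Chinese Pharmacopoeia, addressing at the genetic level the industry problem of distinguishing medicinal materials from their common adulterants and counterfeits.
The Herb Genome Program that he leads has also greatly advanced the use of frontier life-science technologies in medicinal plants and Chinese medicine and laid a foundation for elucidating the synthesis and regulation of active constituents in medicinal plants. It supports screening and biosynthesis research on Chinese and natural medicines, faster breeding of superior cultivars, and more scientific, larger-scale cultivation of medicinal crops.
According to the reporter, the group chose G. lucidum from among the many medicinal plants because it has long been known as the "herb of immortality", shows anticancer, antihypertensive, antiviral and immunomodulatory activities, and has been described as a cellular "factory" for producing active compounds.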
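Establishment of a transformation system for Ganoderma lucidum
Sun Moke (孙墨可); Li Xiaowei (李晓薇); Sun Xiaowen (孙晓文); Zhang Man (张曼); Tian Juan (田娟); Dong Yudi (董玉迪); Li Haiyan (李海燕)
【Abstract】Genomic DNA was extracted from oyster mushroom (Pleurotus ostreatus) by the TPS method, and the glyceraldehyde-3-phosphate dehydrogenase (GPD) promoter was cloned. The GPD promoter was used to replace the 35S promoter in the expression vector pCAMBIA1302, and the resulting pCAMBIA1302 construct was introduced into G. lucidum protoplasts by the PEG method. Restriction digestion and sequencing confirmed that the GPD promoter had been inserted into pCAMBIA1302, and PCR detection and fluorescence microscopy of the transformants showed that green fluorescent protein (GFP) had been transferred into G. lucidum. The study established a transformation system for G. lucidum, which is of significance for future molecular biology research on this fungus.
【Journal】《安徽农业科学》
【Year (volume), issue】2018(046)026
【Pages】4 (pp. 91-93, 107)
【Keywords】Ganoderma lucidum; expression vector; PEG; transformation; GFP
【Affiliations】Biotechnology Laboratory, Baicheng Academy of Agricultural Sciences, Baicheng, Jilin 137000; Engineering Research Center of the Ministry of Education for Bioreactor and Pharmaceutical Development, Jilin Agricultural University, Changchun, Jilin 130118; Reed Wetland Management Station of Zhenlai County, Zhenlai, Jilin 137300
【Language of full text】Chinese
【CLC classification】S567.3+1
G. lucidum is a traditional medicinal fungus of China, with a long history of use in China and Southeast Asian countries [1].
It contains many active constituents, including polysaccharides, terpenoids, nucleosides, alkaloids, amino acids and peptides, and trace elements.
Modern pharmacological studies indicate that G. lucidum has hepatoprotective, antitumor, anti-HIV-1 and anti-HIV-1 protease, antihistamine-release, angiotensin-inhibiting, antioxidant, immunomodulatory and anti-ageing effects [2-7].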
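Chen Shilin's group at the Institute of Medicinal Plant Development establishes Ganoderma lucidum as a model species for Chinese medicine research
The Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, together with an international team including the China Academy of Chinese Medical Sciences, the University of Tennessee Health Science Center, the University of Wisconsin-Madison, Aix-Marseille University (France) and the US National Evolutionary Synthesis Center, used optical mapping and next-generation sequencing to decode the G. lucidum genome. The team examined in depth the biosynthetic pathways of its medicinally active substances and proposed for the first time that G. lucidum serve as a model species for Chinese medicine research, a proposal recognized by international peers. The results were published online in 2012 (month ??, day ??) in Nature Communications, a Nature journal, as a Feature Image.
The study shows that the G. lucidum genome is 43.3 Mb and contains 16,113 protein-coding genes. G. lucidum is rich in proteins associated with secondary metabolite synthesis, such as CYP450s, transcription factors and transporters, indicating great potential for synthesizing diverse secondary metabolites.
Of these, 78 CYP450 genes are co-expressed with lanosterol synthase and may play key roles in ganoderic acid synthesis.
The secondary metabolism of G. lucidum is closely tied to its growth and development. The study found a potential link between the velvet family, a class of fungal development-related proteins, and triterpene synthesis, opening a new line of inquiry into the regulation of secondary metabolites in basidiomycetes.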
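Chinese medicine, as the most important traditional medicine, is gradually gaining recognition from the international scientific community.
In recent years artemisinin, the specific drug for malaria, received the Lasker Award and was discussed in Nature Medicine and Cell. Articles on the efficacy of Chinese compound formulas, such as Huangdai tablets for acute promyelocytic leukemia and Sijunzi decoction for reducing the intestinal toxicity of chemotherapy, have appeared in leading international journals including PNAS and Science Translational Medicine, indicating that the efficacy of Chinese medicines is widely accepted by the international medical community.
However, high-level research and authoritative reports on the daodi (geo-authentic) quality of Chinese medicinal materials are still lacking.
G. lucidum is a representative traditional Chinese medicine, and the completion of its high-quality genome map and associated analyses follows arsenic trioxide (the medicinal form of pishuang) for leukemia and artemisinin for malaria as a further case of Chinese medicine research that has gained international recognition and been published in a top international journal. It is the first time that research on Chinese medicinal materials has reached the international frontier.
Chen Shilin's group also found large differences between Chinese and North American Ganoderma in several secondary metabolite pathways, including non-ribosomal peptide synthesis and polyketide synthesis, and found that North American Ganoderma lacks LZ-8, an important antitumor protein. This indicates that daodi medicinal materials differ naturally from non-daodi materials at the genome level, and provides theoretical support for explaining the biological basis of how daodi medicinal materials arise.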
The Genome of Ganoderma lucidum Provides Insights into Triterpenes Biosynthesis and Wood Degradation

Dongbo Liu1,2,†, Jing Gong3,6,†, Wenkui Dai4,†, Xincong Kang1,2,†, Zhuo Huang5, Hong-Mei Zhang3,6, Wei Liu3,6, Le Liu4, Junping Ma4, Zhilan Xia1,2, Yuxin Chen1,2, Yuewen Chen1,2, Depeng Wang6,7, Peixiang Ni4, An-Yuan Guo3*, Xingyao Xiong1,8*

1 Hunan Agricultural University, Changsha, Hunan, China, 2 State Key Laboratory of Sub-health Intervention Technology, Changsha, Hunan, China, 3 Hubei Bioinformatics and Molecular Imaging Key Laboratory, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China, 4 Beijing Genome Institute (BGI-Shenzhen), Shenzhen, Guangdong, China, 5 Hunan Wanyuan Bio-tech Co., Ltd., Changsha, Hunan, China, 6 Nextomics Biosciences Co., Ltd., Wuhan, Hubei, China, 7 School of Bioscience and Bioengineering, South China University of Technology, Guangzhou, Guangdong, China, 8 Key Laboratory for Crop Germplasm Innovation and Utilization of Hunan Province, Hunan Agricultural University, Changsha, Hunan, China

Abstract

Background: Ganoderma lucidum (Reishi or Ling Zhi) is one of the most famous Traditional Chinese Medicines and has been widely used in the treatment of various human diseases in Asian countries. It is also a fungus with strong wood degradation ability and potential in bioenergy production. However, the genes, pathways and mechanisms of these functions are still unknown.

Methodology/Principal Findings: The genome of G. lucidum was sequenced and assembled into a 39.9 megabase (Mb) draft genome, which encoded 12,080 protein-coding genes, 83% of which were similar to public sequences. We performed comprehensive annotation for G. lucidum genes and made comparisons with genes in other fungal genomes. Genes in the biosynthesis of the main G. lucidum active ingredients, ganoderic acids (GAs), were characterized. Among the GAs synthases, we identified a fusion gene, the N and C termini of which are homologous to two different enzymes. Moreover, the fusion gene was only found in
basidiomycetes. As a white rot fungus with wood degradation ability, abundant carbohydrate-active enzymes and ligninolytic enzymes were identified in the G. lucidum genome and were compared with other fungi.

Conclusions/Significance: The genome sequence and detailed annotation of G. lucidum will provide new insights into function analyses, including its medicinal mechanism. The characterization of genes in triterpene biosynthesis and wood degradation will facilitate bio-engineering research in the production of its active ingredients and bioenergy.

Citation: Liu D, Gong J, Dai W, Kang X, Huang Z, et al. (2012) The Genome of Ganoderma lucidum Provides Insights into Triterpenes Biosynthesis and Wood Degradation. PLoS ONE 7(5): e36146. doi:10.1371/journal.pone.0036146

Editor: John Parkinson, Hospital for Sick Children, Canada

Received January 30, 2012; Accepted March 26, 2012; Published May 2, 2012

Copyright: © 2012 Liu et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: Funding of this project was provided by the programs National Program on Key Basic Research Project (973 Program, 2012CB723000 to DBL), Key Projects in the National Science & Technology Program (2012BAD33B00 to DBL), and National Natural Science Foundation of China (31171271 to AYG). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have the following interests to declare: JG, HMZ, and WL are interns of the Nextomics Biosciences company and DW is an employee of this company. ZH is an employee of the Hunan Wanyuan Bio-tech company. There are no patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.

* E-mail: guoay@(AG); xiongxy@(XX)

† These authors contributed
equally to this work.

Introduction

Ganoderma lucidum (Leyss. ex Fr.) Karst., Ling-Zhi in Chinese and Reishi in Japanese, belonging to the Ganodermataceae of Aphyllophorales in Basidiomycetes [1], is a widely distributed fungus in the tropics and subtropics of Asia, Africa and America [2], with the greatest diversity in China. G. lucidum is one of the most famous Traditional Chinese Medicines and has been widely used as a tonic for longevity and overall health in China for thousands of years [3]. G. lucidum has been shown to have remarkable pharmacological activities and therapeutic effects in immuno-modulation, anti-cancer, anti-radiation and detoxification for various human diseases [4–7]. However, the accumulation of its active ingredients and the pharmacological mechanisms are largely unknown. The genome sequence and gene annotation of G. lucidum will provide key resources and may speed up research on the function of G. lucidum in human health.

Ganoderic acids (GAs), one of the main active ingredients of G. lucidum, are triterpenoid secondary metabolites that have been shown to participate in many biological activities, including antitumor and antioxidant activity [8]. However, the content of GAs is very low, and it is suggested to be the quality indicator of G. lucidum in Japan [1,9]. It is suggested that the triterpene backbone of GAs could be biosynthesized via the mevalonic acid (MVA) pathway. Several genes in this pathway have been cloned in G.
lucidum, including 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR) [10], farnesyl diphosphate synthase (FPPs) [11], squalene synthase (SQS) [12], and lanosterol synthase (also known as 2,3-oxidosqualene lanosterol cyclase, OSC) [13]. However, the modification steps that follow triterpene backbone biosynthesis, such as cyclization and glycosylation, which are very important for GAs synthesis, have rarely been reported. The genome sequence is expected to help characterize the enzymes of these key steps in GAs biosynthesis.

G. lucidum is one of the white-rot fungi that grow on dead trees by degrading cellulose, hemicellulose and lignin. Lignin, one of the main polymeric components of the plant cell wall, is highly resistant to chemical and biological degradation [14]. Although some reports mention ligninolytic enzymes, the mechanism of lignin degradation is still not fully understood [14–16]. In addition, different enzymatic systems are employed in different fungi [17]. As G. lucidum is one of the dominant organisms decomposing lignocellulose, it would be interesting to work out the enzymatic system and genes of G. lucidum in wood degradation.
With the development of next-generation DNA sequencing, several macrofungi have been sequenced and analyzed to illuminate different aspects. Ohm et al. [18] studied fruiting body formation and lignocellulose degradation in Schizophyllum commune. Stajich et al. [19] completed the chromosome assembly of Coprinopsis cinerea and investigated meiotic recombination, genes and gene families, and so on. Martin et al. illustrated the different ways of genetic predisposition for symbiosis in the basidiomycete Laccaria bicolor [20] and the ascomycete Tuber melanosporum Vittad. [21]. With these fungal genomes, it is possible to make full annotation and comparison for the G. lucidum genome. The genome annotation of G. lucidum will provide important data for further function and mechanism research in G. lucidum and for comparative genomics in fungi.

In this study, we sequenced the genome of a monokaryotic G. lucidum strain isolated from China and assembled a 39.9 Mb genome. We made full annotations of the predicted genes in this genome and compared them with other fungal genomes. With integrated gene prediction and annotation, we illuminated the synthesis of GAs as a model system to study triterpenoid biosynthesis in fungi. Besides the importance of understanding the biosynthesis of this active ingredient, insights into the enzyme systems of lignocellulose degradation in G. lucidum may speed up the process of understanding the lignocellulose degradation mechanism for bioenergy applications.

Results

The genome characteristics of G. lucidum

The genome of monokaryotic G. lucidum was sequenced by a whole genome shotgun strategy, which produced 3,738 Mb of clean data after filtering low-quality and adapter-contaminated reads.
The assembly was performed with the SOAPdenovo genome assembler [22], which first generated 1,724 contigs with an N50 of 80,796 base pairs (bp) and then assembled them into 634 scaffolds with an N50 of 322,982 bp. The lengths of scaffolds ranged from 1,004 bp to 1,953,398 bp. Finally, we got a 39.9 Mb draft genome sequence for G. lucidum. Although we could not assemble these scaffolds into chromosomes, by k-mer analysis the expected genome size was 42.53 Mb, so these scaffolds covered 93.92% of the whole genome. The G+C content of the G. lucidum genome was 55.56%. The features of the assembled genome sequences are shown in Table 1.

Repeat sequences in the genome

Five software tools were used to characterize transposons, and the Tandem Repeat Finder was used to identify the tandem repeat sequences. In total, we identified 2,025,242 bp of repeat sequences, comprising 5.07% of the genome. No large-scale dispersed segmental duplication was observed. Of them, tandem repeat sequences comprised 0.57% and transposable elements (TEs) about 4.6% of the assembled genome. Among the TEs, long terminal repeats (LTR) and non-LTR transposons comprised 1.43% and 3.17% of the genome, respectively. Among the non-LTR transposons, DNA transposons (class II transposons) comprised 0.52% of the genome. The elements of DNA transposons mainly fell into four classes: Activator (hAT), Enhancer (En/spm), Harbinger and Mariner (Tc1).
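The assembly statistics quoted above (N50, genome coverage, repeat fraction) follow from simple arithmetic on sequence lengths. The sketch below shows one common way to compute N50 and re-derives the two percentages from the figures given in the text; the function names and the toy length list are illustrative assumptions, not the authors' pipeline.

```python
def n50(lengths):
    """Return N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Figures reported in the text.
ASSEMBLY_BP = 39_945_170       # total scaffold length
KMER_ESTIMATE_BP = 42_530_000  # expected genome size from k-mer analysis
REPEAT_BP = 2_025_242          # total repeat sequence

# Hypothetical scaffold lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print("toy N50:", n50(toy_lengths))

print(f"coverage of expected genome: {percent(ASSEMBLY_BP, KMER_ESTIMATE_BP):.2f}%")
print(f"repeat fraction of assembly: {percent(REPEAT_BP, ASSEMBLY_BP):.2f}%")
```

Both percentages agree with the values stated in the text (93.92% and 5.07%), which indicates that coverage was computed against the k-mer estimate and the repeat fraction against the assembled length.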
Predicted gene models

By combining several different gene predictors (see methods), we identified 12,080 protein-coding gene models, 245 tRNA, 1 rRNA and 15 snRNA with a total length of 17,343,729 bp, accounting for 43.41% of the genome (Table 1). The gene density was 3.34 genes/10 kilobases (kb) and the average size of protein-coding genes was 1,435 bp. Genes typically had small exons (average 230 bp) and introns (average 100 bp), similar to other basidiomycetes [20]. There were on average 6.25 exons per gene. Notably, the G+C content in protein-coding gene regions was 58.86%, slightly higher than in the whole genome (55.56%) and in other basidiomycetes [20].

Among the 245 tRNA genes, 10 tRNAs were pseudogenes and 141 tRNAs contained an intron. Forty-six of the 61 possible anti-codon tRNAs were found, corresponding to the codons of 20 amino acids. The anti-codon usage and codon usage are shown in File S1. Except for several codons, the usage frequencies of most codons were proportional to the numbers of the corresponding anti-codons (File S2). Given the lack of the other 18 anti-codons, we speculated that the anticodon repertoire in this genome was consistent with the normal wobble rules [23], which allow the following anticodon and codon pairings: I/ANN:NNU/NNC; GNN:NNU/NNC; UNN:NNA and CNN:NNG.

Table 1. The characteristics of assembly scaffold and genome of G. lucidum.
Scaffold characteristics
  Total number: 634
  Total length (bp): 39,945,170
  N50 (bp): 322,982
  N90 (bp): 50,570
  Max length (bp): 1,953,398
  Min length (bp): 1,004
Genome characteristics
  Genome assembly (Mb): 39.9
  Whole GC content (%): 55.56
  Coding sequence GC content (%): 58.86
  Number of protein-coding genes: 12,080
  Coding sequences ≥100 amino acids: 11,522
  Coding sequences/genome: 43.31%
  Average gene length (bp): 1,959
  Average coding sequence length (bp): 1,435
  Average exon length (nt): 230
  Average intron length (nt): 100
  Average number of exons per gene: 6.25
doi:10.1371/journal.pone.0036146.t001

Gene annotation

By homology search, we mapped our predicted proteins to Gene Ontology (GO), 5,893 (49%) of which were assigned to GO
terms, including 5,410, 1,738 and 4,034 genes mapped to the molecular function, cellular component and biological process categories, respectively. We also assigned 4,737 proteins to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The annotations with the KEGG, GO, InterPro, NCBI Clusters of Orthologous Groups of proteins (COG), NCBI non-redundant (nr), Pfam, SwissProt and TrEMBL protein databases are shown in File S1. The KEGG function classification is shown in Figure 1, in which "Carbohydrate Metabolism", "Xenobiotics Biodegradation and Metabolism" and "Amino Acid Metabolism" were the top 3 categories. Of the predicted genes in G. lucidum, up to 9,978, 6,436 and 9,981 showed significant similarity (BLASTP, cut-off e-value < 1e-7) to documented proteins in the NCBI nr database (Aug 2011), Swiss-Prot, and TrEMBL, respectively. As a result, about 83% of predicted proteins were similar to sequences in these public databases and only 2,094 genes were not similar to current public sequences, some of which might be G. lucidum specific genes. We further classified predicted genes into an orthologous group (single-copy in G. lucidum with an ortholog in at least one other species) or a paralogous group (multi-copy in G. lucidum). There were 4,689 orthologous genes and 5,510 paralogous genes by this definition. By NCBI COG mapping, 3,509 (29%) proteins were assigned to COGs (Figure 2). Similar to the KEGG annotation, some metabolism and biosynthesis categories in COG were highly enriched.

Compared with other published basidiomycetes, G. lucidum has many different biological characteristics, such as saprophytism and multiple triterpenoid and polysaccharide metabolites. To compare and find genes for G. lucidum specific characteristics, we performed comprehensive comparisons among G. lucidum and other published fungal genomes in the following sections.
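The similarity counts above depend on applying an e-value cut-off to BLASTP hits. A minimal sketch of such a filter is given below, assuming the hits are available in BLAST's standard 12-column tabular format (`-outfmt 6`); the paper does not say which output format or scripts were used, and the sample rows and names here are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity
    evalue: float


def parse_outfmt6(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse BLAST tabular output (default 12 columns of -outfmt 6).

    Column 1 = query id, 2 = subject id, 3 = percent identity, 11 = e-value.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        yield Hit(cols[0], cols[1], float(cols[2]), float(cols[10]))


def queries_with_hits(hits, max_evalue=1e-7, min_identity=0.0):
    """Return the set of query ids with at least one hit passing both cut-offs."""
    return {
        h.query
        for h in hits
        if h.evalue < max_evalue and h.identity >= min_identity
    }


# Hypothetical rows, for illustration only.
SAMPLE = [
    "gene1\tsp|P1\t45.0\t200\t100\t2\t1\t200\t1\t200\t3e-20\t150",
    "gene2\tsp|P2\t28.0\t90\t60\t1\t1\t90\t5\t94\t0.002\t35",
    "gene3\tsp|P3\t33.0\t120\t70\t3\t1\t120\t1\t120\t5e-9\t60",
]

similar = queries_with_hits(parse_outfmt6(SAMPLE))
print(sorted(similar))

# Share of predicted genes with a public match, from the counts in the text.
TOTAL_GENES = 12_080
NO_MATCH = 2_094
share = 100.0 * (TOTAL_GENES - NO_MATCH) / TOTAL_GENES
print(f"{share:.1f}% of predicted genes have a public match")
```

Tightening `max_evalue` or raising `min_identity` gives a stricter search; with the counts quoted in the text, the share of genes with a public match comes to roughly 83%, as stated.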
Comparative genomics analysis of KEGG annotation

To make a comprehensive comparison of KEGG annotations in fungi, KEGG pathway mapping was performed in 13 basidiomycetes and 5 ascomycetes (File S1). To facilitate comparison and show results in one table, we only showed the results of 8 basidiomycetes (7 in Agaricomycotina and 1 in Ustilaginomycotina) and 2 representative ascomycetes in the following analyses. In the second layer of KEGG pathway terms, we found that fungi in Agaricomycotina (including G. lucidum) had many more genes in each pathway than other fungi (Figure 1 and File S1). G. lucidum had relatively more genes in several pathways of metabolism and biosynthesis, such as "Metabolism of Terpenoids and Polyketides", "Metabolism of Other Amino Acids" and "Xenobiotics Biodegradation and Metabolism". In the third layer of KEGG, under the "Xenobiotics Biodegradation and Metabolism" pathway category, we found that G. lucidum and other Agaricomycotina fungi had relatively more proteins involved in several degradation pathways, including those for aminobenzoate, bisphenol, dioxin and polycyclic aromatic hydrocarbon degradation (Table 2). There were about 190 genes involved in 3 of these degradation pathways in G. lucidum (Table 2). These results indicated that G. lucidum had strong degradation ability. In addition, we also observed that the "Metabolism of xenobiotics by cytochrome P450" and "Drug metabolism-cytochrome P450" sub-pathways had relatively more genes in G. lucidum.

In the fourth layer, G. lucidum had 27 KO terms with 1.5-fold more genes than other Agaricomycotina fungi (Table 3). Of them, the term K00490 (cytochrome P450) showed relatively more genes in G. lucidum. Since cytochrome P450 is a large group of enzymes involved in many important biosynthesis and metabolism pathways, we further identified and compared P450 genes at the genome level in these fungi. We found that the numbers of P450 genes in Agaricomycotina were much higher than those in other subphyla of Basidiomycota and in
Ascomycota (Table 4). G. lucidum had 222 putative P450 genes, which was the largest number among the 10 representative fungi and in the top 3 of all the 18 fungi we analyzed.

Under the "Metabolism of xenobiotics by cytochrome P450" pathway, we found that the glutathione S-transferases (GST, EC 2.5.1.18), a kind of well-known detoxification enzyme [24], were greatly enriched in G. lucidum compared with other fungi. According to the classification of Morel et al. [25], we investigated the GST distribution in six known classes (GTT1, GTT2, URE2p, Omega, EFBγ, MAK16) and a new class (GTE) in all fungi in this study. Under a relatively strict cutoff (BLASTP e-value < 1e-10 and identity > 30%), we found 39 GST genes in G. lucidum, which was the highest number of GST genes among all fungi we analysed. Notably, G. lucidum had 18 genes in the Omega subfamily, many more than other fungi (Table 4).

The pathway of triterpenes synthesis

The triterpenes have been reported to be of great importance in G. lucidum because of their significant roles in immune regulation and other biological activities [4–7]. In plants, there are two pathways to synthesize terpenoids: the mevalonate (MVA) pathway and the methylerythritol 4-phosphate/deoxyxylulose 5-phosphate (MEP/DOXP) pathway. It has been suggested that the MEP/DOXP pathway does not exist in fungi [8]. We checked the G. lucidum genes in the "terpenoid backbone biosynthesis (map00900)" pathway and found that the genes were distributed only in the MVA pathway; no gene existed in the MEP/DOXP pathway (File S2). Similar results were found in other basidiomycetes and ascomycetes.
These observations verified at the genome level that terpenoid backbone biosynthesis in fungi can only proceed through the MVA pathway. By integrating the MVA pathway in KEGG and plant triterpenoid saponin biosynthesis from the literature, we summarized the potential triterpenoid biosynthesis pathway in G. lucidum (Figure 3). The pathway contained 14 steps catalyzed by different enzymes. The first 11 steps are the common steps for terpenoid skeleton biosynthesis and the last 3 steps may be specific for different triterpenes in different species. We identified and summarized the enzymes of ganoderic acid (GA) biosynthesis from the G. lucidum genome in Table 5, which includes 6 putative UDP-glycosyltransferase (UGT) genes (Table 5).

Interestingly, we observed a fusion gene in the triterpenes biosynthesis pathway in 12 Basidiomycete fungi except for L. bicolor. The N-terminal of the protein was similar to the enzyme K01760 (KEGG ID, cystathionine beta-lyase, metC), while the C-terminal was similar to another enzyme, K00869 (KEGG ID, mevalonate kinase, MVK), which is an enzyme in the triterpenes biosynthesis pathway (Figure 4A). These proteins are referred to as metC-MVKs in the following. In all other species except basidiomycetes, no such protein matched the two enzymes at the same time. By multiple sequence alignment of the fusion protein in basidiomycetes, the average length of metC-MVKs was ~886 aa, about half of which matched K01760 and half K00869. The metC-MVK protein was the only protein homologous to K00869 in our analyzed basidiomycetes, so it should be involved in terpenoid backbone biosynthesis, functioning as K00869 does in other species. We also noticed that this metC-MVK gene was the only gene whose best hit was K01760. In addition, seven of the 12 Basidiomycota fungi had a conserved 16-amino-acid insertion sequence in the middle of the MVK region of the metC-MVK gene (Figure 4B).

Phylogeny of G. lucidum and multigene families

The phylogenetic tree constructed by concatenated sequences alignments
showed that G. lucidum was close to another Polyporales fungus, Fomitopsis pinicola, in the evolutionary relationship among all our analyzed fungi (Figure 5A). In the all-to-all BLASTP analysis, 9,278 predicted proteins of G. lucidum showed high sequence similarity with those of F. pinicola (BLASTP, cut-off e-value < 1e-7). Next, 9,013 and 8,872 predicted proteins showed significant sequence similarity to those of Gloeophyllum trabeum and Stereum hirsutum, which were all in Polyporales.

In order to investigate gene family expansion in G. lucidum, we performed analyses of multi-gene families, which were generated from proteins in 8 Agaricomycotina species. In total, 10,720 gene families (File S1) containing at least two members were generated using the Tribe-MCL tool, of which 5,947 families had at least one G. lucidum gene and 1,487 families had at least two G. lucidum genes. The largest gene family had 517 genes and 126 of them were G. lucidum genes. Of 3,540 lineage-specific gene families, 287 families were G. lucidum specific (File S1). The number was very similar to that of G. trabeum and much lower than in other basidiomycetes. L. bicolor had the largest number (947) of lineage-specific gene families, which may be related to its having the biggest genome size among our analyzed basidiomycetes. The distributions of genes with different copies or species-specific are shown in Figure 5B.

Figure 1. The KEGG function annotation of G. lucidum. Distribution of genes in different KEGG categories. doi:10.1371/journal.pone.0036146.g001

Besides the lineage-specific gene families, the evolutionary changes in the size of each gene family were analyzed using the CAFE program. As a result, we found that among the 7,180 non-lineage-specific gene families for G. lucidum, 636 were expanded and 994 had undergone contraction. The function of the most abundant gene family was uncharacterized for lack of available annotation, while genes in the second most abundant gene family encoded proteins with a P450 domain (File S1). The expanded and
contracted gene families and their annotations are shown in File S1.

Figure 2. The COG function annotation of G. lucidum. Distribution of genes in different COG function classifications. doi:10.1371/journal.pone.0036146.g002

G. lucidum has multiple copy het-like genes

Among the 287 G. lucidum specific gene families (File S1), the largest G. lucidum specific gene family had 101 genes, and 89 of them had the HET (heterokaryon incompatibility protein) domain, which is related to vegetative (or heterokaryon) incompatibility (VI). It is surprising that so many het-like genes were found in G. lucidum, while few het genes were reported in other fungi. In the PFAM database, there are three vegetative incompatibility related domains: HET, Het-c and HET-s. Since the HET-related studies were mostly reported in the fungi P. anserina and N. crassa, we added them to our list of analyzed fungi to identify the HET genes. Thus, in total, we scanned 7 ascomycetes and 13 basidiomycetes for genes with the three het-related domains. The results are shown in File S1; only P. anserina had one HET-s domain. The number of Het-c genes in each species was mostly 0–2, and the highest was four, while the number of genes with the HET domain varied from 0 to 126. It seems that het-c and HET-s are comparatively conserved. G. lucidum had two genes with the Het-c domain and 96 genes with the HET domain, many more than other basidiomycetes and most ascomycetes.

In the comparison, we also observed that there were 62 and 126 HET-like genes in N. crassa and P. anserina, in which the numbers of het genes were reported as 11 and 9, respectively [26]. Thus, some of the het-like genes may play roles in functions other than VI, such as mat a/A for mating in N. crassa and het c for ascospore formation in P. anserina [26,27]. Therefore, VI may be a complex system in which more than one locus is involved. Since one of the het-c loci in P. anserina is similar to the glycolipid transfer protein (GLTP) [27], the GLTP domain was also scanned in this study. We found two genes
G_lucidum_10005152 and G_lucidum_10009654 with a GLTP domain, which also might be het-c genes.

These HET genes in G. lucidum encoded proteins with an average length of 2,686 amino acids and were not uniformly spread across the genome. The 98 genes were located on 45 scaffolds (of 634 scaffolds in total). Of them, 13 scaffolds had more than two HET genes and three scaffolds had more than 10 HET genes, suggesting that the expansion of HET genes might have occurred through tandem duplications. Besides the HET domain, some HET genes also had other domains, such as adh_short, Aldo_ket_red, ICMT, Nup96, p450, SUR7, and WD40.

Function annotation of putative CAZymes

CAZy is a carbohydrate-active enzymes (CAZymes) database (/) [28], which classifies the CAZymes into 5 major modules: Glycoside Hydrolases (GH), Glycosyl Transferases (GT), Polysaccharide Lyases (PL), Carbohydrate Esterases (CE), and Carbohydrate-Binding Modules (CBM). We mapped our analyzed fungal genomes to CAZy to study the members and features of these carbohydrate-active enzymes. The results revealed that the gene numbers in the 5 major modules of CAZymes were similar among Agaricomycotina fungi, while they were much lower in Ustilaginomycotina and Ascomycota fungi. G. lucidum possessed a wide spectrum of CAZymes responsible for the biosynthesis, degradation and modification of oligo- and polysaccharides, and of glycoconjugates (Table 6). The GHs and CEs in G. lucidum were slightly above the average count, while GTs, CBMs and PLs were below the Agaricomycotina average (Table 6).

Function annotation of putative FOLymes

To assess the degradation at the genomic level, proteins of G.
lucidum were aligned to proteins in the FOLy (Fungal Oxidative Lignin enzymes) database, which collects and classifies enzymes involved in lignin catabolism. The FOLymes mainly comprise two families: lignin oxidases (LO families) and lignin-degrading auxiliary enzymes (LDA families), which generate H2O2 for peroxidases. G. lucidum contained a total of 48 FOLymes (24 genes in LO families and 24 genes in LDA families, Table 7), more than the brown-rot fungi F. pinicola and G. trabeum and the fungi without ligninolytic activity, such as Malassezia globosa, Pyrenophora teres and Saccharomyces cerevisiae. In contrast, it had fewer FOLymes than the coprophilic fungus Coprinopsis cinerea (59 FOLymes) and the white-rot fungus Pleurotus ostreatus (72 FOLymes). However, G. lucidum had the largest number of lignin oxidases (LO families). The LO families can be further divided into 3 subfamilies, which are laccases (LO1), lignin peroxidases, manga-

Table 2. The gene distribution of fungi in pathway "Xenobiotics Biodegradation and Metabolism".
Pathway in KEGG  Pathway annotation  Agaricomycotina  G.luc F.pin P.chr P.ost L.bic C.cin M.glo P.tri S.cer
00362  Benzoate degradation  3347293130212677316
00627  Aminobenzoate degradation  190*178157149189831432120238
00364  Fluorobenzoate degradation  6444414181
00625  Chloroalkane and chloroalkene degradation  718578108674339147927
00361  Chlorocyclohexane and chlorobenzene degradation  25221426254124402
00623  Toluene degradation  1314812154124281
00622  Xylene degradation  1101111000
00633  Nitrotoluene degradation  0010131110
00642  Ethylbenzene degradation  1123111288113419
00643  Styrene degradation  14111489573234
00791  Atrazine degradation  3331211071
00930  Caprolactam degradation  141281223985255
00351  1,1,1-Trichloro-2,2-bis(4-chlorophenyl)ethane (DDT) degradation  108513900090
00363  Bisphenol degradation  196*183165194186841291816227
00621  Dioxin degradation  35*212724129120190
00626  Naphthalene degradation  62645758413235119417
00624  Polycyclic aromatic hydrocarbon degradation  187*1481511361505999111328
00980  Metabolism of xenobiotics by cytochrome P450  44*4145303522315389
00982  Drug metabolism - cytochrome P450  38*31322526272663210
00983  Drug metabolism - other enzymes  1414121416152061712
* represents G. lucidum having relatively more genes than the others.
Abbreviations: G.luc: Ganoderma lucidum; F.pin: Fomitopsis pinicola; P.chr: Phanerochaete chrysosporium; : Schizophyllum commune; P.ost: Pleurotus ostreatus; L.bic: Laccaria bicolor; C.cin: Coprinopsis cinerea; M.glo: Malassezia globosa; P.tri: Pyrenophora teres; S.cer: Saccharomyces cerevisiae.
doi:10.1371/journal.pone.0036146.t002

Table 3. KO families showing relatively more genes in the G. lucidum genome as compared to other Basidiomycota fungi.
Pathway in KEGG  KO description  G.luc F.pin P.chr P.ost L.bic C.cin
K00490  CYP4F; cytochrome P450, family 4, subfamily F  48411220142016
K00480  E1.14.13.1; salicylate hydroxylase  3420272311811
K01046  E3.1.1.3; triacylglycerol lipase  26211410191413
K01279  TPP1, CLN2; tripeptidyl-peptidase I  2431105772
K04125  E1.14.11.13; gibberellin 2-oxidase  2217441011
K10866  RAD50; DNA repair protein RAD50  22142112768
K01183  E3.2.1.14; chitinase  2115101312119
K01423  E3.4.-.-;  198172612512181451121  K00140  malonate-semialdehyde dehydrogenase/methylmalonate-semialdehyde dehydrogenase
K01528  DNM; dynamin GTPase  11434433
K00218  E1.3.1.33; protochlorophyllide reductase  8642235
K01190  lacZ; beta-galactosidase  8234500
K03942  NDUFV1; NADH dehydrogenase (ubiquinone) flavoprotein 1  7212222
K06148  ABCC-BAC; ATP-binding cassette, subfamily C, bacterial  7253246
K01044  E3.1.1.1; carboxylesterase  6024435
K02831  RAD53; ser/thr/tyr protein kinase RAD53  6110221
K00119  E1.1.99.-;  5021203
K00129  E1.2.1.5; aldehyde dehydrogenase (NAD(P)+)  5222201
K01082  E3.1.3.7; 3'(2'),5'-bisphosphate nucleotidase  5314111
K09202  regulatory protein SWI5  5211321
K09553  STIP1; stress-induced-phosphoprotein 1  5011015
K00135  E1.2.1.16; succinate-semialdehyde dehydrogenase (NADP+)  4111111
K01539  ATP1A; sodium/potassium-transporting ATPase subunit alpha  4410120
K02133  ATPeF1B, ATP5B; F-type H+-transporting ATPase subunit beta  4111111
K10590  TRIP12; E3 ubiquitin-protein ligase TRIP12  4111111
K12388  SORT1; sortilin  4111121
K09753  CCR; cinnamoyl-CoA reductase  3010002
The abbreviations of species are the same as in Table 2. The number of genes in G. lucidum for each KO is 1.5-fold more than the average of the other Basidiomycota fungi.
doi:10.1371/journal.pone.0036146.t003

Table 4. The gene distribution of fungi in the P450 family and the GST family.
G.luc F.pin P.chr P.ost L.bic C.cin M.glo P.tri S.cer
P450  222*19615412016011314312976
GST
EFBy  1111121213
GTE  410597314020
GTT1  1102102011
GTT2  84363116001
MAK16  1111101111
omega  18*788835143
URE2p  6890112121
TOTAL  39*32272722203151110
* represents G. lucidum having more genes than any of the others.
The abbreviations of species are the same as in Table 2.
doi:10.1371/journal.pone.0036146.t004
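The selection rule stated in the footnote of Table 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `enriched_kos`, the dictionary layout, and the placeholder KO labels and counts are all hypothetical, and reading "1.5 fold more than the average" as a ratio of at least 1.5 is an interpretation.

```python
from statistics import mean


def enriched_kos(counts, focal="G.luc", fold=1.5):
    """Return KO ids whose focal-species gene count is at least
    `fold` times the mean count of the other species.

    `counts` maps a KO id to a dict of {species abbreviation: gene count}.
    The threshold semantics (>= fold * mean) are an assumption.
    """
    selected = []
    for ko, per_species in counts.items():
        others = [n for sp, n in per_species.items() if sp != focal]
        if not others:
            continue  # nothing to compare against
        focal_count = per_species.get(focal, 0)
        if focal_count > 0 and focal_count >= fold * mean(others):
            selected.append(ko)
    return selected


# Hypothetical counts for illustration only (NOT values from Table 3).
example = {
    "KO_A": {"G.luc": 12, "F.pin": 4, "P.chr": 5, "P.ost": 6},
    "KO_B": {"G.luc": 3, "F.pin": 3, "P.chr": 4, "P.ost": 2},
    "KO_C": {"G.luc": 5, "F.pin": 4, "P.chr": 4, "P.ost": 4},
}

print(enriched_kos(example))
```

With these placeholder counts only `KO_A` passes, since 12 is at least 1.5 times the mean of the other species (5), while `KO_B` and `KO_C` fall below their thresholds. A stricter reading of the footnote (for example, a count exceeding the mean by 150%) would only require changing `fold`.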