QTL-seq Workflow Documentation
Establishing a Platelet Transcriptome Sequencing Workflow
1. Sample collection: Collect an adequate volume of blood from healthy volunteers or patients. Use an anticoagulant (such as EDTA) to prevent the blood from clotting.
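Molecular Markers for Quantitative Traits (Lecture Notes on the Principles and Methods of QTL Mapping)
Most agronomically and economically important traits in crops, such as yield, quality, growth duration, and stress resistance, are quantitative traits.
Unlike qualitative traits, quantitative traits are controlled by many genes, have a complex genetic basis, and are easily influenced by the environment. They vary continuously, and there is no clear one-to-one correspondence between phenotype and genotype.
Genetic studies of quantitative traits are therefore very difficult.
For a long time, researchers could only rely on statistical methods: the polygenic system controlling a quantitative trait was studied as a whole, and its genetic characteristics were described by means and variances. The position and effect of individual genes could not be determined.
This limited breeders' ability to manipulate quantitative traits genetically.
The advent of molecular marker technology made it possible to study the genetic basis of quantitative traits in depth.
The position in the genome of a gene controlling a quantitative trait is called a quantitative trait locus (QTL).
Genetic linkage analysis with molecular markers can detect QTL; this is QTL mapping.
With markers linked to a QTL, the inheritance of that QTL can be tracked during breeding. This greatly strengthens the ability to manipulate quantitative traits and improves the accuracy and predictability of selecting superior genotypes.
QTL mapping is therefore a very important piece of basic research.
In 1988, Paterson et al. published the first paper that used an RFLP linkage map to map QTL in tomato.
Since then, as molecular marker technology has developed and molecular linkage maps have been built for many species, QTL research has boomed worldwide. The number of QTL papers published each year has grown almost exponentially (Fig. 5.1), which shows the vitality of the field.
QTL mapping is now under way in many important crops and is progressing rapidly.
This chapter introduces the principles and methods of QTL mapping.
Fig. 5.1 Number of papers on QTL research published internationally each year, 1986-1998. Data retrieved from the BIDS information system (UK).
Section 1. Primary Mapping of Quantitative Trait Genes
QTL mapping detects linkage between molecular markers (hereafter simply "markers") and QTL, and can also estimate the effects of the QTL.
The populations commonly used for QTL mapping are F2, BC (backcross), RI (recombinant inbred), and DH (doubled haploid) populations.
These are called primary populations.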
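The statement that QTL mapping detects marker-QTL linkage and estimates the QTL effect can be illustrated with single-marker analysis, the simplest form of the idea. The sketch below is only an illustration under stated assumptions: the population has two genotype classes per marker (as in a BC or DH population), the 0/1 genotype coding and the toy data are invented, and the LOD score uses the standard regression form LOD = (n/2) · log10(RSS_null / RSS_marker). It is not the method of any particular software package.

```python
import math


def single_marker_scan(genotypes, phenotypes):
    """Test one marker for association with a trait.

    genotypes: marker genotype per individual, coded 0 or 1 (two classes,
               as in a BC or DH population; this coding is an assumption).
    phenotypes: trait value per individual.
    Returns (effect, lod).
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    group0 = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    group1 = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    mean0 = sum(group0) / len(group0)
    mean1 = sum(group1) / len(group1)

    # Residual sum of squares without the marker (one overall mean) ...
    rss_null = sum((p - grand_mean) ** 2 for p in phenotypes)
    # ... and with the marker (one mean per genotype class).
    rss_marker = (sum((p - mean0) ** 2 for p in group0)
                  + sum((p - mean1) ** 2 for p in group1))

    effect = mean1 - mean0  # estimated effect of substituting genotype 1 for 0
    lod = (n / 2) * math.log10(rss_null / rss_marker)
    return effect, lod


# Toy data (invented): individuals with genotype 1 have larger trait values.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1]
phenotypes = [10.0, 11.0, 9.0, 10.0, 14.0, 15.0, 13.0, 14.0]
effect, lod = single_marker_scan(genotypes, phenotypes)
print(f"effect = {effect:.2f}, LOD = {lod:.2f}")
```

A marker far from any QTL would give class means that are nearly equal, so the two residual sums of squares would be similar and the LOD score close to zero. The sketch assumes both genotype classes are present and that the marker does not fit the data perfectly (RSS_marker > 0).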
DOI: 10.3724/SP.J.1006.2022.14025
Development of linkage InDel markers of the white petal gene based on whole-genome re-sequencing data in Brassica napus L.
WANG Rui1,2, CHEN Xue1,2, GUO Qing-Qing1,2, ZHOU Rong1,2, CHEN Lei1,2, and LI Jia-Na1,2,*
1 College of Agronomy and Biotechnology, Southwest University, Chongqing 400715, China; 2 Chongqing Engineering Research Center for Rapeseed, Chongqing 400715, China
Abstract: Insertion/deletion (InDel) polymorphisms are a form of genetic variation that is widely distributed across the genome. To date, however, InDel markers linked to the white petal gene in B. napus have not been reported. In this study, an F2 population was constructed from a cross between the yellow-petal doubled haploid (DH) pure line Y05 and the white-petal pure line W01. Equal amounts of leaf DNA from 30 extreme white-petal plants and 30 extreme pure-yellow-petal plants of the F2 population were mixed to form two progeny bulks, and the parents and the two bulks were re-sequenced at 30× depth. With the French B. napus cultivar Darmor-bzh as the reference sequence, the QTL-seq and PoPoolation2 workflows were combined to identify the candidate interval of the white petal gene. Both methods located the gene in the 52–54 Mb interval of chromosome C03 of Darmor-bzh. InDel variant sites in the candidate interval were visualized with the Integrative Genomics Viewer (IGV), and InDel primers were designed from the sequence of the interval using Vector and Blast. Eight InDel markers linked to and co-segregating with the white petal gene were identified by polyacrylamide gel electrophoresis (PAGE). These results provide a basis and a working approach for fine mapping of the white petal gene, for marker-assisted selection, and for developing functional markers for the gene.
Keywords: Brassica napus L.; re-sequencing; white petal gene; InDel markers
Brassica napus (AACC) belongs to the genus Brassica of the family Cruciferae and arose from Brassica rapa (AA) and …
This study was supported by the Program of Introducing Talents of Discipline to Universities (111 Project) (B12006).
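The bulked comparison described in the abstract is usually expressed in QTL-seq through the SNP-index (the fraction of reads at a SNP that carry a given allele) and the Δ(SNP-index) between the two bulks, averaged in sliding windows. The sketch below is a minimal illustration and not the pipeline the authors ran: the read counts, window size, step, and the choice of which allele is counted are all invented assumptions.

```python
def snp_index(allele_reads, depth):
    """Fraction of reads at one SNP that carry the counted allele."""
    return allele_reads / depth


def delta_snp_index(sites):
    """Per-site difference in SNP-index between the two bulks.

    sites: list of (position, reads_white, depth_white, reads_yellow, depth_yellow),
           where reads_* counts reads carrying the same (assumed) parental allele.
    """
    return [
        (pos, snp_index(rw, dw) - snp_index(ry, dy))
        for pos, rw, dw, ry, dy in sites
    ]


def window_means(deltas, window, step):
    """Average the per-site values in windows [start, start + window)."""
    if not deltas:
        return []
    last_pos = max(pos for pos, _ in deltas)
    means = []
    start = 0
    while start <= last_pos:
        values = [d for pos, d in deltas if start <= pos < start + window]
        if values:  # skip windows that contain no SNP
            means.append((start, sum(values) / len(values)))
        start += step
    return means


# Invented read counts: SNPs at 200 and 300 differ strongly between the bulks.
sites = [
    (100, 15, 30, 14, 28),
    (200, 27, 30, 6, 30),
    (300, 30, 30, 3, 30),
    (900, 16, 32, 15, 30),
]
deltas = delta_snp_index(sites)
windows = window_means(deltas, window=500, step=500)
for start, mean_delta in windows:
    print(f"window starting at {start}: mean delta SNP-index = {mean_delta:.3f}")
```

Windows whose mean Δ(SNP-index) departs clearly from zero are candidate regions; in practice the threshold comes from a simulated or permuted null distribution, which this sketch leaves out.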
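QTL IciMapping 3.0: A Brief Tutorial on QTL Mapping
Zhang Qian (张茜), Chinese Academy of Agricultural Sciences, 2012-06-14

Main steps
• Data preparation
• Create a new project
• Import data
• Build the map
• QTL mapping

Preparing the data
• .map format: rename the .txt extension to .map (do not change the header information). A map file contains three parts: General Information, Marker Types, and Information for Chromosomes and Markers.
• Main values to edit: 7 denotes an F2 population; 1 is usually left unchanged; Marker space type can be 1 or 2, as long as the data are consistent with it.
• Marker Types: how the marker band patterns are scored.
• Information for Chromosomes and Markers: these values give the chromosome (group) each marker is on; before a map has been built they are all 0.

Building the map
• Choose File > New Project to create a project, then name it and choose the save path.
• Choose File > *map to import the prepared map-format file; opening it completes the import.
• Click Grouping. The groups appear in this panel; click a group to see the markers it contains, and right-click a marker to move or delete it.
• After grouping, click Ordering to convert the groups into chromosomes, then click this button to finish preparing for map construction. The Map icon on the toolbar turns blue, and the map can now be drawn.
• Click the Map button to display the map (right). Click to show the map of the next chromosome, or click to show the whole map. Save stores the figure in various formats.

QTL mapping
• Data preparation: make a copy of the map-building result F2bip (in the project-map-result folder), then open the copy as a text file.
• A bip file contains five parts: General Information, Information for Chromosomes and Markers, Linkage map (Marker name followed by position or the interval length), Marker Type, and Phenotypic Data.
• Edit the data: change 0 to 1.
• Choose File > Open File > *bip and open the edited, saved bip file.
• Select ICIM-ADD and add it; the settings in the box below are usually left at their defaults. The Start button changes from grey to black; click it to run the mapping.
• When Start has finished, click ADD to display the result plots: the additive effect plot, the dominance effect plot, all chromosomes, adding a LOD threshold line, and the next chromosome.
• Under Graph, choose either linkage map and LOD (top) or linkage map and QTL (bottom).
• Map results are stored in the map directory; QTL results are stored in the BIP directory.

Additional notes
• QTL IciMapping builds the map under *map (in the Open File submenu) and performs QTL mapping under *bip (in the Open File submenu).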
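Whole-genome re-sequencing sequences the genomes of different individuals of a species whose genome sequence is already known, and analyzes the differences among individuals or populations on that basis.
With whole-genome re-sequencing, researchers can rapidly survey and screen germplasm resources, find large numbers of genetic variants, carry out genetic and evolutionary analyses, and predict candidate genes for important traits.
As sequencing costs fall and more species gain reference genomes, whole-genome re-sequencing has become a fast and effective method for plant and animal breeding and for population evolution research.
Reduced-representation genome sequencing performs high-throughput sequencing of the DNA associated with restriction endonuclease recognition sites.
RAD-seq (Restriction-site Associated DNA Sequencing) and GBS (Genotyping-by-Sequencing) are currently the most widely used reduced-representation techniques. They greatly reduce genome complexity, are simple to perform, do not depend on a reference genome, and can rapidly identify high-density SNPs for genetic and evolutionary analysis and for predicting candidate genes for important traits.
Reduced-representation sequencing is particularly suited to studies with large sample sizes, and it can lay a solid foundation for deeper data mining with whole-genome re-sequencing.
Whole-genome re-sequencing and reduced-representation sequencing can be widely applied to variant detection, genetic map construction, functional gene mining, and population evolution studies, and they have great research and industrial value.
[Figure: product roadmap. Plant and animal re-sequencing: genomic DNA, library construction, sequencing. Variant detection based on whole-genome re-sequencing (SNP/InDel/SV/CNV/transposons) or on reduced-representation sequencing (SNP detection/SSR detection), yielding effective SNPs. Family populations: genetic map, QTL mapping (multiple traits), BSA trait mapping (single trait). Natural populations: population evolution (based on reduced-representation or whole-genome re-sequencing), genome-wide association study (GWAS), functional gene mining.]
Plant and Animal Re-sequencing
Overview
[Figure: workflow. Genomic DNA, 350 bp small-fragment library, HiSeq PE150 sequencing, data quality control, alignment to the reference genome, SNP detection, annotation, and statistics.]
Whole-genome re-sequencing of individuals or populations of a species, followed by difference analysis, yields abundant genetic polymorphism information, including SNPs, InDels, SVs, CNVs, PAVs, and transposons. This information is used to build a genetic polymorphism database and lays the foundation for later work on evolutionary relationships and functional gene mining.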
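How reduced-representation sequencing lowers genome complexity can be shown with an in-silico digest: only the short stretches flanking restriction sites are kept as tags. The sketch below is a toy illustration, not a real RAD-seq or GBS pipeline. The recognition motif GAATTC (EcoRI), the flank length, and the example sequence are assumptions chosen for the demonstration; the source does not name an enzyme.

```python
def find_sites(sequence, motif):
    """Return the start positions (0-based) of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


def rad_tags(sequence, motif, flank):
    """Return (site position, upstream flank, downstream flank) for each site.

    Flanks are cut short at the ends of the sequence.
    """
    tags = []
    for pos in find_sites(sequence, motif):
        upstream = sequence[max(0, pos - flank):pos]
        downstream = sequence[pos + len(motif):pos + len(motif) + flank]
        tags.append((pos, upstream, downstream))
    return tags


# Invented 26 bp "genome" with two GAATTC sites (motif choice is an assumption).
genome = "ACGTGAATTCTTAGGCCGAATTCAAT"
sites = find_sites(genome, "GAATTC")
tags = rad_tags(genome, "GAATTC", flank=4)
for pos, upstream, downstream in tags:
    print(pos, upstream, downstream)
```

Only the bases inside the tags would be sequenced, so the sequenced fraction of a real genome depends on how often the chosen motif occurs and on the read length.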
Shandong Agricultural Sciences 2022, 54(1): 152-156
DOI: 10.14083/j.issn.1001-4942.2022.01.023
Received: 2021-08-26
Funding: Peanut Innovation Team Project of the Shandong Modern Agricultural Industry Technology System (SDAIT-04-03); Shandong Agricultural Improved Variety Project (2020LZGC001); Shandong Key Research and Development Program (2019GNC106002)
About the author: Wu Zixuan (born 2001), female, from Zhenjiang, Jiangsu; undergraduate student in Plant Science and Technology. E-mail: 1258141465@qq.com
Corresponding author: Liu Fengzhen (born 1966), female, from Dong'e, Shandong; doctoral supervisor working on peanut genetics and breeding. E-mail: liufz@sdau.edu.cn
Research Progress on Testa Color of Peanut
Wu Zixuan1, Xue Qiqin1,2, Yang Hui1, Liu Fengzhen1
(1. College of Agronomy, Shandong Agricultural University / State Key Laboratory of Crop Biology, Taian 271018, China; 2. Weifang University of Science and Technology, Shouguang 262700, China)
Abstract: The testa of peanut (Arachis hypogaea L.) develops from the integument and has three layers: the outer epidermis is one layer of thick-walled cells, the middle layer consists of several layers of thin-walled cells, and the inner epidermis is one layer of thin-walled cells. The pigments of the testa are mainly located in one or two layers of epidermal cells. Testa color is an important trait that determines the commercial and health value of peanut. This paper reviews research progress on the types of peanut testa color, the nutritional benefits of differently colored testa, color change and pigment deposition during testa development, the inheritance of testa color, and the mapping of related genes, and discusses prospects for future research.
Keywords: peanut; testa color; pigment
CLC number: S565.2; Document code: A; Article ID: 1001-4942(2022)01-0152-05
Peanut, also known as changshengguo ("longevity nut"), is a leguminous plant.
Package 'pedigree'
October 14, 2022

Type: Package
Title: Pedigree Functions
Version: 1.4.2
Date: 2022-08-13
Depends: Matrix
Imports: methods, HaploSim (>= 1.8.4), reshape
Description: Pedigree related functions.
License: GPL (>= 2)
Author: Albart Coster [aut, cre]
Maintainer: Albart Coster <**********************>
NeedsCompilation: yes
Repository: CRAN
Date/Publication: 2022-08-13 20:50:02 UTC

R topics documented:
    pedigree-package
    add.Inds
    blup
    calcG
    calcInbreeding
    countGen
    countOff
    gblup
    makeA
    makeAinv
    orderPed
    trimPed


pedigree-package    Package to deal with pedigree data

Description
    Package with functions to analyse and transform pedigree data. A pedigree is a data.frame where the first column contains an ID, and the second and third columns contain the IDs of the first and second parent.

Author(s)
    Albart Coster: <********************>

See Also
    trimPed, orderPed, countGen, makeA, makeAinv, calcInbreeding, add.Inds


add.Inds    Function to add missing individuals to a pedigree

Description
    Function add.Inds() adds missing individuals to a pedigree and returns the complete pedigree as a data.frame with the same headers as the original pedigree. Remember to check for errors beforehand with function errors.ped. Unknown parents should be coded as NA.

Usage
    add.Inds(ped)

Arguments
    ped    data.frame with three columns: id, id parent1, id parent2

Value
    data.frame of three columns with identical header as input.

Author(s)
    Albart Coster, ********************

See Also
    orderPed

Examples
    ID <- 3:5
    DAM <- c(1, 1, 3)
    SIRE <- c(2, 2, 4)
    pedigree <- data.frame(ID, DAM, SIRE)
    pedigree <- add.Inds(pedigree)


blup    Function to calculate breeding values using an animal model

Description
    Fit an animal model to data, using a given variance ratio ($\alpha = \sigma_e^2 / \sigma_a^2$). Calculate the inverse of the additive genetic relationship matrix using function makeAinv() of this package.

Usage
    blup(formula, ped, alpha, trim = FALSE)

Arguments
    formula    formula of the model; do not include the random effect due to animal (generally ID).
    ped        data.frame with columns corresponding to ID, SIRE, DAM and the columns in the formula.
    alpha      Variance ratio ($\sigma_e^2 / \sigma_a^2$).
    trim       If TRUE, trims the pedigree using the available phenotype data using function trimPed.

Value
    Vector of solutions to the model, including random animal effects.

See Also
    SamplePedigree, gblup, makeAinv, blup

Examples
    example(gblup)
    sol <- blup(P ~ 1, ped = ped, alpha = 1 / h2 - 1)


calcG    Function to calculate a relationship matrix from marker data (usually allele count data), G matrix

Description
    Function to calculate a relationship matrix from marker data. Option to return the inverse of the matrix. The inverse is calculated using the Matrix package.

Usage
    calcG(M, data = NULL, solve = FALSE)

Arguments
    M        Matrix of marker genotypes, usually the count of one of the two SNP alleles at each marker (0, 1, or 2).
    data     Optional logical vector which can tell of which individuals we have phenotypes.
    solve    Logical; if TRUE then the function returns the inverse of the relationship matrix.

Value
    Matrix of class dgeMatrix.

See Also
    SamplePedigree, gblup, makeAinv, blup

Examples
    example(gblup)
    G <- calcG(M)
    Ginv <- calcG(M, solve = TRUE)


calcInbreeding    Calculates inbreeding coefficients for individuals in a pedigree

Description
    Calculates inbreeding coefficients of individuals in a pedigree.

Usage
    calcInbreeding(ped)

Arguments
    ped    data.frame with three columns: id, id parent1, id parent2

Value
    Logical.

Examples
    id <- 1:6
    dam <- c(0, 0, 1, 1, 4, 4)
    sire <- c(0, 0, 2, 2, 3, 5)
    ped <- data.frame(id, dam, sire)
    (F <- calcInbreeding(ped))


countGen    Count generation number for each individual in a pedigree

Description
    Counts generation number for individuals in a pedigree.

Usage
    countGen(ped)

Arguments
    ped    data.frame with three columns: id, id parent1, id parent2

Value
    Numeric vector

Examples
    id <- 1:5
    dam <- c(0, 0, 1, 1, 4)
    sire <- c(0, 0, 2, 2, 3)
    ped <- data.frame(id, dam, sire)
    (gens <- countGen(ped))


countOff    Function that counts the number of offspring (and following generations) for each individual in a pedigree

Description
    Function to count the number of offspring for each individual in a pedigree. With loops, offspring of later generations will be counted several times.

Usage
    countOff(ped)

Arguments
    ped    data.frame with three columns: id, id parent1, id parent2

Value
    Numeric vector with number of offspring for each individual in the pedigree.

Author(s)
    Albart Coster

Examples
    example(countGen)
    countOff(ped)


gblup    Function to calculate breeding values using an animal model and a relationship matrix calculated from the markers (G matrix)

Description
    Fit an animal model to data, using a given variance ratio ($\alpha = \sigma_e^2 / \sigma_a^2$). Calculate the genetic relationship matrix using the function calcG of this package.

Usage
    gblup(formula, data, M, lambda)

Arguments
    formula    formula of the model; do not include the random effect due to animal (generally ID).
    data       data.frame with columns corresponding to ID and the columns mentioned in the formula.
    M          Matrix of marker genotypes, usually the count of one of the two SNP alleles at each marker (0, 1, or 2).
    lambda     Variance ratio ($\sigma_e^2 / \sigma_a^2$).

Value
    Vector of solutions to the model, including random animal effects.

See Also
    SamplePedigree, gblup, makeAinv, blup

Examples
    ## Example Code from SampleHaplotypes
    hList <- HaploSim::SampleHaplotypes(nHaplotypes = 20, genDist = 1, nDec = 3, nLoc = 20) ## create objects
    h <- HaploSim::SampleHaplotype(H0 = hList[[1]], H1 = hList[[2]], genDist = 1, nDec = 3)
    ## code from the Example SamplePedigree
    ID <- 1:10
    pID0 <- c(rep(0, 5), 1, 1, 3, 3, 5)
    pID1 <- c(rep(0, 4), 2, 2, 2, 4, 4, 6)
    ped <- data.frame(ID, pID0, pID1)
    phList <- HaploSim::SamplePedigree(orig = hList, ped = ped)
    ## own code
    h2 <- 0.5
    ped <- phList$ped
    hList <- phList$hList
    qtlList <- HaploSim::ListQTL(hList = hList, frqtl = 0.1, sigma2qtl = 1)
    qtl <- tapply(unlist(qtlList),
                  list(rep(names(qtlList), times = unlist(lapply(qtlList, length))),
                       unlist(lapply(qtlList, function(x) seq(1, length(x))))),
                  mean, na.rm = TRUE)
    qtl <- reshape::melt(qtl)
    names(qtl) <- c("POS", "TRAIT", "a")
    HH <- HaploSim::getAll(hList, translatePos = FALSE)
    rownames(HH) <- sapply(hList, function(x) x@hID)
    QQ <- HH[, match(qtl$POS, colnames(HH))]
    g <- QQ
    ped$G <- with(ped, g[match(hID0, rownames(g))] + g[match(hID1, rownames(g))])
    sigmae <- sqrt(var(ped$G) / h2 - var(ped$G))
    ped$P <- ped$G + rnorm(nrow(ped), 0, sigmae)
    M <- with(ped, HH[match(hID0, rownames(HH)), ] + HH[match(hID1, rownames(HH)), ])
    rownames(M) <- ped$ID
    sol <- gblup(P ~ 1, data = ped[, c("ID", "P")], M = M, lambda = 1 / h2 - 1)


makeA    Makes the A matrix for a part of a pedigree

Description
    Makes the A matrix for a part of a pedigree and stores it in a file called A.txt.

Usage
    makeA(ped, which)

Arguments
    ped      data.frame with three columns: id, id parent1, id parent2
    which    Logical vector specifying between which individuals the additive genetic relationship is required. Goes back through the whole pedigree but only for the subset of individuals.

Value
    Logical.

Examples
    id <- 1:6
    dam <- c(0, 0, 1, 1, 4, 4)
    sire <- c(0, 0, 2, 2, 3, 5)
    ped <- data.frame(id, dam, sire)
    makeA(ped, which = c(rep(FALSE, 4), rep(TRUE, 2)))
    A <- read.table("A.txt")
    if (file.exists("A.txt"))
      file.remove("A.txt")


makeAinv    Makes inverted A matrix for a pedigree

Description
    Makes the inverted A matrix for a pedigree and stores it in a file called Ainv.txt.

Usage
    makeAinv(ped)

Arguments
    ped    data.frame with three columns: id, id parent1, id parent2

Value
    Logical.

Examples
    id <- 1:6
    dam <- c(0, 0, 1, 1, 4, 4)
    sire <- c(0, 0, 2, 2, 3, 5)
    ped <- data.frame(id, dam, sire)
    makeAinv(ped)
    Ai <- read.table("Ainv.txt")
    nInd <- nrow(ped)
    Ainv <- matrix(0, nrow = nInd, ncol = nInd)
    Ainv[as.matrix(Ai[, 1:2])] <- Ai[, 3]
    dd <- diag(Ainv)
    Ainv <- Ainv + t(Ainv)
    diag(Ainv) <- dd
    if (file.exists("Ainv.txt"))
      file.remove("Ainv.txt")


orderPed    Orders a pedigree

Description
    Orders a pedigree so that offspring follow parents.

Usage
    orderPed(ped)

Arguments
    ped    data.frame with three columns: id, id parent1, id parent2

Value
    Numerical vector

Examples
    id <- 1:6
    dam <- c(0, 0, 1, 1, 4, 4)
    sire <- c(0, 0, 2, 2, 3, 5)
    pedigree <- data.frame(id, dam, sire)
    (ord <- orderPed(pedigree))
    pedigree <- pedigree[6:1, ]
    (ord <- orderPed(pedigree))
    pedigree <- pedigree[order(ord), ]
    pwrong <- pedigree
    pwrong[1, 2] <- pwrong[6, 1]


trimPed    Function to trim a pedigree based on available data

Description
    Trims a pedigree given a vector of data. Branches without data are trimmed off the pedigree.

Usage
    trimPed(ped, data, ngenback = NULL)

Arguments
    ped         data.frame with three columns: id, id parent1, id parent2
    data        TRUE-FALSE vector. Specifies if data for an individual is available.
    ngenback    Number of generations back. Specifies the number of generations to keep before the individuals with data.

Value
    Logical vector specifying if an individual should stay in the pedigree.

Examples
    id <- 1:5
    dam <- c(0, 0, 1, 1, 4)
    sire <- c(0, 0, 2, 2, 3)
    data <- c(FALSE, FALSE, TRUE, FALSE, FALSE)
    ped <- data.frame(id, dam, sire)
    yn <- trimPed(ped, data)
    ped <- ped[yn, ]
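The inbreeding coefficients and the A matrix that the manual above refers to can be worked through by hand with the classical tabular method. The sketch below is written in Python as a language-neutral illustration and is only an assumption about how such values can be computed; it does not reproduce the package's own code. It uses the six-animal pedigree from the calcInbreeding example, with 0 marking an unknown parent and parents listed before their offspring.

```python
def relationship_matrix(ped):
    """Additive relationship matrix A by the tabular method.

    ped: list of (id, dam, sire) tuples, parents listed before offspring,
         0 meaning "unknown parent" (same coding as the manual's example).
    """
    index = {ind: i for i, (ind, _, _) in enumerate(ped)}
    n = len(ped)
    a = [[0.0] * n for _ in range(n)]
    for i, (_, dam, sire) in enumerate(ped):
        d = index.get(dam)   # None when the dam is unknown
        s = index.get(sire)  # None when the sire is unknown
        # Diagonal: 1 + F_i, where F_i is half the relationship between the parents.
        parents_known = d is not None and s is not None
        a[i][i] = 1.0 + (0.5 * a[d][s] if parents_known else 0.0)
        # Off-diagonal: average of the relationships of j with i's parents.
        for j in range(i):
            rel = 0.0
            if d is not None:
                rel += 0.5 * a[j][d]
            if s is not None:
                rel += 0.5 * a[j][s]
            a[i][j] = a[j][i] = rel
    return a


def inbreeding_coefficients(ped):
    """F_i = A_ii - 1 for every individual in the pedigree."""
    a = relationship_matrix(ped)
    return [a[i][i] - 1.0 for i in range(len(ped))]


# Pedigree from the calcInbreeding example: (id, dam, sire).
ped = [(1, 0, 0), (2, 0, 0), (3, 1, 2), (4, 1, 2), (5, 4, 3), (6, 4, 5)]
inbreeding = inbreeding_coefficients(ped)
print(inbreeding)
```

Animal 5 comes from a full-sib mating (4 × 3), so its coefficient is 0.25; animal 6 comes from mating animal 4 with her inbred offspring 5, which gives 0.375. The sketch relies on the pedigree being ordered so that parents precede offspring, which is the property orderPed is described as providing.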
Acta Agronomica Sinica 2021, 47(10): 2036-2044 / ISSN 0496-3490; CN 11-1809/S; CODEN TSHPA9
E-mail: zwxb301@
This study was supported by the China Postdoctoral Science Foundation (2020M68299) and the Agricultural Science and Technology Innovation Program Cooperation and Innovation Mission of the Chinese Academy of Agricultural Sciences (CAAS-XTCX2016001).
* Corresponding author: Wang Yamei, E-mail: wangyamei@
First author contact: E-mail: woshiyonghwa@
Received: 2020-11-30; Accepted: 2021-03-23; Published online: 2021-04-07.
URL: https:///kcms/detail/11.1809.S.20210407.0922.004.html
DOI: 10.3724/SP.J.1006.2021.02082
Identifying QTL for mesocotyl elongation in rice by combining QTL-seq and linkage analysis
Liu Chang1, Meng Yun1, Liu Jindong1, Wang Yamei1,*, Guoyou Ye1,2
1 Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, Guangdong, China; 2 International Rice Research Institute, DAPO Box 7777, Metro Manila, Philippines
Abstract: Mesocotyl length (ML) is an important trait affecting seedling emergence and early seedling vigor in dry direct-seeded rice.
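Contents
I. RNA-Seq service workflow
    1.1 Brief introduction to RNA-Seq technology
    1.2 RNA-Seq data analysis workflow
II. RNA-Seq data analysis results
    2.1 Raw sequencing data
        2.1.1 Data format
        2.1.2 Result path
    2.2 Quality analysis of the sequencing data
        2.2.1 Purpose of the analysis
        2.2.2 Analysis results
        2.2.3 Result path
    2.3 Alignment of the sequencing data to the genome
        2.3.1 Purpose of the analysis
        2.3.2 Analysis results
        2.3.3 Result path
    2.4 Transcript assembly and differentially expressed genes
        2.4.1 Purpose of the analysis
        2.4.2 Analysis results
        2.4.3 Result path
    2.5 Functional analysis of differentially expressed genes
        2.5.1 Analysis results
        2.5.2 Result path
    2.6 Signaling pathways
        2.6.1 Signaling pathway maps
        2.6.2 Result path
III. Interpretation of results
IV. Glossary
V. References

I. RNA-Seq service workflow
1.1 Brief introduction to RNA-Seq technology
RNA-Seq is mainly used to study genome-wide expression levels in cells or tissues at a particular point in time.
Mature eukaryotic mRNA carries a poly(A) tail at its 3' end. Total mRNA is specifically enriched from cells or tissues with oligo(dT) magnetic beads, an mRNA-Seq library is constructed and sequenced at high throughput, and bioinformatic analysis is then used to calculate the expression levels of the different genes in the cells or tissues.
1.2 RNA-Seq data analysis workflow
The RNA-Seq data analysis workflow comprises: quality analysis of the sequencing data; alignment of the sequencing data to the reference genome; transcript assembly and identification of differentially expressed genes; GO functional analysis of differentially expressed genes; KEGG pathway analysis of differentially expressed genes; and cluster analysis of differentially expressed genes.
Figure 1. Schematic of the RNA-Seq data analysis workflow

II. RNA-Seq data analysis results
2.1 Raw sequencing data
2.1.1 Data format
Sequencing is performed on the Illumina HiSeq 2500 platform; the sequencer converts the raw fluorescence signals into the corresponding base sequences.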
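Section 1.1 states that expression levels are calculated for each gene but does not say which measure is used. As one possible illustration, the sketch below computes TPM (transcripts per million), a common length- and depth-normalized measure. The choice of TPM, the gene names, the read counts, and the gene lengths are all assumptions made for the example.

```python
def tpm(counts, lengths_bp):
    """Transcripts per million from raw read counts and gene lengths.

    counts:     dict gene -> number of reads assigned to the gene
    lengths_bp: dict gene -> gene (transcript) length in base pairs
    """
    # Step 1: reads per kilobase corrects for gene length.
    per_kb = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    # Step 2: rescale so the values of one sample sum to one million.
    scale = sum(per_kb.values())
    return {gene: value / scale * 1_000_000 for gene, value in per_kb.items()}


# Invented example: geneB has three times the reads of geneA but is also
# three times as long, so both end up with the same TPM.
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "geneC": 2000}
expression = tpm(counts, lengths_bp)
for gene, value in expression.items():
    print(f"{gene}: {value:.0f} TPM")
```

Because the values of each sample sum to one million, TPM values can be compared across genes within a sample; comparisons between samples for differential expression normally use dedicated statistical methods on the raw counts instead.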
Package 'qtl2fst'
October 13, 2022

Version: 0.26
Date: 2021-10-07
Title: Database Storage of Genotype Probabilities for QTL Mapping
Description: Uses the 'fst' package to store genotype probabilities on disk for the 'qtl2' package. These genotype probabilities are a central data object for mapping quantitative trait loci (QTL), but they can be quite large. The facilities in this package enable the genotype probabilities to be stored on disk, leading to reduced memory usage with only a modest increase in computation time.
Author: Karl W Broman [aut, cre] (<https:///0000-0002-4914-6671>), Brian S Yandell [aut] (<https:///0000-0002-8774-9377>), Petr Simecek [aut] (<https:///0000-0002-2922-7183>)
Maintainer: Karl W Broman <***************>
Depends: R (>= 3.1.0)
Imports: fst, qtl2 (>= 0.24)
Suggests: testthat, knitr, rmarkdown
VignetteBuilder: knitr
License: GPL-3
URL: https:///rqtl/qtl2fst
ByteCompile: true
Encoding: UTF-8
RoxygenNote: 7.1.2
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2021-10-07 12:20:02 UTC

R topics documented:
    calc_genoprob_fst
    cbind.fst_genoprob
    fst_extract
    fst_files
    fst_genoprob
    fst_path
    fst_restore
    genoprob_to_alleleprob_fst
    rbind.fst_genoprob
    replace_path
    subset_fst_genoprob
    summary.fst_genoprob


calc_genoprob_fst    Calculate conditional genotype probabilities and write to fst database

Description
    Uses a hidden Markov model to calculate the probabilities of the true underlying genotypes given the observed multipoint marker data, with possible allowance for genotyping errors.

Usage
    calc_genoprob_fst(
      cross,
      fbase,
      fdir = ".",
      map = NULL,
      error_prob = 0.0001,
      map_function = c("haldane", "kosambi", "c-f", "morgan"),
      lowmem = FALSE,
      quiet = TRUE,
      cores = 1,
      compress = 0,
      overwrite = FALSE
    )

Arguments
    cross           Object of class "cross2". For details, see the R/qtl2 developer guide.
    fbase           Base of filename for fst database.
    fdir            Directory for fst database.
    map             Genetic map of markers. May include pseudomarker locations (that is, locations that are not within the marker genotype data). If NULL, the genetic map in cross is used.
    error_prob      Assumed genotyping error probability.
    map_function    Character string indicating the map function to use to convert genetic distances to recombination fractions.
    lowmem          If FALSE, split individuals into groups with common sex and crossinfo and then precalculate the transition matrices for a chromosome; potentially a lot faster but using more memory.
    quiet           If FALSE, print progress messages.
    cores           Number of CPU cores to use, for parallel calculations. (If 0, use parallel::detectCores().) Alternatively, this can be links to a set of cluster sockets, as produced by parallel::makeCluster().
    compress        Amount of compression to use (value in the range 0-100; lower values mean larger file sizes).
    overwrite       If FALSE (the default), refuse to overwrite any files that already exist.

Details
    This is like calling qtl2::calc_genoprob() and then fst_genoprob(), but in a way that hopefully saves memory by doing it one chromosome at a time.

Value
    A list containing the attributes of genoprob and the address for the created fst database. Components are:
    • dim - List of all dimensions of 3-D arrays.
    • dimnames - List of all dimension names of 3-D arrays.
    • is_x_chr - Vector of all is_x_chr attributes.
    • chr - Vector of (subset of) chromosome names for this object.
    • ind - Vector of (subset of) individual names for this object.
    • mar - Vector of (subset of) marker names for this object.
    • fst - Path and base of file names for the fst database.

See Also
    qtl2::calc_genoprob(), fst_genoprob()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    gmap_w_pmar <- insert_pseudomarkers(grav2$gmap, step = 1)
    fst_dir <- file.path(tempdir(), "grav2_genoprob")
    dir.create(fst_dir)
    probs_fst <- calc_genoprob_fst(grav2, "grav2", fst_dir, gmap_w_pmar, error_prob = 0.002)
    # clean up: remove all the files we created
    unlink(fst_files(probs_fst))


cbind.fst_genoprob    Join genotype probabilities for different chromosomes

Description
    Join multiple genotype probability objects, as produced by fst_genoprob(), for different chromosomes.

Usage
    ## S3 method for class 'fst_genoprob'
    cbind(..., fbase = NULL, fdir = NULL, overwrite = FALSE, quiet = FALSE)

Arguments
    ...          Genotype probability objects as produced by fst_genoprob(). Must have the same set of individuals.
    fbase        Base of filename for fst database. Needed if objects have different fst databases.
    fdir         Directory for fst database.
    overwrite    If FALSE (the default), refuse to overwrite existing .fst files.
    quiet        If TRUE, don't show any messages. Passed to fst_genoprob().

Value
    A single genotype probability object.

See Also
    rbind.fst_genoprob()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    map <- insert_pseudomarkers(grav2$gmap, step = 1)
    probsA <- calc_genoprob(grav2[1:5, 1:2], map, error_prob = 0.002)
    probsB <- calc_genoprob(grav2[1:5, 3:4], map, error_prob = 0.002)
    dir <- tempdir()
    fprobsA <- fst_genoprob(probsA, "exampleAc", dir, overwrite = TRUE)
    fprobsB <- fst_genoprob(probsB, "exampleBc", dir, overwrite = TRUE)
    # use cbind to combine probabilities for same individuals but different chromosomes
    fprobs <- cbind(fprobsA, fprobsB, fbase = "exampleABc", overwrite = TRUE)
    # clean up: remove all the files we created
    unlink(fst_files(fprobsA))
    unlink(fst_files(fprobsB))
    unlink(fst_files(fprobs))


fst_extract    Extract genotype probabilities from fst database

Description
    Extract genotype probabilities from fst database as an ordinary calc_genoprob object.

Usage
    fst_extract(object)
    fst2calc_genoprob(object)

Arguments
    object    Object of class "fst_genoprob", linking to an fst database of genotype probabilities.

Details
    The genotype probabilities are extracted from the fst database. Each chromosome is extracted in turn.

Value
    An object of class "calc_genoprob" (a list of 3-dimensional arrays).

Functions
    • fst2calc_genoprob: Deprecated version (to be deleted)

See Also
    fst_genoprob()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    map <- insert_pseudomarkers(grav2$gmap, step = 1)
    probs <- calc_genoprob(grav2, map, error_prob = 0.002)
    dir <- tempdir()
    fprobs <- fst_genoprob(probs, "grav2", dir, overwrite = TRUE)
    nprobs <- fst_extract(fprobs)
    # clean up: remove all the files we created
    unlink(fst_files(fprobs))


fst_files    List files used in fst_genoprob object

Description
    List all of the files used in an fst_genoprob object.

Usage
    fst_files(object)

Arguments
    object    An object of class "fst_genoprob" as created by fst_genoprob().

Value
    Vector of character strings with the full paths for all of the files used for the input object.

See Also
    fst_path()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    probs <- calc_genoprob(grav2, error_prob = 0.002)
    dir <- tempdir()
    fprobs <- fst_genoprob(probs, "grav2", dir, overwrite = TRUE)
    fst_path(fprobs)
    fst_files(fprobs)
    # clean up: remove all the files we created
    unlink(fst_files(fprobs))


fst_genoprob    Store genotype probabilities in fst database

Description
    Save an R/qtl2 genotype probabilities object to a set of fst files for fast access with reduced memory usage.

Usage
    fst_genoprob(
      genoprob,
      fbase,
      fdir = ".",
      compress = 0,
      verbose = TRUE,
      overwrite = FALSE,
      quiet = !verbose
    )

Arguments
    genoprob     Object of class "calc_genoprob". For details, see the R/qtl2 developer guide and qtl2::calc_genoprob().
    fbase        Base of filename for fst database.
    fdir         Directory for fst database.
    compress     Amount of compression to use (value in the range 0-100; lower values mean larger file sizes).
    verbose      Opposite of quiet; deprecated argument (to be removed).
    overwrite    If FALSE (the default), refuse to overwrite any files that already exist.
    quiet        If FALSE (the default), show messages about fst database creation.

Details
    The genotype probabilities are stored in separate databases for each chromosome as tables of (individuals * genotypes) x (positions) in directory fst. The dim, dimnames and is_x_chr elements of the object have information about the entire fst database. If an fst_genoprob object is a subset of another such object, the chr, ind, and mar elements contain information about what is in the subset. However, the fst databases are not altered in a subset, and can be restored by fst_restore(). The actual elements of an "fst_genoprob" object are only accessible to the user after a call to unclass(); instead the usual access to elements of the object invokes subset.fst_genoprob().

Value
    A list containing the attributes of genoprob and the address for the created fst database. Components are:
    • dim - List of all dimensions of 3-D arrays.
    • dimnames - List of all dimension names of 3-D arrays.
    • is_x_chr - Vector of all is_x_chr attributes.
    • chr - Vector of (subset of) chromosome names for this object.
    • ind - Vector of (subset of) individual names for this object.
    • mar - Vector of (subset of) marker names for this object.
    • fst - Path and base of file names for the fst database.

Functions
    • fst_genoprob: Deprecated version (to be deleted)

See Also
    fst_path(), fst_extract(), fst_files(), replace_path(), fst_restore()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    map <- insert_pseudomarkers(grav2$gmap, step = 1)
    probs <- calc_genoprob(grav2, map, error_prob = 0.002)
    dir <- tempdir()
    fprobs <- fst_genoprob(probs, "grav2", dir, overwrite = TRUE)
    # clean up: remove all the files we created
    unlink(fst_files(fprobs))


fst_path    Path used in fst_genoprob object

Description
    Get the path used in an fst_genoprob object.

Usage
    fst_path(object)

Arguments
    object    An object of class "fst_genoprob" as created by fst_genoprob().

Value
    Character string with path (and initial file stem) for files used in the input object.

See Also
    fst_files(), replace_path()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    probs <- calc_genoprob(grav2, error_prob = 0.002)
    dir <- tempdir()
    fprobs <- fst_genoprob(probs, "grav2", dir, overwrite = TRUE)
    fst_path(fprobs)
    fst_files(fprobs)
    # clean up: remove all the files we created
    unlink(fst_files(fprobs))


fst_restore    Restore fst_genoprob object to original dimensions

Description
    Any "fst_genoprob" object has embedded its original data and dimensions. This resets elements ind, chr and mar to the full set.

Usage
    fst_restore(object)
    fst_genoprob_restore(object)

Arguments
    object    Object of class "fst_genoprob" as produced by fst_genoprob().

Details
    Object is unclassed and elements ind, chr and mar are changed before resetting attributes as an "fst_genoprob" object. See fst_genoprob() for details on the object.

Value
    Input object with dimensions restored.

Functions
    • fst_genoprob_restore: Deprecated version (to be removed).

See Also
    fst_genoprob(), fst_extract()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    map <- insert_pseudomarkers(grav2$gmap, step = 1)
    probs <- calc_genoprob(grav2, map, error_prob = 0.002)
    dir <- tempdir()
    fprobs <- fst_genoprob(probs, "grav2", dir, overwrite = TRUE)
    # subset probabilities
    fprobs2 <- subset(fprobs, chr = 1:2)
    # use object to get the full probabilities back
    fprobs5 <- fst_restore(fprobs2)
    # clean up: remove all the files we created
    unlink(fst_files(fprobs))


genoprob_to_alleleprob_fst    Convert genotype probabilities to allele probabilities and write to fst database

Description
    Reduce genotype probabilities (as calculated by calc_genoprob()) to allele probabilities, writing them to an fst database.

Usage
    genoprob_to_alleleprob_fst(
      probs,
      fbase,
      fdir = ".",
      quiet = TRUE,
      cores = 1,
      compress = 0,
      overwrite = FALSE
    )

Arguments
    probs        Genotype probabilities, as calculated from calc_genoprob().
    fbase        Base of filename for fst database.
    fdir         Directory for fst database.
    quiet        If FALSE, print progress messages.
    cores        Number of CPU cores to use, for parallel calculations. (If 0, use parallel::detectCores().) Alternatively, this can be links to a set of cluster sockets, as produced by parallel::makeCluster().
    compress     Amount of compression to use (value in the range 0-100; lower values mean larger file sizes).
    overwrite    If FALSE (the default), refuse to overwrite any files that already exist.

Details
    This is like calling qtl2::genoprob_to_alleleprob() and then fst_genoprob(), but in a way that hopefully saves memory by doing it one chromosome at a time.

Value
    Link to fst database for the probs input with probabilities collapsed to alleles rather than genotypes.

See Also
    qtl2::genoprob_to_alleleprob(), fst_genoprob()

Examples
    library(qtl2)
    iron <- read_cross2(system.file("extdata", "iron.zip", package = "qtl2"))
    gmap_w_pmar <- insert_pseudomarkers(iron$gmap, step = 1)
    # genotype probabilities
    fst_dir <- file.path(tempdir(), "iron_genoprob")
    dir.create(fst_dir)
    probs_fst <- calc_genoprob_fst(iron, "iron", fst_dir, gmap_w_pmar, error_prob = 0.002)
    # allele probabilities
    fst_dir_apr <- file.path(tempdir(), "iron_alleleprob")
    dir.create(fst_dir_apr)
    aprobs_fst <- genoprob_to_alleleprob_fst(probs_fst, "iron", fst_dir_apr)
    # clean up: remove all the files we created
    unlink(fst_files(probs_fst))
    unlink(fst_files(aprobs_fst))


rbind.fst_genoprob    Join genotype probabilities for different individuals

Description
    Join multiple genotype probability objects, as produced by fst_genoprob(), for different individuals.

Usage
    ## S3 method for class 'fst_genoprob'
    rbind(..., fbase = NULL, fdir = NULL, overwrite = FALSE, quiet = FALSE)

Arguments
    ...          Genotype probability objects as produced by fst_genoprob(). Must have the same set of markers and genotypes.
    fbase        Base of filename for fst database. Needed if objects have different fst databases.
    fdir         Directory for fst database.
    overwrite    If FALSE (the default), refuse to overwrite existing .fst files.
    quiet        If TRUE, don't show any messages. Passed to fst_genoprob().

Value
    A single genotype probability object.

See Also
    cbind.fst_genoprob()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    map <- insert_pseudomarkers(grav2$gmap, step = 1)
    probsA <- calc_genoprob(grav2[1:5, ], map, error_prob = 0.002)
    probsB <- calc_genoprob(grav2[6:12, ], map, error_prob = 0.002)
    dir <- tempdir()
    fprobsA <- fst_genoprob(probsA, "exampleAr", dir, overwrite = TRUE)
    fprobsB <- fst_genoprob(probsB, "exampleBr", dir, overwrite = TRUE)
    # use rbind to combine probabilities for same chromosomes but different individuals
    fprobs <- rbind(fprobsA, fprobsB, fbase = "exampleABr")
    # clean up: remove all the files we created
    unlink(fst_files(fprobsA))
    unlink(fst_files(fprobsB))
    unlink(fst_files(fprobs))


replace_path    Replace the path used in fst_genoprob object

Description
    Replace the path used in an fst_genoprob object.

Usage
    replace_path(object, path)

Arguments
    object    An object of class "fst_genoprob" as created by fst_genoprob().
    path      New path (directory + file stem as a single character string) to be used in the object.

Value
    The input object with the path replaced. If any of the expected files don't exist with the new path, warnings are issued.

See Also
    fst_path(), fst_files()

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    probs <- calc_genoprob(grav2, error_prob = 0.002)
    dir <- tempdir()
    fprobs <- fst_genoprob(probs, "grav2", dir, overwrite = TRUE)
    # move the probabilities into a different directory
    new_dir <- file.path(tempdir(), "subdir")
    if (!dir.exists(new_dir)) dir.create(new_dir)
    for (file in fst_files(fprobs)) {
      file.rename(file, file.path(new_dir, basename(file)))
    }
    # revise the path in fprobs
    new_path <- sub(dir, new_dir, fst_path(fprobs))
    fprobs <- replace_path(fprobs, new_path)


subset_fst_genoprob    Subsetting genotype probabilities

Description
    Pull out a specified set of individuals and/or chromosomes from the results of fst_genoprob().

Usage
    subset_fst_genoprob(x, ind = NULL, chr = NULL, mar = NULL, ...)
    ## S3 method for class 'fst_genoprob'
    subset(x, ind = NULL, chr = NULL, mar = NULL, ...)

Arguments
    x      Genotype probabilities as output from fst_genoprob().
    ind    A vector of individuals: numeric indices, logical values, or character string IDs.
    chr    A vector of chromosomes: logical values, or character string IDs. Numbers are interpreted as character string IDs.
    mar    A vector of marker names as character string IDs.
    ...    Ignored.

Value
    The input genotype probabilities, with the selected individuals and/or chromosomes.

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    pr <- calc_genoprob(grav2)
    dir <- tempdir()
    fpr <- fst_genoprob(pr, "grav2", dir)
    # keep just individuals 1:5, chromosome 2
    prsub <- fpr[1:5, 2]
    # keep just chromosome 2
    prsub2 <- fpr[, 2]
    # clean up: remove all the files we created
    unlink(fst_files(fpr))


summary.fst_genoprob    Summary of an fst_genoprob object

Description
    Summarize an fst_genoprob object.

Usage
    ## S3 method for class 'fst_genoprob'
    summary(object, ...)

Arguments
    object    An object of class "fst_genoprob", as output by fst_genoprob().
    ...       Ignored.

Examples
    library(qtl2)
    grav2 <- read_cross2(system.file("extdata", "grav2.zip", package = "qtl2"))
    pr <- calc_genoprob(grav2)
    dir <- tempdir()
    fpr <- fst_genoprob(pr, "grav2", dir)
    # summary of fst_genoprob object
    summary(fpr)
    # clean up: remove all the files we created
    unlink(fst_files(fpr))
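QTL locus naming rules: from markers to functional analysis

Introduction: In genomics research, describing and characterizing functional loci in the genome is essential. One common approach is to map quantitative trait loci (QTL) and characterize their function in order to study the genetic basis of complex traits. In this process, defining and following a naming convention for QTL is considered necessary. This article examines the QTL naming conventions currently in use, describes their main elements, and discusses how they are applied in functional analysis.

Part 1: Definition and mapping of QTL
A QTL is a mapped position in a genomic region that affects variation in a trait phenotype; it is usually described by a molecular marker (Marker) and a trait (Trait). The marker can be a genetically linked locus, a molecular marker, or an entire chromosomal region. Mapping QTL gives a better understanding of the genetic basis of a complex trait.

Part 2: QTL naming rules
To make data sharing and cross-referencing easier for researchers, QTL need to be named according to a fixed set of rules. As the technology has advanced and its applications have widened, several commonly used naming patterns have been proposed internationally, for example Chromosome-QTL-Marker-DetectionMethod-Year.

1. Chromosome number
When naming a QTL, first state the chromosome. The usual form is the English word followed by the chromosome number; for example, Chromosome 1 denotes the first chromosome.

2. QTL identifier
In a QTL name, "qtl" is normally used as the identifier that marks the entry as a mapped locus. For example, qtl1 denotes the first QTL.

3. Marker
The name also states the marker used for the locus. The marker can be a molecular marker such as a Microsatellite Marker or a Single Nucleotide Polymorphism (SNP) Marker. For example, qtl1_MS1 denotes the first QTL, detected with a Microsatellite Marker.

4. Detection method
To indicate how the QTL was detected, the name usually also includes an abbreviation of the detection method.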
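The naming components described above can be assembled mechanically. The following Python sketch is only an illustration: the function names `build_qtl_name` and `parse_qtl_name`, the `Chr` prefix, the underscore separator (borrowed from the qtl1_MS1 example), and the sample values `CIM` and 2015 are all assumptions made for demonstration, not part of any official nomenclature.

```python
FIELDS = ("chromosome", "qtl", "marker", "method", "year")


def build_qtl_name(chromosome, qtl_index, marker, method, year, sep="_"):
    """Join the fields in the order Chromosome-QTL-Marker-DetectionMethod-Year."""
    if qtl_index < 1:
        raise ValueError("qtl_index must be a positive integer")
    parts = [f"Chr{chromosome}", f"qtl{qtl_index}", str(marker), str(method), str(year)]
    for part in parts:
        # An empty field, or one containing the separator, could not be parsed back.
        if not part or sep in part:
            raise ValueError(f"invalid field: {part!r}")
    return sep.join(parts)


def parse_qtl_name(name, sep="_"):
    """Split a name produced by build_qtl_name back into its five fields."""
    parts = name.split(sep)
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


if __name__ == "__main__":
    name = build_qtl_name(1, 1, "MS1", "CIM", 2015)
    print(name)
    print(parse_qtl_name(name))
```

Keeping the builder and the parser together makes it easy to check that a chosen convention is reversible before it is used in shared data.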
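Essential for mining public databases: QTL analysis

Now that next-generation sequencing is widespread and public databases are readily accessible, more and more researchers can integrate different omics data sets in large-scale analyses. This post explains the workflow of a QTL analysis.

First, we need to understand what a QTL is. Quantitative trait loci (QTL) here refers to SNP sites on the chromosome that specifically regulate mRNA expression (eQTL) or methylation levels (mQTL); the level of mRNA expression or methylation is proportional to the quantitative trait.

Many papers use QTL analysis. For example, "Modulation of long noncoding RNAs by risk SNPs underlying genetic predispositions to prostate cancer", covered in the previous post, uses eQTL analysis, and "Association and cis-mQTL Analysis of Variants in CHRNA3-A5, CHRNA7, CHRNB2, and CHRNB4 in Relation to Nicotine Dependence in a Chinese Han Population" combines expression and methylation data in an meQTL analysis.

The aim of these analyses is to combine DNA, RNA, and methylation data to explore the potential biological mechanisms of intronic SNPs that are significantly associated with disease.

For example, in "Association and cis-mQTL Analysis of Variants in CHRNA3-A5, CHRNA7, CHRNB2, and CHRNB4 in Relation to Nicotine Dependence in a Chinese Han Population", the authors found that rs3743075 is significantly associated with nicotine dependence, and that this site is not only a cis-eQTL but also regulates the methylation level of nearby sites.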
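As a rough illustration of what an eQTL test measures, the Python sketch below regresses an expression value on genotype dosage (0, 1, or 2 copies of the alternative allele) and reports the slope, the intercept, and the Pearson correlation. The function name `eqtl_association` and the toy numbers are assumptions for illustration only; they are not data from the papers cited above, and real analyses normally add covariates and multiple-testing correction.

```python
from math import sqrt


def eqtl_association(dosages, expression):
    """Least-squares fit of expression on genotype dosage.

    Returns (slope, intercept, pearson_r). The slope is the estimated
    change in expression per additional copy of the alternative allele.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variation")
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    return slope, intercept, sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy data (hypothetical): six samples, two per genotype class.
    toy_dosages = [0, 0, 1, 1, 2, 2]
    toy_expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]
    print(eqtl_association(toy_dosages, toy_expression))
```

The same function applies unchanged to an mQTL test if the second argument holds methylation levels instead of expression values.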
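Q-PCR instrument: step-by-step operating procedure

1. Switch on the Mx3000P; after about 1 min you will hear the start-up sound. Then open the "Mx3000P software" on the desktop and check that the connection indicator at the lower right of the software window is green. If it is red rather than green, the software reports "Instrument to PC communications have failed". In that case simply wait a moment; once the instrument has finished its self-test it connects to the computer automatically.
2. In the "New Experiment type" dialog of the "Mx3000P software", select the experiment type you need, usually "SYBR® Green (with dissociation curve)", and click "OK".
3. Check that the tungsten-halogen lamp indicator at the top of the "Mx3000P software" window (under the Selection command) shows that the lamp is on. Green means the lamp is on, yellow means it is warming up, and red means it is off.
4. It is best to start the experiment after the instrument and the light source have warmed up for 20 min.
5. In "Plate Setup", define each well of the 96-well plate. Assign Standard (standards), NTC (negative control), Unknown (samples), and so on, and select the correct channels (FAM, HEX, ROX, Cy5). For standards, set the template concentration; the different samples also need to be numbered. Alternatively, click "Import" to load a previously used 96-well plate setup. In detail:
   ● First select Unknown under Well type, then select ROX and SYBR below it;
   ● Select ROX as the Reference dye;
   ● Use Replicate symbol to label each sample (1, 2, 3, XXX, etc.).
6. Switch to the "Thermal Profile" screen and set up the PCR cycles. The settings can be changed, or click "Import". Fluorescence is collected at the end of a step; this signal collection is generally used for dissociation (melting) curve analysis.
7. When the "Thermal Profile" setup is finished, click "Run" at the upper right of the software window. A run-status box, "Run Status", appears at the lower right, where "Turn lamp off at end of run" can be selected. Click "Start" to begin the experiment. A save-file dialog appears; choose a location and save. A second dialog then appears; choose the second option (run after warm-up). During the run, the number of amplification cycles can be increased by clicking "Add cycles". When the PCR run ends, the software switches off the tungsten-halogen lamp automatically.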
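QTL-seq Pipeline Documentation
Version: v1.0
Date written: 2017.6.26
Author: 柯文斯

Contents
I. How it works
II. Analysis workflow
III. Parameter tuning example

Example: /lustre/Work/project/genome/20170523_Oryza_sativa_BSA/01.QTL-seq/

I. How it works
Required files: data for 1 parent and data for 2 bulks.
1. Align the parental data to the reference genome and call SNPs.
2. Substitute the bases at these SNP sites in the reference genome to build a new reference.
3. Align the parental data to the new reference and call SNPs; the result is used later to filter the bulks.
4. Align the bulk data to the new reference, call SNPs, and select the SNP sites that are specific to the bulks relative to the parent.
5. For the SNPs specific to the two bulks, compute the SNP-index, then locate the trait-associated regions using a sliding window together with boost-simulated curves.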
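Step 5 rests on the SNP-index, which in QTL-seq is the fraction of reads at a site that carry the non-reference allele, and on Δ(SNP-index), the difference between the two bulks. The Python sketch below is a minimal illustration; the function names, the toy read counts, and the choice to subtract bulk A from bulk B are assumptions and are not taken from the pipeline scripts.

```python
def snp_index(alt_reads, depth):
    """Fraction of reads covering a site that carry the non-reference allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 <= alt_reads <= depth:
        raise ValueError("alt_reads must lie between 0 and depth")
    return alt_reads / depth


def delta_snp_index(alt_a, depth_a, alt_b, depth_b):
    """SNP-index of bulk B minus SNP-index of bulk A (subtraction order assumed)."""
    return snp_index(alt_b, depth_b) - snp_index(alt_a, depth_a)


if __name__ == "__main__":
    # Toy counts (hypothetical): 3 of 12 reads in bulk A, 9 of 12 in bulk B.
    print(snp_index(3, 12))
    print(delta_snp_index(3, 12, 9, 12))
```

A value of Δ(SNP-index) near 0 means both bulks carry the allele at similar frequency; values far from 0 point to regions where the bulks differ.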
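II. Analysis workflow

1. Data preparation
Create links to the fastq files of the one parent and the 2 bulks. If a sample has data from several lanes, merge the data first and then link the merged result.
Reference script for merging:
zcat L7_1.clean.fq.gz L7_2.clean.fq.gz |gzip > m1_1.clean.fq.gz
Naming of the linked fastq files: BA_1_1_sequence.txt.gz and BA_1_2_sequence.txt.gz are the fq1 and fq2 files of bulk BA. The "1" in BA_1 is required, but it may be replaced by another number.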
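Where a shell is not convenient, the same merge can be expressed with the Python standard library. This is a sketch under stated assumptions: `merge_gzip_files` is a hypothetical helper, not part of the pipeline, and it mirrors `zcat a b | gzip > out` by decompressing each input in order and writing one new gzip stream.

```python
import gzip
import shutil


def merge_gzip_files(input_paths, output_path):
    """Concatenate the decompressed contents of several .gz files into one .gz file."""
    with gzip.open(output_path, "wb") as merged:
        for path in input_paths:
            with gzip.open(path, "rb") as source:
                # Stream in chunks so large fastq files are never held in memory.
                shutil.copyfileobj(source, merged)


if __name__ == "__main__":
    # File names follow the merge script in the text.
    merge_gzip_files(["L7_1.clean.fq.gz", "L7_2.clean.fq.gz"], "m1_1.clean.fq.gz")
```

The input order matters for paired-end data: the lanes must be listed in the same order when merging the fq1 files and the fq2 files, so that read pairs stay aligned.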
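2. Setting parameters
Edit the configuration file config.txt and set the relevant parameters as required.
export PATH=${PATH}:/lustre/Work/software/common/fastx_toolkit/bin
Command:
$ ./Bat_make_common.fnc.sh
This script runs very quickly, so it can be run directly on the local command line.
It creates the directories the pipeline needs and generates the pipeline's parameter file mon/common.fnc, from which later steps read their parameters.

3. Data filtering
$ cd 1.qualify_read/
Filter the data of the parent and of the two bulks separately.
Commands:
$ ./Run_all_Bats.sh 0
$ ./Run_all_Bats.sh 1
$ ./Run_all_Bats.sh 9
This step takes a long time, so the jobs should be submitted with qsub.
Job scripts: run.0.pbs, run.1.pbs, run.9.pbs
Filter criteria:
q30p90, i.e. at least 90% of the bases in a read have a quality value greater than 30.
Keep only reads that can be paired.
For the two bulks, use the same amount of data: from the bulk with more data, randomly draw the same amount of data as the other bulk has.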
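The filter criteria above can be sketched in Python as follows. This is an illustration, not the pipeline's implementation: the function names are hypothetical, qualities are assumed to be Phred+33 encoded, and a base with quality exactly 30 is counted as passing (an assumption about how the q30 threshold is applied).

```python
import random


def passes_q30p90(quality_string, min_quality=30, min_fraction=0.90, offset=33):
    """True if at least min_fraction of the bases reach min_quality (Phred+33 assumed)."""
    if not quality_string:
        return False
    good = sum(1 for ch in quality_string if ord(ch) - offset >= min_quality)
    return good / len(quality_string) >= min_fraction


def downsample_pairs(read_pairs, target_count, seed=0):
    """Randomly keep target_count read pairs so both bulks have the same amount of data."""
    pairs = list(read_pairs)
    if target_count >= len(pairs):
        return pairs
    # A fixed seed keeps the subsample reproducible between runs.
    return random.Random(seed).sample(pairs, target_count)


if __name__ == "__main__":
    print(passes_q30p90("I" * 9 + "5"))   # 9 of 10 bases at Q40, one at Q20
    print(passes_q30p90("I" * 8 + "55"))  # only 8 of 10 bases pass
    bulk_a = [(f"a{i}/1", f"a{i}/2") for i in range(10)]
    print(len(downsample_pairs(bulk_a, 6)))
```

Sampling whole pairs rather than individual reads keeps the paired-end files in step, which matches the requirement that only reads that can be paired are retained.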
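4. Building the reference
$ cd 2.make_consensus/
Command:
$ ./Run_all_Bats.sh
This step takes a long time, so the job should be submitted with qsub.
Job script: run.pbs.
Steps performed:
Use bwa aln to align the filtered parental reads to the reference genome.
Use coval refine and coval call to detect SNPs and indels in the parent.
At the SNP sites detected in the parent, replace the bases of the reference genome to obtain a new reference.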
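The base-substitution step can be pictured with the following Python sketch. It is a simplified illustration with a hypothetical function name, handles SNPs only (no indels), and assumes 1-based coordinates; the pipeline's own scripts may work differently.

```python
def apply_snps(sequence, snps):
    """Return a copy of sequence with each SNP's alternative base substituted.

    snps is an iterable of (position, ref_base, alt_base) with 1-based positions.
    """
    bases = list(sequence)
    for position, ref_base, alt_base in snps:
        index = position - 1
        if not 0 <= index < len(bases):
            raise ValueError(f"position {position} is outside the sequence")
        # Guard against coordinate mistakes: the stated reference base must match.
        if bases[index].upper() != ref_base.upper():
            raise ValueError(f"reference mismatch at position {position}")
        bases[index] = alt_base
    return "".join(bases)


if __name__ == "__main__":
    # Toy sequence and SNPs (hypothetical).
    print(apply_snps("ACGTACGT", [(2, "C", "T"), (8, "T", "A")]))
```

Because only single bases are swapped, the new reference keeps the coordinates of the original one, which is what lets later steps compare positions between the parent and the bulks.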
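$ cd 90.align_to_this_fasta/
Command:
$ ./Run_all_Bats.sh
This step takes a long time, so the job should be submitted with qsub.
Job script: run.pbs.
Steps performed:
Use bwa aln to align the filtered parental reads to the new reference.
Use coval refine and coval call to detect SNPs and indels in the parent.
The final result file, rice_q30p90_MSR_Cov_10_S-snp.pileup, is used to filter out false-positive SNPs in the later analysis.
In the initial configuration file config.txt, this file is set by the parameter PileupDB.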
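5. SNP detection
$ cd 3.alignment/
Detect SNPs in bulk A and bulk B separately.
Commands:
$ ./Run_all_Bats.sh 0
$ ./Run_all_Bats.sh 1
This step takes a long time, so the jobs should be submitted with qsub.
Job scripts: run.0.pbs, run.1.pbs.
Steps performed:
Use bwa aln to align the filtered reads of one bulk to the new reference.
Use coval refine and coval call to detect SNPs and indels in the bulk.
Extract the SNP sites shared by the bulk and the parent (the parent aligned to the new reference in the previous step): 40.exclude_common_snps/mybulk_BA_q30p90_MSR_Cov_2_S-snp-common-pos.pileup.
Extract the SNP sites that are specific to the bulk relative to the parent: 40.exclude_common_snps/mybulk_BA_q30p90_MSR_Cov_2_S-snp-rmc2snp.pileup.
Filter the bulk-specific SNP sites (minimum depth, minimum variant quality, minimum SNP-index) to obtain 50.awk_custom/mut_index_2/mybulk_BA_q30p90_cov2_co3.txt.
In this name, the 2 in cov2 is the threshold used for the Coval mismatch filter, and the 3 in co3 is the threshold used for depth filtering.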
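The separation of shared and bulk-specific SNP sites, followed by threshold filtering, can be sketched in Python as below. The dictionary keys, function names, and toy records are assumptions for illustration; the pipeline itself works on pileup files with its own scripts.

```python
def split_by_parent(bulk_snps, parent_positions):
    """Split bulk SNPs into (shared with parent, specific to bulk) by (chrom, pos)."""
    shared, specific = [], []
    for snp in bulk_snps:
        key = (snp["chrom"], snp["pos"])
        (shared if key in parent_positions else specific).append(snp)
    return shared, specific


def filter_snps(snps, min_depth, min_snp_quality, min_snp_index):
    """Keep SNPs meeting the minimum depth, variant quality, and SNP-index."""
    return [
        snp for snp in snps
        if snp["depth"] >= min_depth
        and snp["snp_quality"] >= min_snp_quality
        and snp["snp_index"] >= min_snp_index
    ]


if __name__ == "__main__":
    # Toy records (hypothetical values).
    parent = {("chr01", 100)}
    bulk = [
        {"chrom": "chr01", "pos": 100, "depth": 20, "snp_quality": 50, "snp_index": 1.0},
        {"chrom": "chr01", "pos": 250, "depth": 12, "snp_quality": 40, "snp_index": 0.5},
        {"chrom": "chr01", "pos": 400, "depth": 2, "snp_quality": 40, "snp_index": 0.5},
    ]
    shared, specific = split_by_parent(bulk, parent)
    print(len(shared), len(specific))
    print(filter_snps(specific, min_depth=3, min_snp_quality=30, min_snp_index=0.3))
```

Sites also called in the parent against its own consensus reference are likely alignment or calling artifacts, which is why they are set aside before the bulk-specific sites are filtered.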
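6. Comparing the SNP-index of the two bulks
$ cd 4.search_for_pair/
Command:
$ ./Run_all_Bats.sh
This step takes a long time, so the job should be submitted with qsub.
Job script: run.pbs.
Steps performed:
Starting from the final results of the previous step, classify the SNP sites of bulk A and bulk B into shared sites, 10.paired_or_unpaired/mut_index_2/paired_mybulk_BA_q30p90_cov2_co3.txt, and specific sites, 10.paired_or_unpaired/mut_index_2/unpaired_mybulk_BA_q30p90_cov2_co3.txt.
Compare the SNP sites specific to bulk A against the bam file of bulk B to obtain the shared and the specific SNP sites.
For the shared SNP sites, compute the SNP-index in each of the 2 bulks (the minimum depth requirement must be met in both bulks).
Merge the data for the SNP sites shared by bulk A and bulk B into the results above to obtain the information on all relevant SNP sites in both bulks: 40.merge_paired/mut_index_2/merge_mybulk_BABm_q30p90_paired_cov2_co3.txt

7. Comparing the SNP-index of the two bulks
$ cd pare/
Command:
./Run_all_Bats.sh
This step takes a long time, so the job should be submitted with qsub.
Job script: run.pbs.
Steps performed:
Simulate data and compute the confidence-interval values.
For all relevant SNP sites, compute the values for the different confidence intervals.
Filter the results (minimum depth, maximum and minimum SNP-index).
Slide a window along the genome and compute the SNP-index of each window.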
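The sliding-window step can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, the default window of 2 Mb and step of 50 kb are assumptions, and windows are simply started at position 1 and truncated at the chromosome end, which may differ from the pipeline's own definition.

```python
from bisect import bisect_left, bisect_right


def sliding_window_mean(positions, values, chrom_length, window=2_000_000, step=50_000):
    """Average values (e.g. SNP-index) over sliding windows on one chromosome.

    positions must be sorted 1-based coordinates. Returns a list of
    (window_center, actual_width, mean_value, snp_count); empty windows are skipped.
    """
    if len(positions) != len(values):
        raise ValueError("positions and values must have the same length")
    rows = []
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)  # last window may be narrower
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        count = hi - lo
        if count:
            mean_value = sum(values[lo:hi]) / count
            rows.append(((start + end) // 2, end - start + 1, mean_value, count))
        start += step
    return rows


if __name__ == "__main__":
    # Toy data (hypothetical) on a 100 bp "chromosome" with a 50 bp window and 25 bp step.
    for row in sliding_window_mean([10, 20, 30, 90], [0.2, 0.4, 0.6, 1.0], 100, 50, 25):
        print(row)
```

Reporting the actual width and the SNP count alongside each mean makes it possible to discount windows at chromosome ends or windows supported by very few SNPs.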
8. File formats
File: mybulk_BA_q30p90_MSR_Cov_2_S-snp-rmc2snp.pileup
field#  Description
1       chromosome
2       coordinate
3       reference base
4       consensus base
5       consensus quality
6       SNP quality
7       mapping quality
8       the number of reads covering the site
9       read bases
10      base qualities
11      SNP-index

File: filtered_merge_mybulk_BABm_q30p90_paired_pvalue_sldwnd2M50K_cov2_co5.txt
field#  Description
1       chromosome
2       coordinate = window center position (incremented sliding shift)
3       actual width of window
4       sliding window average of the 8th field in (type4); bulk-A’s depth
5       sliding window average of the 9th field in (type4); bulk-A’s snp-index
6       sliding window average of the 19th field in (type4); bulk-B’s depth
7       sliding window average of the 20th field in (type4); bulk-B’s snp-index
8       sliding window average of the 23rd field in (type4); smaller depth
9       sliding window average of the 24th field in (type4); delta (SNP-index)
10      sliding window average of the 30th field in (type4); U95
11      sliding window average of the 31st field in (type4); L95
12      sliding window average of the 32nd field in (type4); < L95
13      sliding window average of the 33rd field in (type4); L95 ~ U95
14      sliding window average of the 34th field in (type4); > U95
15      sliding window average of the 35th field in (type4); U99
16      sliding window average of the 36th field in (type4); L99
17      sliding window average of the 37th field in (type4); < L99
18      sliding window average of the 38th field in (type4); L99 ~ U99
19      sliding window average of the 39th field in (type4); > U99
20      SNP counts in this window

III. Parameter tuning example

1. Tuning the depth threshold
Step 1: Edit the config.txt file.
Step 2: Run ./Bat_make_common.fnc.sh, then confirm that the depth parameter in the configuration file mon/common.fnc has been updated.
Step 3: Go into the pare folder and run ./Run_all_Bats.sh (this takes a long time). This gives the mapped regions that correspond to the different depth thresholds.