What is TopHat?

TopHat is a program that aligns RNA-Seq reads to a genome in order to identify exon-exon splice junctions. It is built on the ultrafast short read mapping program Bowtie. TopHat runs on Linux and OS X.

How does TopHat find junctions?

TopHat can find splice junctions without a reference annotation. By first mapping RNA-Seq reads to the genome, TopHat identifies potential exons, since many RNA-Seq reads will contiguously align to the genome. Using this initial mapping information, TopHat builds a database of possible splice junctions and then maps the reads against these junctions to confirm them.

Short read sequencing machines can currently produce reads 100bp or longer but many exons are shorter than this so they would be missed in the initial mapping. TopHat solves this problem mainly by splitting all input reads into smaller segments which are then mapped independently. The segment alignments are put back together in a final step of the program to produce the end-to-end read alignments.

TopHat generates its database of possible splice junctions from two sources of evidence. The first and strongest source of evidence for a splice junction is when two segments from the same read (for reads of at least 45bp) are mapped at a certain distance on the same genomic sequence or when an internal segment fails to map - again suggesting that such reads are spanning multiple exons. With this approach, "GT-AG", "GC-AG" and "AT-AC" introns will be found ab initio. The second source is pairings of "coverage islands", which are distinct regions of piled up reads in the initial mapping. Neighboring islands are often spliced together in the transcriptome, so TopHat looks for ways to join these with an intron. We only suggest users use this second option (--coverage-search)  for short reads (< 45bp) and with a small number of reads (<= 10 million).  This latter option will only report alignments across "GT-AG" introns

TopHat的更多相关文章

  1. tophat输出结果junction.bed

    tophat输出结果junction.bed BED format       BED format provides a flexible way to define the data lines ...

  2. tophat安装

    1     依赖软件:bowtie,bowtie2,samtools,boost c++ library 2     建立索引文件:      bowtie包括bowtie,bowtie-build, ...

  3. RNA-seq差异表达基因分析之TopHat篇

    RNA-seq差异表达基因分析之TopHat篇 发表于2012 年 10 月 23 日 TopHat是基于Bowtie的将RNA-Seq数据mapping到参考基因组上,从而鉴定可变剪切(exon-e ...

  4. 机器学习进阶-图像形态学变化-礼帽与黑帽 1.cv2.TOPHAT(礼帽-原始图片-开运算后图片) 2.cv2.BLACKHAT(黑帽 闭运算-原始图片)

    1.op = cv2.TOPHAT  礼帽:原始图片-开运算后的图片 2. op=cv2.BLACKHAT 黑帽: 闭运算后的图片-原始图片 礼帽:表示的是原始图像-开运算(先腐蚀再膨胀)以后的图像 ...

  5. 使用Tophat+cufflinks分析差异表达

    使用Tophat+cufflinks分析差异表达  2017-06-15 19:09:43     522     0     0 使用TopHat+Cufflinks的流程图 序列的比对是RNA分析 ...

  6. RNA-seq 安装 fastaqc,tophat,cuffilnks,hisat2

    ------------------------------------------ 安装fastqc------------------------------------------------- ...

  7. tophat的用法

    概述:tophat是以bowtie2为核心的一款比对软件. tophat工作分两步: 1.将reads用bowtie比对到参考基因组上. 2.将unmapped-reads打断成更小的fragment ...

  8. SAGE|DNA微阵列|RNA-seq|lncRNA|scripture|tophat|cufflinks|NONCODE|MA|LOWESS|qualitile归一化|permutation test|SAM|FDR|The Bonferroni|Tukey's|BH|FWER|Holm's step-down|q-value|

    生物信息学-基因表达分析 为了丰富中心法则,研究人员使用不断更新的技术研究lncRNA的方方面面,其中技术主要是生物学上的微阵列芯片技术和表达数据分析方法,方方面面是指lncRNA的位置特征. Bac ...

  9. 链终止法|边合成边测序|Bowtie|TopHat|Cufflinks|RPKM|FASTX-Toolkit|fastaQC|基因芯片|桥式扩增|

    生物信息学 Sanger采用链终止法进行测序 带有荧光基团的ddXTP+其他四种普通的脱氧核苷酸放入同一个培养皿中,例如带有荧光基团的ddATP+普通的脱氧核苷酸A.T.C.G放入同一个培养皿,以此类 ...

随机推荐

  1. WPF Tookit Chart

      如何使用Chart 实例: Binding数据源中是一个KeyValuePair对象.可以是Dictionary. <charting:Chart x:Name="chtSumma ...

  2. OpenStack 企业私有云的若干需求(2):自动扩展(Auto-scaling) 支持

    本系列会介绍OpenStack 企业私有云的几个需求: 自动扩展(Auto-scaling)支持 多租户和租户隔离 (multi-tenancy and tenancy isolation) 混合云( ...

  3. UVA数学入门训练Round1[6]

    UVA - 11388 GCD LCM 题意:输入g和l,找到a和b,gcd(a,b)=g,lacm(a,b)=l,a<b且a最小 g不能整除l时无解,否则一定g,l最小 #include &l ...

  4. 【每日一linux命令6】命令中的命令

    许多命令在执行后,会进入该命令的操作模式,如 fdisk.pine.top 等,进入后我们必须要使用该 命令中的命令,才能正确执行:而一般要退出该命令,可以输入 exit.q.quit 或是按[Ctr ...

  5. 如何保证ArrayList线程安全

    一.继承Arraylist,然后重写或按需求编写自己的方法,这些方法要写成synchronized,在这些synchronized的方法中调用ArrayList的方法.   二:使用Collectio ...

  6. 可运行jar包的几种打包/部署方式

    java项目开发中,最终生成的jar,大概可分为二类,一类是一些通用的工具类(不包含main入口方法),另一类是可直接运行的jar包(有main入口方法),下面主要讲的是后者,要让一个jar文件可直接 ...

  7. Qt——一些工具的使用

    一.使用Qt需要安装哪些软件 如果不使用VS,那么只需Qt组件就行了,安装完成后使用QtCreator进行编程. 如果使用VS,则需要安装下面几个: 1.Visual Studio 2.Qt组件 3. ...

  8. 在Linux中如何使用命令进行RS-232串口通信和数据包解析

    文章首发于浩瀚先森博客 1. 获取串口号 在Linux系统中一切皆为文件,所以串口端口号也不例外,都是以设备文件的形式出现.也就是说我们可以用访问文本文件的命令来访问它们. a. 一般串口都是以/de ...

  9. 20145208《信息安全系统设计基础》实验五 简单嵌入式WEB 服务器实验

    20145208<信息安全系统设计基础>实验五 简单嵌入式WEB 服务器实验 20145208<信息安全系统设计基础>实验五 简单嵌入式WEB 服务器实验

  10. js实现由分隔栏决定两侧div的大小—js动态分割div

    /*================index.html===================*/ <!DOCTYPE html><html lang="zh-cn&quo ...