###参考:https://www.biostars.org/p/163356/

used TopHat to map my reads against their relative reference genome.


When I look inside prep_reads.info, I see:

  • left_min_read_len=90
  • left_max_read_len=90
  • left_reads_in =24995053
  • left_reads_out=24994132
  • right_min_read_len=90
  • right_max_read_len=90
  • right_reads_in =24995053
  • right_reads_out=24994422

Then when I open align_summary.txt, I see:

Left reads:
               Input:  24995053
             Mapped:  22715900 (90.9% of input)
            of these:   2106892 ( 9.3%) have multiple alignments (89 have >20)
Right reads:
               Input:  24995053
              Mapped:  22310498 (89.3% of input)
            of these:   2088630 ( 9.4%) have multiple alignments (148 have >20)
90.1% overall read alignment rate.

Aligned pairs:  21074559
     of these:   1469415 ( 7.0%) have multiple alignments
          and:    107380 ( 0.5%) are discordant alignments
83.9% concordant pair alignment rate.


In align_summary.txt I know the changes between "Input" number and "Mapped" is because some of reads are unmapped to reference genome. ^Ok^.

But for prep_reads.info I do not know why "_reads_out" numbers are different from "_reads_in" numbers and If this difference is due to unmapped reads, why the difference is not equal to difference between the Input number and Mapped number in align_summary.txt?

<caption>Differences</caption>

  prep_reads.info align_summary.txt
left 24995053-24994132=921 24995053-22715900=2279153

right

24995053-24994422=631

24995053-22310498=2684555

The difference is due to filtering for things such as read length. Some reads are too short, so they're excluded. This occurs before any mapping takes place.

I seeeeeee. I did not know thaaat. I thought we can eliminate short reads only by trimmomatic (MINLEN). I did not know mapping tools also eliminate some reads.

 

Well, "things such as read length". It's filtering for other things too. In your case, one of these "other things" is what's causing additional reads to get dropped, since your input is all 90 bases

1、Question: prep_reads.info vs. align_summary.txt的更多相关文章

  1. 2、Tophat align_summary.txt and samtools flagstat accepted_hits.bam disagree

    ###https://www.biostars.org/p/195758/ Left reads: Input : 49801387 Mapped : 46258301 (92.9% of input ...

  2. 【Linux】【一】linux 目录切换、创建目录和文件、编辑目录以及文件(txt)

    以下 是在指定目录下创建文件夹目录,以及在该目录下创建txt文件进行编辑,保存. 然后删除相关文件以及目录的命令操作记录. 本操作记录中的命令简单解释: pwd 显示当前路径 ls 显示当前目录下的文 ...

  3. 爬虫-----爬取所有国家的首都、面积 ,并保存到txt文件中

    # -*- coding:utf-8 -*- import urllib2import lxml.htmlfrom lxml import etree def main(): file = open( ...

  4. 8、显示程序占用内存多少.txt

    方法一: 要加单元 PsAPI procedure TForm1.tmr1Timer(Sender: TObject); begin edt1.Text:= format('memory use: % ...

  5. 网站迁移服务器后CPU、内存飙升,设置robots.txt 问题

    User-agent: SemrushBotDisallow: /User-agent: SemrushBot-SADisallow: /User-agent: SemrushBot-BADisall ...

  6. 『动善时』JMeter基础 — 26、使用txt文件实现JMeter参数化

    目录 1.测试计划中的元件 2.数据文件内容 3.线程组元件内容 4.HTTP信息头管理器组件内容 5.CSV数据文件设置组件内容 6.HTTP请求组件内容 7.脚本运行结果 之前我们都是使用.csv ...

  7. jmeter分布式导致重复登录的问题、以及写txt、csv、统计行数

    经常收到微信好友的各种问题咨询,今天分享一个比较有代表性的,希望对大家有所帮助. 一位微信好友的提问 问题如下: 问题分析 先简单介绍下服务端的处理逻辑,关于登录,服务端的逻辑一般是:校验用户名.密码 ...

  8. mysql命令行的导入导出sql,txt,excel(都在linux或windows命令行操作)(转自筑梦悠然)

    原文链接https://blog.csdn.net/wuhuagu_wuhuaguo/article/details/73805962 Mysql导入导出sql,txt,excel 首先我们通过命令行 ...

  9. python基础之迭代器、装饰器、软件开发目录结构规范

    生成器 通过列表生成式,我们可以直接创建一个列表.但是,受到内存限制,列表容量肯定是有限的.而且,创建一个包含100万个元素的列表,不仅占用很大的存储空间,如果我们仅仅需要访问前面几个元素,那后面绝大 ...

随机推荐

  1. CentOS下查看文件和文件夹大小

    当磁盘大小超过标准时会有报警提示,这时如果掌握df和du命令是非常明智的选择. df可以查看一级文件夹大小.使用比例.档案系统及其挂入点,但对文件却无能为力. 当磁盘大小超过标准时会有报警提示,这时如 ...

  2. window.onload 添加多个函数绑定

    window.onload = function(){alert(2)} function addEvent (fun) { var old = window.onload; if(typeof ol ...

  3. javascript时间戳转换成指定格式的日期

    //时间戳转换成指定格式的日期DateTool.IntDatetimeTo = function(time, format){    var testDate = new Date(time);    ...

  4. POJ 1577 Falling Leaves(二叉搜索树)

    思路:当时学长讲了之后,似乎有点思路----------就是倒着建一个  二叉搜索树 代码1:超时 详见超时原因 #include<iostream> #include<cstrin ...

  5. Unity3D之Mesh(三)绘制四边形

    前言: 由於之前的基本介紹,所以有關的知識點不做贅述,只上案例,知識作爲自己做試驗的記錄,便於日後查看. 步驟: 1.創建一個empty 的gameobject: 2.添加一個脚本給這個game ob ...

  6. A N EAR -D UPLICATE D ETECTION A LGORITHM T O F ACILITATE D OCUMENT C LUSTERING——有时间看看里面的相关研究

    摘自:http://aircconline.com/ijdkp/V4N6/4614ijdkp04.pdf In the syntactical approach we define binary at ...

  7. Java_util_02_Java判断字符串是中文还是英文

    做微信开发,使用百度翻译API时,需要指定译文的语种.这就需要我们判断待翻译内容是中文还是英文,若是中文,则翻译成英文,若是英文则翻译成中文. 方法一:字符与字节的长度 依据:一个中文占两个字节,一个 ...

  8. D唐纳德和他的数学老师(华师网络赛)(二分匹配,最大流)

    Time limit per test: 1.0 seconds Memory limit: 256 megabytes 唐纳德是一个数学天才.有一天,他的数学老师决定为难一下他.他跟唐纳德说:「现在 ...

  9. django autocommit的一个坑,读操作的事务占用导致锁表

    版权归作者所有,任何形式转载请联系作者.作者:petanne(来自豆瓣)来源:https://www.douban.com/note/580618150/ 缘由:有一个django守护进程Daemon ...

  10. Ubuntu下locale文件

    March 7, 2015 11:44 PM locale文件 关于locale文件的设定 locale 是国际化与本土化过程中的一个非常重要的概念,个人认为,对于中文用户来说,通常会涉及到的国际化或 ...