References: https://f1000research.com/articles/4-1521/v1

https://www.biostars.org/p/171766/

http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/

It used to be that when you did RNA-seq, you reported your results in RPKM (Reads Per Kilobase of transcript per Million mapped reads) or FPKM (Fragments Per Kilobase of transcript per Million mapped reads). However, TPM (Transcripts Per Million) is now becoming quite popular.

============================fpkm====================================

rate = geneA_count / geneA_length

fpkm = rate / (sum(gene*_count) /10^6)

That is: fpkm = 10^6 * (geneA_count / geneA_length)  /  sum(gene*_count)   ## sum(gene*_count): total raw (un-normalized) count over all genes; geneA_length is in kilobases

============================TPM====================================

rate = geneA_count / geneA_length

tpm = rate / (sum(rate) /10^6)

That is: tpm = 10^6 * (geneA_count / geneA_length)  /  sum(rate)   ## sum(rate): sum of count / length over all genes

====================================================================
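
The two definitions can be put side by side to show how they relate. The symbols are introduced here only for illustration: r_i is the rate of gene i (count divided by length in kilobases) and N is the total raw count over all genes. Under those assumptions, a short derivation suggests that TPM is just FPKM rescaled so that each sample sums to one million:

```latex
\mathrm{FPKM}_i = \frac{10^{6}\, r_i}{N}, \qquad
\mathrm{TPM}_i = \frac{10^{6}\, r_i}{\sum_j r_j}
\quad\Longrightarrow\quad
\mathrm{TPM}_i = 10^{6} \cdot \frac{\mathrm{FPKM}_i}{\sum_j \mathrm{FPKM}_j}
```

The last step follows because the factor 10^6 / N is shared by every gene in a sample and cancels in the ratio.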

These three metrics attempt to normalize for sequencing depth and gene length. Here’s how you do it for RPKM:

  1. Count up the total reads in a sample and divide that number by 1,000,000 – this is our “per million” scaling factor.
  2. Divide the read counts by the “per million” scaling factor. This normalizes for sequencing depth, giving you reads per million (RPM)
  3. Divide the RPM values by the length of the gene, in kilobases. This gives you RPKM.
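
The three RPKM steps above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a real quantification tool; the function name `rpkm`, the gene names, and the counts and lengths are all made-up assumptions chosen so the arithmetic is easy to follow.

```python
def rpkm(counts, lengths_kb):
    """Compute RPKM per gene from raw read counts and gene lengths in kilobases."""
    # Step 1: total reads in the sample divided by 1,000,000 ("per million" factor).
    per_million = sum(counts.values()) / 1_000_000
    # Step 2: divide each read count by the scaling factor to get RPM.
    rpm = {gene: count / per_million for gene, count in counts.items()}
    # Step 3: divide each RPM value by the gene length in kilobases to get RPKM.
    return {gene: rpm[gene] / lengths_kb[gene] for gene in rpm}


# Hypothetical toy sample: three genes, one million reads in total.
counts = {"geneA": 500_000, "geneB": 300_000, "geneC": 200_000}
lengths_kb = {"geneA": 2.0, "geneB": 3.0, "geneC": 1.0}

rpkm_values = rpkm(counts, lengths_kb)
for gene, value in rpkm_values.items():
    print(gene, value)
```

With one million total reads the scaling factor is exactly 1, so in this toy case RPM equals the raw count and RPKM is simply the count divided by the length in kilobases.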

FPKM is very similar to RPKM. RPKM was made for single-end RNA-seq, where every read corresponded to a single fragment that was sequenced. FPKM was made for paired-end RNA-seq. With paired-end RNA-seq, two reads can correspond to a single fragment, or, if one read in the pair did not map, one read can correspond to a single fragment. The only difference between RPKM and FPKM is that FPKM takes into account that two reads can map to one fragment (and so it doesn’t count this fragment twice).

TPM is very similar to RPKM and FPKM. The only difference is the order of operations. Here’s how you calculate TPM:

  1. Divide the read counts by the length of each gene in kilobases. This gives you reads per kilobase (RPK).
  2. Count up all the RPK values in a sample and divide this number by 1,000,000. This is your “per million” scaling factor.
  3. Divide the RPK values by the “per million” scaling factor. This gives you TPM.
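
The TPM steps above translate almost line for line into Python. As before, this is only a sketch: the function name `tpm` and the toy counts and lengths are hypothetical values picked for illustration.

```python
def tpm(counts, lengths_kb):
    """Compute TPM per gene from raw read counts and gene lengths in kilobases."""
    # Step 1: divide each read count by the gene length in kilobases to get RPK.
    rpk = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    # Step 2: sum all RPK values and divide by 1,000,000 ("per million" factor).
    per_million = sum(rpk.values()) / 1_000_000
    # Step 3: divide each RPK value by the scaling factor to get TPM.
    return {gene: value / per_million for gene, value in rpk.items()}


# Hypothetical toy sample: three genes with made-up counts and lengths.
counts = {"geneA": 500_000, "geneB": 300_000, "geneC": 200_000}
lengths_kb = {"geneA": 2.0, "geneB": 3.0, "geneC": 1.0}

tpm_values = tpm(counts, lengths_kb)
for gene, value in tpm_values.items():
    print(gene, round(value, 2))
print("sum of TPM:", round(sum(tpm_values.values()), 2))
```

Because the scaling factor is built from the length-normalized values themselves, the TPM values of a sample sum to one million (up to floating-point rounding) whatever the input counts are.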

So you see, when calculating TPM, the only difference is that you normalize for gene length first, and then normalize for sequencing depth second. However, the effects of this difference are quite profound.

When you use TPM, the sum of all TPMs in each sample is the same. This makes it easier to compare the proportion of reads that mapped to a gene in each sample. In contrast, with RPKM and FPKM, the sum of the normalized reads in each sample may be different, and this makes it harder to compare samples directly.

Here’s an example. If the TPM for gene A in Sample 1 is 3.33 and the TPM in Sample 2 is 3.33, then I know that the exact same proportion of total (length-normalized) reads mapped to gene A in both samples. This is because the sum of the TPMs in each sample always adds up to the same number, so the denominator required to calculate the proportions is the same regardless of which sample you are looking at.

With RPKM or FPKM, the sum of normalized reads in each sample can be different. Thus, if the RPKM for gene A in Sample 1 is 3.33 and the RPKM in Sample 2 is 3.33, I would not know if the same proportion of reads in Sample 1 mapped to gene A as in Sample 2. This is because the denominator required to calculate the proportion could be different for the two samples.
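
A small numeric check may make the difference concrete. The sketch below uses two invented samples that share the same gene lengths but differ in read composition; the sample names, counts, lengths, and helper functions `rpkm` and `tpm` are all assumptions for illustration. It prints the per-sample sums of both metrics.

```python
def rpkm(counts, lengths_kb):
    """RPKM: normalize for sequencing depth first, then for gene length."""
    per_million = sum(counts.values()) / 1_000_000
    return {g: counts[g] / per_million / lengths_kb[g] for g in counts}


def tpm(counts, lengths_kb):
    """TPM: normalize for gene length first, then for the summed rates."""
    rpk = {g: counts[g] / lengths_kb[g] for g in counts}
    per_million = sum(rpk.values()) / 1_000_000
    return {g: v / per_million for g, v in rpk.items()}


# Hypothetical data: the same four genes measured in two samples.
lengths_kb = {"geneA": 2.0, "geneB": 4.0, "geneC": 1.0, "geneD": 10.0}
sample1 = {"geneA": 10, "geneB": 20, "geneC": 5, "geneD": 0}
sample2 = {"geneA": 10, "geneB": 20, "geneC": 5, "geneD": 50}

rpkm_sums = []
tpm_sums = []
for name, counts in (("Sample 1", sample1), ("Sample 2", sample2)):
    rpkm_sums.append(sum(rpkm(counts, lengths_kb).values()))
    tpm_sums.append(sum(tpm(counts, lengths_kb).values()))
    print(f"{name}: sum(RPKM) = {rpkm_sums[-1]:.1f}, sum(TPM) = {tpm_sums[-1]:.1f}")
```

In this toy data the TPM sums agree across samples while the RPKM sums do not, which is why an RPKM value alone does not tell you what share of a sample a gene represents.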
