6. RNA-Seq Analysis Pipeline
Created by Dhivya Arasappan, last modified by Dennis C Wylie on Nov 08, 2015
1. Quality Assessment
Data quality is assessed with FastQC, and the results are evaluated before any downstream analysis.
- Deliverables:
- reports generated by FastQC
- Tools used:
- FastQC: (Andrews 2010) used to generate quality summaries of data:
- Per base sequence quality report: useful for deciding whether trimming is necessary.
- Sequence duplication levels: evaluation of library complexity. Higher levels of sequence duplication may be expected for high coverage RNAseq data.
- Overrepresented sequences: evaluation of adapter contamination.
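To make the per base sequence quality report more concrete, the following is a minimal sketch of the underlying calculation, not FastQC's own code. It assumes 4-line FASTQ records with Phred+33 encoded qualities; the function names `parse_fastq` and `per_base_mean_quality` are hypothetical.

```python
from statistics import mean


def parse_fastq(text):
    """Yield (name, sequence, quality_string) from 4-line FASTQ records."""
    lines = text.strip().splitlines()
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]


def per_base_mean_quality(records, offset=33):
    """Mean Phred score at each read position (Phred+33 is an assumption)."""
    by_position = []
    for _name, _seq, qual in records:
        for pos, char in enumerate(qual):
            if pos == len(by_position):
                by_position.append([])
            by_position[pos].append(ord(char) - offset)
    return [mean(scores) for scores in by_position]


# Two made-up reads; the second one degrades toward its 3' end.
example = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nII5#\n"
means = per_base_mean_quality(parse_fastq(example))
# means == [40, 40, 30, 21]
```

A drop in mean quality toward the end of the reads, as in this toy input, is the kind of pattern that would suggest trimming.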
2. Fastq Preprocessing
The quality assessment is used to decide whether the raw data require preprocessing; if so, preprocessing is performed.
- Deliverables:
- Trimmed/filtered fastq files.
- Tools Used:
- Fastx-toolkit: Used to preprocess fastq files.
- Fastq quality trimmer: Trimming reads based on quality.
- Fastq quality filter: Filtering reads based on quality.
- Cutadapt: Used to remove adapter sequences from reads.
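The two quality-based steps above can be sketched in plain Python. This illustrates the general idea only (3' end trimming against a threshold, then filtering on the percentage of high-quality bases) and is not the Fastx-toolkit implementation. The function names, thresholds, and minimum length are hypothetical.

```python
def trim_3prime(seq, quals, threshold=20, min_length=3):
    """Drop low-quality bases from the 3' end; return None if too short."""
    end = len(quals)
    while end > 0 and quals[end - 1] < threshold:
        end -= 1
    if end < min_length:
        return None
    return seq[:end], quals[:end]


def passes_filter(quals, min_quality=20, min_percent=80):
    """True when at least min_percent of bases reach min_quality."""
    if not quals:
        return False
    good = sum(1 for q in quals if q >= min_quality)
    return 100 * good / len(quals) >= min_percent


# A made-up read whose last two bases fall below the threshold.
trimmed = trim_3prime("ACGTAC", [38, 37, 35, 30, 12, 8])
# trimmed == ("ACGT", [38, 37, 35, 30])
keep = passes_filter(trimmed[1])
# keep is True
```

Trimming keeps the usable part of a read, while filtering discards the whole read, so trimming is usually applied first.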
3. Mapping
Reads are mapped to the reference genome using BWA-mem or Tophat.
- Deliverables:
- Mapping results as BAM files, plus mapping statistics.
- Tools Used:
- BWA-mem: (Li 2013) primary aligner used to generate read alignments.
- Tophat: (Kim 2011) aligner used to generate read alignments in a splice-aware manner and identify novel junctions.
- Samtools: (Li 2009) used to generate mapping statistics.
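As an illustration of what a basic mapping statistic involves, the sketch below computes a mapping rate from SAM-format text records using the FLAG field. It is not how Samtools is implemented. The flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format, while the function name `mapping_summary` and the example records are hypothetical.

```python
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def mapping_summary(sam_lines):
    """Count primary records and how many of them are mapped."""
    total = mapped = 0
    for line in sam_lines:
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        flag = int(line.split("\t")[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # count each read once, via its primary record
        total += 1
        if not (flag & UNMAPPED):
            mapped += 1
    rate = mapped / total if total else 0.0
    return {"total": total, "mapped": mapped, "rate": rate}


# Made-up records: two mapped reads, one unmapped, one secondary alignment.
sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r1\t256\tchr2\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",
]
summary = mapping_summary(sam)
# summary["total"] == 3 and summary["mapped"] == 2
```

Secondary and supplementary records are skipped so that a multi-mapped or split read does not inflate the total.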
4. Gene/Transcript Counting
Reads mapping to annotated intervals are counted to estimate the abundance of genes/transcripts.
- Deliverables:
- Raw gene/transcript counts
- Tools Used:
- HTSeq-count: (Anders 2014) used to count reads overlapping gene intervals.
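The counting step can be sketched as a simple interval overlap. This is a simplified illustration under stated assumptions, not HTSeq-count itself: each gene is a single half-open interval, reads are ungapped, and strand is ignored. The function name `count_reads` and the coordinates are hypothetical; the `__ambiguous` and `__no_feature` labels follow the special counters that HTSeq-count reports.

```python
from collections import Counter


def count_reads(reads, genes):
    """Assign each read to the single gene it overlaps, if there is exactly one."""
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in reads:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and g_start < end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif hits:
            counts["__ambiguous"] += 1  # overlaps more than one gene
        else:
            counts["__no_feature"] += 1  # overlaps no annotated gene
    return counts


# Made-up annotation with two overlapping genes on chr1.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
reads = [
    ("chr1", 110, 160),  # geneA only
    ("chr1", 150, 190),  # overlaps geneA and geneB
    ("chr1", 250, 300),  # geneB only
    ("chr2", 100, 150),  # no annotated gene
]
counts = count_reads(reads, genes)
# counts["geneA"] == 1, counts["geneB"] == 1
```

Reads that cannot be assigned to exactly one gene are tallied separately rather than split between genes, which keeps the raw counts as integers.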
5. DEG Identification
Normalization and statistical testing to identify differentially expressed genes.
- Deliverables:
- DEG summary, a master file containing fold changes and p-values for every gene, and MA plots.
- Tools Used:
- DESeq2: (Love 2014) used to perform normalization and test for differential expression using the negative binomial distribution.
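To show what the normalization step does, here is a minimal plain-Python sketch of median-of-ratios size factors, the normalization approach associated with DESeq2. It is not DESeq2's code and does not reproduce the negative binomial testing. The function name `size_factors` and the count values are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """counts: dict mapping gene -> list of raw counts, one per sample."""
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for values in counts.values():
        if min(values) <= 0:
            continue  # a zero count gives no finite log geometric mean
        log_geo_mean = sum(math.log(v) for v in values) / n_samples
        for j, v in enumerate(values):
            log_ratios[j].append(math.log(v) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


# Made-up counts: sample 2 was sequenced about twice as deeply as sample 1.
raw = {"g1": [10, 20], "g2": [100, 200], "g3": [50, 100], "g4": [0, 7]}
factors = size_factors(raw)
normalized = {
    gene: [v / f for v, f in zip(values, factors)]
    for gene, values in raw.items()
}
# factors is approximately [0.707, 1.414]
```

After dividing by the size factors, genes g1 to g3 have equal normalized counts in both samples, so the difference in sequencing depth no longer looks like differential expression.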