Variation calling and annotation
This article is excerpted from "Resequencing 302 wild and cultivated accessions identifies genes related to domestication and improvement in soybean".
Mapping.
SAMtools (Version: 0.1.18) software was used to convert mapping results into the BAM format and to filter the unmapped and non-unique reads.
Duplicated reads were filtered with the Picard package (picard.sourceforge.net, Version:1.87).
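As a rough illustration of the three read filters above (unmapped, non-unique, duplicated), the sketch below applies them to SAM-style flag and MAPQ values. The flag bits 0x4 (unmapped) and 0x400 (duplicate) come from the SAM specification. The text does not say how non-unique reads were identified, so the MAPQ cutoff used here is purely an assumption, and the function name and the toy reads are hypothetical.

```python
UNMAPPED = 0x4      # SAM flag bit: read is unmapped
DUPLICATE = 0x400   # SAM flag bit: read is marked as a PCR or optical duplicate


def keep_read(flag: int, mapq: int, min_mapq: int = 1) -> bool:
    """Return True if an alignment passes all three filters."""
    if flag & UNMAPPED:
        return False
    if flag & DUPLICATE:
        return False
    # Assumption: a MAPQ below min_mapq is treated as a non-unique alignment.
    return mapq >= min_mapq


# Hypothetical alignments as (read name, SAM flag, MAPQ).
reads = [
    ("r1", 0, 60),      # mapped, unique, not a duplicate
    ("r2", 4, 0),       # unmapped
    ("r3", 1024, 60),   # marked as duplicate
    ("r4", 16, 0),      # mapped to the reverse strand, MAPQ 0
]
kept = [name for name, flag, mapq in reads if keep_read(flag, mapq)]
print(kept)
```

In the toy data only r1 passes all three filters.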
The BEDtools (Version: 2.17.0) coverageBed program was used to compute the coverage of sequence alignments. (A sequence was defined as absent if coverage was lower than 90% and present if coverage was greater than 90%.)
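The presence/absence rule can be written as a small threshold check. This is a minimal sketch, not the authors' script: the function name and inputs are hypothetical, and because the text leaves the case of exactly 90% coverage undefined, the sketch arbitrarily treats it as absent.

```python
def is_present(covered_bases: int, length: int, threshold_percent: int = 90) -> bool:
    """Return True if more than threshold_percent of the sequence is covered.

    Integer arithmetic is used to avoid floating-point rounding at the boundary.
    Assumption: coverage of exactly threshold_percent counts as absent.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= covered_bases <= length:
        raise ValueError("covered_bases must be between 0 and length")
    return covered_bases * 100 > length * threshold_percent


# Hypothetical sequences as (name, covered bases, total length).
for name, covered, length in [("geneA", 950, 1000), ("geneB", 900, 1000), ("geneC", 400, 1000)]:
    status = "present" if is_present(covered, length) else "absent"
    print(name, status)
```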
SNP calling.
SNP detection was performed using the Genome Analysis Toolkit (GATK, version 2.4-7-g5e89f01) and SAMtools. Only the SNPs detected by both methods were analyzed further.
The detailed processes were as follows:
(1) After BWA alignment, the reads around indels were realigned.
Realignment was performed with GATK in two steps.
The first step used the RealignerTargetCreator tool to identify regions where realignment was needed.
The second step used IndelRealigner to realign the regions found in the first step, producing a realigned BAM file for each accession.
(2) SNPs were called at the population level with GATK and SAMtools. For GATK, SNPs were required to have a confidence score greater than 30, with the parameter -stand_call_conf set to 30. The same realigned BAM files were used for SNP calling with the SAMtools mpileup command.
(3) In the filtering step, the sites identified by both GATK and SAMtools were selected with the GATK SelectVariants tool; SNPs with allele frequencies lower than 1% in the population were discarded.
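The logic of step (3) amounts to a set intersection followed by a frequency cutoff. The sketch below shows that logic in plain Python rather than through SelectVariants. The site tuples, frequencies, and function name are hypothetical, and the allele frequency is assumed to have been computed beforehand.

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[str, int, str, str]  # (chromosome, position, reference allele, alternate allele)


def consensus_snps(
    gatk: Set[Site],
    samtools: Set[Site],
    allele_freq: Dict[Site, float],
    min_freq: float = 0.01,
) -> List[Site]:
    """Keep SNPs reported by both callers whose allele frequency is at least min_freq."""
    shared = gatk & samtools  # sites detected by both methods
    # Sites with no recorded frequency are treated as frequency 0 and dropped.
    return sorted(site for site in shared if allele_freq.get(site, 0.0) >= min_freq)


# Hypothetical call sets and population allele frequencies.
gatk_calls = {("Chr01", 100, "A", "G"), ("Chr01", 250, "C", "T"), ("Chr02", 40, "G", "A")}
samtools_calls = {("Chr01", 100, "A", "G"), ("Chr02", 40, "G", "A"), ("Chr02", 90, "T", "C")}
freqs = {("Chr01", 100, "A", "G"): 0.12, ("Chr02", 40, "G", "A"): 0.005}

result = consensus_snps(gatk_calls, samtools_calls, freqs)
print(result)
```

Here the site on Chr02 is shared by both callers but falls below the 1% cutoff, so only the Chr01 site at position 100 survives.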
Indel calling.
Indel calling was similar to SNP calling but with the UnifiedGenotyper parameter -glm INDEL for the indel report only. Only insertions and deletions shorter than or equal to 6 bp were taken into account.
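The 6 bp size limit can be expressed as a simple length check. This sketch assumes VCF-style alleles in which the reference and alternate strings share a leading anchor base, so the indel size is the difference in string lengths. The function names are hypothetical.

```python
def indel_length(ref: str, alt: str) -> int:
    """Size of an indel in bp, assuming VCF-style alleles with a shared anchor base."""
    return abs(len(alt) - len(ref))


def keep_indel(ref: str, alt: str, max_len: int = 6) -> bool:
    """Return True for insertions or deletions of 1 to max_len bp."""
    size = indel_length(ref, alt)
    return 0 < size <= max_len


print(keep_indel("A", "ATTT"))       # 3 bp insertion
print(keep_indel("ACGTACGT", "A"))   # 7 bp deletion
```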
Annotation.
SNP annotation was performed according to the genome using the package ANNOVAR (Version: 2013-08-23).
Based on the genome annotation, SNPs were categorized in exonic regions (overlapping with a coding exon), splicing sites (within 2 bp of a splicing junction), 5′UTRs and 3′UTRs, intronic regions (overlapping with an intron), upstream and downstream regions (within a 1 kb region upstream or downstream from the transcription start site), and intergenic regions.
SNPs in coding exons were further grouped into synonymous SNPs (did not cause amino acid changes) or nonsynonymous SNPs (caused amino acid changes; mutations causing stop gain and stop loss were also classified into this group).
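The synonymous/nonsynonymous split for coding SNPs can be illustrated with the standard genetic code. This is a minimal sketch, not ANNOVAR's implementation: it assumes the affected codon and the position of the SNP within it are already known, that bases are upper-case DNA on the coding strand, and the function name is hypothetical. Following the grouping described above, stop-gain and stop-loss changes are counted as nonsynonymous.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_coding_snp(codon: str, offset: int, alt_base: str) -> str:
    """Classify a SNP at position offset (0, 1 or 2) of a reference codon."""
    if len(codon) != 3 or offset not in (0, 1, 2):
        raise ValueError("expected a 3-base codon and an offset of 0, 1 or 2")
    mutated = codon[:offset] + alt_base + codon[offset + 1:]
    # Stop gain and stop loss change the translated symbol, so they fall
    # into the nonsynonymous group, as in the text.
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "nonsynonymous"


print(classify_coding_snp("GAA", 2, "G"))  # GAA (Glu) -> GAG (Glu)
print(classify_coding_snp("GAA", 0, "T"))  # GAA (Glu) -> TAA (stop)
```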
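Indels in the exonic regions were classified by whether they caused frame-shift mutations: an indel whose length is not a multiple of 3 bp shifts the reading frame, whereas an insertion or deletion of 3 bp (or a multiple of 3 bp) does not.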
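A frame-shift arises when the number of inserted or deleted bases is not a multiple of 3, so the classification reduces to a modulo check on the indel length. The sketch below assumes VCF-style alleles with a shared anchor base; the function name and labels are hypothetical.

```python
def classify_exonic_indel(ref: str, alt: str) -> str:
    """Label an exonic indel by its effect on the reading frame."""
    size = abs(len(alt) - len(ref))
    if size == 0:
        raise ValueError("alleles have equal length, so this is not an indel")
    # A length that is a multiple of 3 adds or removes whole codons
    # and leaves the downstream reading frame intact.
    return "non-frameshift" if size % 3 == 0 else "frameshift"


print(classify_exonic_indel("A", "ATTT"))  # 3 bp insertion
print(classify_exonic_indel("ACG", "A"))   # 2 bp deletion
```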