R, Bioconductor

filterVcf: Extract Variants of Interest from a Large VCF File (Paul Shannon)

We demonstrate three methods:  filtering by genomic region,  filtering on attributes of
each specific variant call, and intersecting with known regions of interest (exons, splice
sites, regulatory regions, etc.).

http://www.bioconductor.org/packages/release/bioc/vignettes/VariantAnnotation/inst/doc/filterVcf.pdf

Java

SelectVariants -- Select a subset of variants from a larger callset ( GATK SelectVariants )

Often, a VCF containing many samples and/or variants will need to be subset in order to facilitate certain analyses (e.g. comparing and contrasting cases vs. controls; extracting variant or non-variant loci that meet certain requirements, displaying just a few samples in a browser like IGV, etc.). SelectVariants can be used for this purpose.

https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_variantutils_SelectVariants.php

Biostars

Question: How To Split Multiple Samples In Vcf File Generated By Gatk?
I did variant calling using BWA + PiCard + GATK and have just got the filtered VCF files from GATK. In the process of running GATK, I used list of inputs (11 samples) and for most steps, I had only one output file for each step. Now, I got two VCF files (one for SNPs and the other is for indels), each of which contains 11 samples. I can see the names of the 11 samples in the header of vcf files, and each sample seems to have one column of data. So I am wondering how to split each VCF files into individual sample vcf files?

https://www.biostars.org/p/78929/

bcftools

for file in *.vcf*; do
for sample in `bcftools view -h $file | grep "^#CHROM" | cut -f10-`; do
bcftools view -c1 -Oz -s $sample -o ${file/.vcf*/.$sample.vcf.gz} $file
done
done

https://www.biostars.org/p/12535/#115691

vcf-subset

vcf-subset -c S1 bigfile.vcf > S1.vcf

https://www.biostars.org/p/78929/

http://campagnelab.org/software/goby/reference-documentation/modes/vcf-subset/

REF:

http://samtools.github.io/hts-specs/VCFv4.2.pdf

Extracting info from VCF files的更多相关文章

  1. 将vcf文件转化为plink格式并且保持phasing状态

    VCFtools can convert VCF files into formats convenient for use in other programs. One such example i ...

  2. 【Bcftools】合并不同sample的vcf文件,通过bcftools

    通过GATK calling出来的SNP如果使用UnifiedGenotype获得的SNP文件是分sample的,但是如果使用vcftools或者ANGSD则需要Vcf文件是multi-sample的 ...

  3. iCloud无法导入vCard问题。fix the error when import vcard/vcf to icloud.

    问题描述:当登录icloud.com,进入通讯录的时候,导入VCF格式的联系人的时候会报错.如图: 1.从outlook的联系人中选一个联系人,导出联系人卡片-vCard文件 (如果是塞班手机,可以用 ...

  4. Linux command line exercises for NGS data processing

    by Umer Zeeshan Ijaz The purpose of this tutorial is to introduce students to the frequently used to ...

  5. Awesome C/C++

    Awesome C/C++ A curated list of awesome C/C++ frameworks, libraries, resources, and shiny things. In ...

  6. autodock 结果pdb的生成

    Is there a way to save a protein-ligand complex as a PDB file in AutoDock? I have completed my docki ...

  7. Gumshoe - Microsoft Code Coverage Test Toolset

    Gumshoe - Microsoft Code Coverage Test Toolset 2014-07-17 What is Gumshoe? How to instrument a binar ...

  8. C/C++ 框架,类库,资源集合

    很棒的 C/C++ 框架,类库,资源集合. Awesome C/C++ Standard Libraries Frameworks Artificial Intelligence Asynchrono ...

  9. awesome cpp

    https://github.com/fffaraz/awesome-cpp Awesome C/C++ A curated list of awesome C/C++ frameworks, lib ...

随机推荐

  1. docker 中安装 FastDFS 总结

    如题,参考各资料后,安装FastDFS总结.基于已有docker镜像 https://hub.docker.com/r/luhuiguo/fastdfs/ docker pull luhuiguo/f ...

  2. poj2046

    Gap Time Limit: 5000MS   Memory Limit: 65536K Total Submissions: 1829   Accepted: 829 Description Le ...

  3. shopxx----权限添加

    shiro权限控制 一.角色管理请求地址 1.模板地址 /shopxx/WebContent/WEB-INF/template/admin/role/list.ftl shiro 权限拦截 配置安全管 ...

  4. [hihoCoder] 拓扑排序·一

    The hints of the problem has given detailed information to implement the topological sorting algorit ...

  5. 隐藏内容但仍保持占位的css写法

    通常显示和隐藏内容都会用display:block;和display:none; 如果想要保持内容的占位可以用visbility:visible; 和visiblity:hidden;来控制内容的显示 ...

  6. 一起做RGB-D SLAM调试

    最近在学习高博的一起做RGB-D SLAM第一版本,其中调试出现了挺多问题,百度查找许多资料, 最后调通所有程序,记录以下运行环境. 高博一起做RGB-D SLAM系列主页: http://www.c ...

  7. 动态长度中英字符串显示至固定高度td

    w 为td中英字符串区域设置为display:block; height=td_height,并指明td width. <!doctype html> <html lang=&quo ...

  8. 第12章—整合Redis

    spring boot 系列学习记录:http://www.cnblogs.com/jinxiaohang/p/8111057.html 码云源码地址:https://gitee.com/jinxia ...

  9. IO 流中编码和解码问题

    编码表 ASCII : American Standard Code for Information Interchange 使用一个字节的 7 位可以表示 ISO8859-1 : 拉丁码表. 欧洲码 ...

  10. Linux基础命令(一)

    一.开启Linux操作系统,要求以root用户登录GNOME图形界面,语言支持选择为汉语没有图形界面 二.使用快捷键切换到虚拟终端2,使用普通用户身份登录,查看系统提示符 Ctrl + Alt + F ...