Use a regular expression for filtering sequences by id from a FASTA file, e.g. just certain chromosomes from a genome. There are other tools as part of bigger packages to install (and no regex support), mostly awk-based awkward (sorry for the pun) bash solutions, and scripts using packages that one needs to install and with still no support for regular expressions. This however is a simple, straightforward little python script for a simple task. It doesn’t do anything else and doesn’t need anything but a stock python installation. Based on the FASTA reader snippet.

Download here.

Usage:

python FASTAfilter.py [-h] regex infile outfile

From a FASTA-file with multiple >entries, filter by sequence ids using a
regex.

positional arguments:
regex Regex to filter entry ids, e.g. ‘chr[1-4]’. Note that the id does not contain the initial > character.
infile A FASTA input file, usually with multiple entries.
outfile The new file with only the matching entries.

optional arguments:
-h, –help show this help message and exit

INSTALL:

cd /data/software
wget http://dm516.user.srcf.net/fastafilter/FASTAfilter.zip
unzip FASTAfilter.zip
easy_install argparse

USAGE:

python FASTAfilter.py   [1-9,10,11,12,13,14,15,16,17,18,X]  \
/dat2/INPUT.fa \
/dat2/OUTPUT.fa

Error:

Traceback (most recent call last):
  File "FASTAfilter.py", line 3, in <module>
    import argparse
ImportError: No module named argparse

Solution:

run "easy_install argparse" as root user.

http://dm516.user.srcf.net/?p=314

Filter FASTA files的更多相关文章

  1. Extract Fasta Sequences Sub Sets by position

    cut -d " " -f 1 sequences.fa | tr -s "\n" "\t"| sed -s 's/>/\n/g' & ...

  2. elfinder中通过DirectoryStream.Filter实现筛选隐藏目录(二)

    今天还是没事看了看elfinder源码,发现之前说的两个版本实现都是基于不同的jdkelfinder源码浏览-Volume文件系统操作类(1), 带前端页面的是基于1.6中File实现,另一个是基于1 ...

  3. OpenFileDialog.Filter 属性

    如果 Filter 属性为 Empty,将显示所有文件. 始终显示文件夹. Filter 由以下部分组成:筛选器说明,后跟竖线 (|) 和筛选模式. 筛选器可以指定一个或多个文件类型. 说明描述了对话 ...

  4. python 高阶函数之filter

    前文说到python高阶函数之map,相信大家对python中的高阶函数有所了解,此次继续分享python中的另一个高阶函数filter. 先看一下filter() 函数签名 >>> ...

  5. Falcon Genome Assembly Tool Kit Manual

    Falcon Falcon: a set of tools for fast aligning long reads for consensus and assembly The Falcon too ...

  6. Linux command line exercises for NGS data processing

    by Umer Zeeshan Ijaz The purpose of this tutorial is to introduce students to the frequently used to ...

  7. 构建NCBI本地BLAST数据库 (NR NT等) | blastx/diamond使用方法 | blast构建索引 | makeblastdb

    参考链接: FTP README 如何下载 NCBI NR NT数据库? 下载blast:ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+ 先了解 ...

  8. STAR manual

    来源:STARmanual.pdf 来源:Calling variants in RNAseq PART0 准备工作 #STAR 安装前的依赖的工具 #Red Hat, CentOS, Fedora. ...

  9. &lt;二代測序&gt; 下载 NCBI sra 文件

    本文近期更新地址: http://blog.csdn.net/tanzuozhev/article/details/51077222 随着測序技术的不断提高.二代測序数据成指数增长. NCBI提供了S ...

随机推荐

  1. Android无线测试之—UiAutomator UiObject API介绍四

    输入文本与清除文本 一.输入文本与清除文本相关API 返回值 API 描述 boolean setText(String test) 在对象中输入文本 void clearTextField() 清除 ...

  2. 《转》Win7 IIS7.5 安装

    一.进入Win7的 控制面板,选择左侧的 打开或关闭Windows功能. 二.现在出现了安装Windows功能的选项菜单,注意选择的项目,我们需要手动选择需要的功能,下面这张图片把需要安装的服务都已经 ...

  3. Sublime text 3 快捷键:

    Ctrl+Shift+[  选中代码,按下快捷键,折叠代码. Ctrl+Shift+]  选中代码,按下快捷键,展开代码.  Ctrl+Shift+P:打开命令面板 Ctrl+P:打开搜索框,搜索项目 ...

  4. 用 Stellar.js 制作视差滚动效果

    参考 http://doc.bropaul.com/Stellar.js/docs/ https://github.com/markdalgleish/stellar.js#download http ...

  5. MD5-【验签】

    MD5是什么? MD5是message-digest algorithm 5(信息-摘要算法)的缩写,被广泛用于加密和解密技术上,它可以说是文件的"数字指纹".任何一个文件,无论是 ...

  6. ubuntu搭建java web环境

    java web环境即jdk+tomcat+mysql jdk:http://www.oracle.com/technetwork/java/javase/downloads/index.html t ...

  7. Kubernetes初探:原理及实践应用

    总体概览 如下图所示是我初步阅读文档和源代码之后整理的总体概览,基本上可以从如下三个维度来认识Kubernetes. 操作对象 Kubernetes以RESTFul形式开放接口,用户可操作的REST对 ...

  8. Grammar Rules

    Grammar Rules Here are 20 simple rules and tips to help you avoid mistakes in English grammar. For m ...

  9. protoc-gen-go: error:bad Go source code was generated: 163:6: illegal UTF-8 encoding (and 2915 more errors)

    protoc-gen-go: error:bad Go source code was generated: 163:6: illegal UTF-8 encoding (and 2915 more ...

  10. 1 duilib 自绘标题 最大化图标显示bug ----WindowImplBase的bug

    窗口最大化之后有两个问题,     1.最大化按钮的样式还是没变,正确的样式应该是这样的     2.再次点击最大化按钮,不能还原到正常大小.     这个是WindowImplBase的bug,已经 ...