Filter FASTA files
Use a regular expression to filter sequences by id from a FASTA file, e.g. to keep only certain chromosomes from a genome. There are other tools for this: some ship as part of bigger packages that must be installed (and offer no regex support), most are awk-based bash solutions that are, pardon the pun, awkward, and some are scripts that depend on extra packages yet still lack regex support. This, by contrast, is a small, straightforward Python script for one simple task. It does nothing else and needs nothing but a stock Python installation. It is based on the FASTA reader snippet.
Usage:
python FASTAfilter.py [-h] regex infile outfile

From a FASTA-file with multiple >entries, filter by sequence ids using a
regex.

positional arguments:
  regex       Regex to filter entry ids, e.g. 'chr[1-4]'. Note that the id
              does not contain the initial > character.
  infile      A FASTA input file, usually with multiple entries.
  outfile     The new file with only the matching entries.

optional arguments:
  -h, --help  show this help message and exit
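
To make the behaviour described in the help text concrete, here is a minimal Python sketch of regex-based filtering. It is not the code of FASTAfilter.py: the function names `read_fasta` and `filter_fasta` are hypothetical, and it assumes the pattern is applied with `re.search` to the whole header line (without the leading `>`). The real script may match differently, for example only from the start of the id.

```python
import re


def read_fasta(handle):
    """Yield (header, sequence_lines) pairs from an open FASTA file.

    The header is returned without the leading '>' character.
    """
    header, lines = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, lines
            header, lines = line[1:], []
        elif header is not None and line:
            lines.append(line)
    if header is not None:
        yield header, lines


def filter_fasta(pattern, infile, outfile):
    """Copy entries whose header matches pattern; return how many were kept."""
    regex = re.compile(pattern)
    kept = 0
    with open(infile) as src, open(outfile, "w") as dst:
        for header, lines in read_fasta(src):
            # Assumption: search anywhere in the header line.
            if regex.search(header):
                dst.write(">" + header + "\n")
                for seq_line in lines:
                    dst.write(seq_line + "\n")
                kept += 1
    return kept
```

Called as `filter_fasta("chr[1-4]", "in.fa", "out.fa")`, this would write only the entries whose header contains chr1 to chr4. The file names here are placeholders.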
INSTALL:
cd /data/software
wget http://dm516.user.srcf.net/fastafilter/FASTAfilter.zip
unzip FASTAfilter.zip
easy_install argparse   # only needed for Python older than 2.7
USAGE:
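# Chromosomes 1-18 and X, assuming the ids are bare chromosome names.
# Quote the pattern so the shell does not try to expand it.
python FASTAfilter.py '^([1-9]|1[0-8]|X)$' \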
/dat2/INPUT.fa \
/dat2/OUTPUT.fa
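
It can be worth testing a pattern on a handful of ids before running it over a whole genome. The sketch below compares two ways of writing "chromosomes 1 to 18 and X". A bracket expression such as `[1-9,10,11,12,13,14,15,16,17,18,X]` is a character class: it matches one single character out of the digits, the comma and X, so it also accepts ids like 19 or 22. An anchored alternation selects exactly the intended names. The id list is made up for illustration and assumes bare chromosome names as ids, and `re.search` is assumed as the matching function.

```python
import re

# Hypothetical ids: chromosomes 1-22 plus X, Y and MT.
ids = [str(n) for n in range(1, 23)] + ["X", "Y", "MT"]

# A character class matches ONE character from the set {0-9 , X}.
char_class = re.compile(r"[1-9,10,11,12,13,14,15,16,17,18,X]")

# Anchored alternation: exactly 1-9, 10-18, or X.
alternation = re.compile(r"^([1-9]|1[0-8]|X)$")


def selected(regex, names):
    """Return the names in which the regex finds a match."""
    return [name for name in names if regex.search(name)]


print("character class:", selected(char_class, ids))
print("alternation:    ", selected(alternation, ids))
```

If the ids in a given file carry a prefix such as chr or a trailing description, the anchors would need to be adjusted accordingly.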
Error:
Traceback (most recent call last):
  File "FASTAfilter.py", line 3, in <module>
    import argparse
ImportError: No module named argparse
Solution:
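Run "easy_install argparse" as the root user. This error only occurs on old interpreters: argparse has been part of the Python standard library since Python 2.7 and 3.2, so running the script with a newer Python avoids it without any extra install.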
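
Whether the extra install is needed at all can be checked from Python itself. The following is a small sketch with a hypothetical helper name, `argparse_available`, that reports the interpreter version and whether argparse can be imported.

```python
import sys


def argparse_available():
    """Return True if the argparse module can be imported."""
    try:
        import argparse  # noqa: F401  (imported only to test availability)
    except ImportError:
        return False
    return True


print("Python %d.%d, argparse available: %s"
      % (sys.version_info[0], sys.version_info[1], argparse_available()))
```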
http://dm516.user.srcf.net/?p=314