HTSeq:一个用于处理高通量数据(High-throughout sequencing)的python包。
HTSeq包有很多功能类,熟悉python脚本的可以自行编写数据处理脚本。
另外,HTSeq也提供了两个脚本文件能够直接处理数据:htseq-qa(检测数据质量)和htseq-count(reads计数)。

用法:htseq-count [options] <alignment_file> <gff_file>

<alignment_file> :

contains the aligned reads in the SAM format.

Make sure to use a splicing-aware aligner such as TopHat.

To read from standard input, use - as <alignment_file>.

{options}

-f <format>--format=<format>

Format of the input data. Possible values are sam (for text SAM files) and bam (for binary BAM files). Default is sam.

-r <order>--order=<order>

For paired-end data, the alignment have to be sorted either by read name or by alignment position. If your data is not sorted, use the samtools sort function of samtools to sort it. Use this option, with name or pos for <order> to indicate how the input data has been sorted. The default is name.

If name is indicated, htseq-count expects all the alignments for the reads of a given read pair to appear in adjacent records in the input data. For pos, this is not expected; rather, read alignments whose mate alignment have not yet been seen are kept in a buffer in memory until the mate is found. While, strictly speaking, the latter will also work with unsorted data, sorting ensures that most alignment mates appear close to each other in the data and hence the buffer is much less likely to overflow.

-s <yes/no/reverse>--stranded=<yes/no/reverse>

whether the data is from a strand-specific assay (default: yes)

For stranded=no, a read is considered overlapping with a feature regardless of whether it is mapped to the same or the opposite strand as the feature. For stranded=yes and single-end reads, the read has to be mapped to the same strand as the feature. For paired-end reads, the first read has to be on the same strand and the second read on the opposite strand. For stranded=reverse, these rules are reversed.

If your RNA-Seq data has not been made with a strand-specific protocol, this causes half of the reads to be lost. Hence, make sure to set the option --stranded=no unless you have strand-specific data!

-a <minaqual>--a=<minaqual>

skip all reads with alignment quality lower than the given minimum value (default: 10 — Note: the default used to be 0 until version 0.5.4.)

-t <feature type>--type=<feature type>

feature type (3rd column in GFF file) to be used, all features of other type are ignored (default, suitable for RNA-Seq analysis using an Ensembl GTF file: exon)

-i <id attribute>--idattr=<id attribute>

GFF attribute to be used as feature ID. Several GFF lines with the same feature ID will be considered as parts of the same feature. The feature ID is used to identity the counts in the output table. The default, suitable for RNA-Seq analysis using an Ensembl GTF file, is gene_id.

-m <mode>--mode=<mode>

Mode to handle reads overlapping more than one feature. Possible values for <mode> are unionintersection-strict and intersection-nonempty(default: union)

-o <samout>--samout=<samout>

write out all SAM alignment records into an output SAM file called <samout>, annotating each line with its assignment to a feature or a special counter (as an optional field with tag ‘XF’)

-q--quiet

suppress progress report and warnings

HTseq-count的更多相关文章

  1. nodejs api 中文文档

    文档首页 英文版文档 本作品采用知识共享署名-非商业性使用 3.0 未本地化版本许可协议进行许可. Node.js v0.10.18 手册 & 文档 索引 | 在单一页面中浏览 | JSON格 ...

  2. C#中Length和Count的区别(个人观点)

    这篇文章将会很短...短到比你的JJ还短,当然开玩笑了.网上有说过Length和count的区别,都是很含糊的,我没有发现有 文章说得比较透彻的,所以,虽然这篇文章很短,我还是希望能留在首页,听听大家 ...

  3. [PHP源码阅读]count函数

    在PHP编程中,在遍历数组的时候经常需要先计算数组的长度作为循环结束的判断条件,而在PHP里面对数组的操作是很频繁的,因此count也算是一个常用函数,下面研究一下count函数的具体实现. 我在gi ...

  4. EntityFramework.Extended 实现 update count+=1

    在使用 EF 的时候,EntityFramework.Extended 的作用:使IQueryable<T>转换为update table set ...,这样使我们在修改实体对象的时候, ...

  5. 学习笔记 MYSQL报错注入(count()、rand()、group by)

    首先看下常见的攻击载荷,如下: select count(*),(floor(rand(0)*2))x from table group by x; 然后对于攻击载荷进行解释, floor(rand( ...

  6. count(*) 与count (字段名)的区别

    count(*) 查出来的是:结果集的总条数 count(字段名) 查出来的是: 结果集中'字段名'不为空的记录的总条数

  7. BZOJ 2588: Spoj 10628. Count on a tree [树上主席树]

    2588: Spoj 10628. Count on a tree Time Limit: 12 Sec  Memory Limit: 128 MBSubmit: 5217  Solved: 1233 ...

  8. [LeetCode] Count Numbers with Unique Digits 计算各位不相同的数字个数

    Given a non-negative integer n, count all numbers with unique digits, x, where 0 ≤ x < 10n. Examp ...

  9. [LeetCode] Count of Range Sum 区间和计数

    Given an integer array nums, return the number of range sums that lie in [lower, upper] inclusive.Ra ...

  10. [LeetCode] Count of Smaller Numbers After Self 计算后面较小数字的个数

    You are given an integer array nums and you have to return a new counts array. The counts array has ...

随机推荐

  1. php 批量插入字段

    foreach ($_POST as $key => $value){ $array[] = "add ".$key." varchar(220),"; ...

  2. PHP实现自己活了多少岁

    1.mktime()函数的功能 2.代码: $birth = mktime(0,0,0,10,2,1992);//出生的时间戳 $time = time();//当前的时间戳 $age = floor ...

  3. mongo 过滤查询条件后分组、排序

    描述:最近业主有这么一个需求,根据集合中 时间段进行过滤,过滤的时间时间段为日期类型字符串,需要根据某一日期进行截取后.进行分组,排序 概述题目:根据createTime时间段做查询,然后以 天进行分 ...

  4. PHPStorm2017去掉参数提示 parameter name hints

    JetBrains 的各种语言的 IDE 都灰常灰常好用, 个个都是神器, PHPStorm 作为PHP开发的神器也不必多说了 今天升级到 PHPStorm 2017.1 发现增加了好些新功能, 有个 ...

  5. Minimal Steiner Tree ACM

    上图论课的时候无意之间看到了这个,然后花了几天的时间学习了下,接下来做一个总结. 一般斯坦纳树问题是指(来自百度百科): 斯坦纳树问题是组合优化问题,与最小生成树相似,是最短网络的一种.最小生成树是在 ...

  6. codevs1044 拦截导弹==洛谷 P1020 导弹拦截

    P1020 导弹拦截 题目描述 某国为了防御敌国的导弹袭击,发展出一种导弹拦截系统.但是这种导弹拦截系统有一个缺陷:虽然它的第一发炮弹能够到达任意的高度,但是以后每一发炮弹都不能高于前一发的高度.某天 ...

  7. java线程:Atomic(原子)

    .何谓Atomic? Atomic一词跟原子有点关系,后者曾被人认为是最小物质的单位.计算机中的Atomic是指不能分割成若干部分的意思.如果一段代码被认为是Atomic,则表示这段代码在执行过程中, ...

  8. Google 翻译如何获取 tk 参数值?

    1.首先获取 TKK 参数,这个参数可以在 https://translate.google.com 网页获取, src:TKK=eval('((function(){var a\x3d2089517 ...

  9. Nuxt使用scss

    Nuxt中使用scss也很简单,分简单的几步就OK 一.安装scss依赖 用IDE打开项目,在Terminal里通过 npm i node-sass sass-loader scss-loader - ...

  10. MySQL中InnoDB脏页刷新机制Checkpoint

    我们知道InnoDB采用Write Ahead Log策略来防止宕机数据丢失,即事务提交时,先写重做日志,再修改内存数据页,这样就产生了脏页.既然有重做日志保证数据持久性,查询时也可以直接从缓冲池页中 ...