Downloading and installing the SRA Toolkit

step1: 下载并安装SRAtoolkit    (Download the Toolkit from the SRA website)

  1. If you are using a web browser, the following page contains download links to the most current version of the toolkit for each of the supported platforms: SRA Toolkit download page: https://www.ncbi.nlm.nih.gov/Traces/sra/?view=software
  2. If you are instead working from a command line interface, you may use FTP or wget to obtain the software from the following directory: "ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current". Example:
    wget "ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/current/sratoolkit.current-centos_linux64.tar.gz"

step2 :解压SRA toolkit   (Unpack the Toolkit:)

  1. For Linux, use tar:

    tar -xzf sratoolkit.current-centos_linux64.tar.gz
  2. For Mac OS X, double-click on the .tar.gz file and the Archive Utility will unpack it. Alternatively, command-line tar will also work (see Linux example, above).
  3. For Windows, either use an archiving and compression utility (e.g., Winzip, 7-Zip, etc.), or simply double-click on the .zip file and drag the 'sratoolkit...' folder to the preferred install location.

注解压后:

需要进入  bin路径下

Note: For most users, the Toolkit functions (fastq-dump, sam-dump, etc.) will not be located in their PATH environmental variable. This may require providing directory information about the location of the Toolkit. See the below examples for how 'fastq-dump' would be called in different circumstances:

  • ~/[user_name]/sra-toolkit/fastq-dump

    YES: The Toolkit "bin" directory has been placed in the user-specified directory "sra-toolkit"

  • ./fastq-dump

    YES: The Toolkit components are the in the current working directory

  • fastq-dump

    NO: If the toolkit location is not specified in your $PATH variable, then the OS cannot locate the fastq-dump program, even if it is in the current directory. NOTE: Windows users should be able to enter only "fastq-dump.exe" if you have navigated to the Toolkit "bin" directory.

Testing the Toolkit configuration

The Toolkit comes with a default configuration that will work for most users. You may elect to perform the following tests to confirm that your configuration is working correctly. The default location for the "download repository" is:

  • Linux: /home/[user_name]/ncbi/public
  • Mac OS X: /Users/[user_name]/ncbi/public
  • Windows: C:\Users\[user_name]\ncbi\public

Note that if the tests fail, or if you wish to specify the download location for files sourced from NCBI, you should configure your Toolkit installation. During normal operation, the Toolkit may be required to download the following types of data to the default location:

  • Reference sequences: Small (most less than 70 MB) sequences used to decompress aligned SRA data.
  • SRA data files: If data are downloaded "on-demand" using the toolkit, then partial and whole SRA datasets (most are several Gb in size) can be located here. Note: Manually downloaded SRA data obtained using a web browser, wget, ascp, or FTP may be stored anywhere in the local file system.

For the test, we are using an arbitrary dataset, SRR390728 (RNA-Seq (polyA+) analysis of DLBCL cell line HS0798), from the National Cancer Institute’s Cancer Genome Characterization Initiative (CGCI) Project. It is a reasonably small SRA dataset that contains aligned (reference-compressed) data, allowing us to test multiple aspects of the toolkit simultaneously.

  1. Open a terminal or command prompt and "cd" into the directory containing the toolkit executables (e.g., [download_location]/sratoolkit[version]/bin/).

    • Linux and OS X users should execute the following command:

      ./fastq-dump -X 5 -Z SRR390728
    • Windows users should execute the following command:
      fastq-dump.exe -X 5 -Z SRR390728
  2. If successful, the test should connect to NCBI, download a small amount of data from SRR390728 and the reference sequence needed to extract the data, and stream the first 5 spots of the file ("-X 5" option) to the screen ("-Z" option).
  3. If the configuration is not valid, an error like the following will likely be displayed:
    fastq-dump.2.x err: item not found while constructing within virtual database module - the path 'SRR390728' cannot be opened as database or table"
  4. If you receive an error like the one above, please configure the toolkit (described in the next section). If you have already configured the toolkit but are still unable to complete the test successfully, please email sra-tools@ncbi.nlm.nih.gov with a full description of steps taken and error messages received.

SRA数据转成fastq的更多相关文章

  1. NCBI SRA数据预处理

    SRA数据的的处理流程大概如下 一.SRA数据下载. NCBI 上存储的数据现在大都存储为SRA格式. 下载以后就是以SRA为后缀名. 这里可以通过三种方式下载SRA格式的数据. 1.通过http方式 ...

  2. 用R包来下载sra数据

    1)介绍 我们用SRAdb library来对SRA数据进行处理. SRAdb 可以更方便更快的接入  metadata associated with submission, 包括study, sa ...

  3. xml格式的数据转化成数组

    将得到的xml格式的数据转化成数组 <?php //构造xml $url = "http://api.map.baidu.com/telematics/v3/weather?locat ...

  4. 将数据转化成字符串时:用字符串的链接 还是 StringBuilder

    /* 目的:将数据转化成字符串时:用字符串的链接 还是 StringBuilder呢? */ public class Test{ public static void main(String[] a ...

  5. jQuery操作列表数据转成Json再输出为html dom树

    jQuery 把列表数据转成Json再输出为如下 dom树 <div id="menu" class="lv1"> <ul class=&qu ...

  6. SpringMVC中出现" 400 Bad Request "错误(用@ResponseBody处理ajax传过来的json数据转成bean)的解决方法

    最近angularjs post到后台 400一头雾水 没有任何错误. 最后发现好文,感谢作者 SpringMVC中出现" 400 Bad Request "错误(用@Respon ...

  7. Oracle一列的多行数据拼成一行显示字符

    Oracle一列的多行数据拼成一行显示字符   oracle 提供了两个函数WMSYS.WM_CONCAT 和 ListAgg函数.    www.2cto.com   先介绍:WMSYS.WM_CO ...

  8. 使用Notepad++将多行数据合并成一行

    1.按Ctrl+F,弹出“替换”的窗口: 2.选择“替换”菜单: 3.“查找目标”内容输入为:\r\n: 4.“替换为”内容为空: 5.“查找模式”选择为正则表达式: 6.设置好之后,点击“全部替换” ...

  9. 使用gfortran将数据写成Grads格式的代码示例

    使用gfortran将数据写成Grads格式的代码示例: !-----'Fortran4Grads.f90' program Fortran4Grads implicit none integer,p ...

随机推荐

  1. BZOJ - 1036 树的统计Count (树链剖分+线段树)

    题目链接 #include<bits/stdc++.h> using namespace std; typedef long long ll; ,inf=0x3f3f3f3f; ],mx[ ...

  2. 畅通工程再续 (kruskal算法的延续)

    个人心得:这题其实跟上一题没什么区别,自己想办法把坐标啥的都给转换为对应的图形模样就好了 相信大家都听说一个“百岛湖”的地方吧,百岛湖的居民生活在不同的小岛中,当他们想去其他的小岛时都要通过划小船来实 ...

  3. ecshop彻底去版权把信息修改成自己的全教程

    前台部分: 一.去掉头部title部分的ECSHOP演示站-Powered by ecshop 1.问题:“ECSHOP演示站”方法:在后台商店设置 – 商店标题修改2.问题:“ Powered by ...

  4. 使用swing构建一个界面(包含flow ,Border,Grid,card ,scroll布局)

    package UI; import java.awt.BorderLayout;import java.awt.CardLayout;import java.awt.Cursor;import ja ...

  5. [转]Cache-Control max-age=0

    Cache-Control max-age=0   Cache-Control  no-cache — 强制每次请求直接发送给源服务器,而不经过本地缓存版本的校验.这对于需要确认认证应用很有用(可以和 ...

  6. 媒体查询ipad,pc端

    媒体查询 /* 判断ipad */ @media only screen and (min-device-width : 768px) and (max-device-width : 1024px){ ...

  7. linux find -regex 使用正则表达式

    find之强大毋庸置疑,此处只是带领大家一窥find门径,更详细的说明见man  find和 info find.整篇文章循序渐进,从最常用的文件名测试项开始步步深入,到第六节基本讲完find处理文件 ...

  8. Azure Managed Disk

    Azure的磁盘存储是保存在存储账户中的Page Blob.由于Azure Storage Account的各种限制,在设计VM的磁盘存储时,要符合Azure磁盘存储账户的最佳实践,请参考:http: ...

  9. Swing编程练习。可能这篇会有错误哦

    总结:21岁的思思是华为的初级女java工程师,我等女流怎么办呢? Swing.图形用户界面的编程,panel起了很大作用 package com.da; import java.awt.Color; ...

  10. DBUtils使用BeanListHandler及BeanHandler时返回null

    一.使用Bean相关方法时返回null 问题描述: 使用DBUtils查询数据,如果使用ArrayListHandler等都能够返回正确值,但使用BeanListHandler 和 BeanHandl ...