HowTo: Access SRA Data
https://github.com/ncbi/sra-tools/wiki/HowTo:-Access-SRA-Data
To download SRA data, use the tool prefetch, which is included in the SRA Toolkit.
Example of prefetch usage:
$ prefetch SRR1482462
Maximum file size download limit is 20,971,520KB
2015-02-19T13:20:06 prefetch.2.4.4: 1) Downloading 'SRR1482462'...
2015-02-19T13:20:06 prefetch.2.4.4: Downloading via fasp...
2015-02-19T13:20:32 prefetch.2.4.4: fasp download succeed
2015-02-19T13:20:32 prefetch.2.4.4: 1) 'SRR1482462' was downloaded successfully
2015-02-19T13:20:35 prefetch.2.4.4: 'SRR1482462' has 22 dependencies
2015-02-19T13:20:36 prefetch.2.4.4: 2) Downloading 'ncbi-acc:NC_000067.5?vdb-ctx=refseq'...
2015-02-19T13:20:36 prefetch.2.4.4: Downloading via fasp...
2015-02-19T13:20:41 prefetch.2.4.4: fasp download succeed
2015-02-19T13:20:41 prefetch.2.4.4: 2) 'ncbi-acc:NC_000067.5?vdb-ctx=refseq' was downloaded successfully
2015-02-19T13:20:41 prefetch.2.4.4: 3) Downloading 'ncbi-acc:NC_000068.6?vdb-ctx=refseq'...
2015-02-19T13:20:41 prefetch.2.4.4: Downloading via fasp...
2015-02-19T13:20:46 prefetch.2.4.4: fasp download succeed
2015-02-19T13:20:46 prefetch.2.4.4: 3) 'ncbi-acc:NC_000068.6?vdb-ctx=refseq' was downloaded successfully
2015-02-19T13:20:46 prefetch.2.4.4: 4) Downloading 'ncbi-acc:NC_000069.5?vdb-ctx=refseq'...
2015-02-19T13:20:46 prefetch.2.4.4: Downloading via fasp...
2015-02-19T13:20:51 prefetch.2.4.4: fasp download succeed
2015-02-19T13:20:51 prefetch.2.4.4: 4) 'ncbi-acc:NC_000069.5?vdb-ctx=refseq' was downloaded successfully
...
As can be seen from the output above, prefetch performs several steps:
- check the size of the file being downloaded
  If the file is very large, prefetch must be given a higher download limit, e.g.:
  $ prefetch --max-size 100000000 SRR1482462
- download the requested file
  The file is downloaded using Aspera if available on your system, or HTTPS otherwise.
- put the file into its proper place
  The file is downloaded into your designated cache area. This permits VDB name resolution to work as designed.
- recursively download missing external reference sequences
  Most SRA files require additional sequence files in order to reconstruct original reads. prefetch ensures that you download not only the main file but also all of its dependencies.
- access dbGaP encrypted data
  prefetch will make use of download and decryption keys that have been added to the SRA Toolkit configuration to obtain authorization for the download, in addition to performing all of the steps above. (N.B. In order to access dbGaP data, you will need to change directory or "cd" to the dbGaP project's workspace.)

prefetch will also operate on existing, previously downloaded files to recursively download any missing external reference sequences.
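
When several runs are needed, the steps above can be wrapped in a small script. The following is a minimal shell sketch, not part of the SRA Toolkit: the names PREFETCH, kb_to_gib and fetch_runs are hypothetical helpers introduced here for illustration, and the only prefetch option used is --max-size, as shown in the example above. The size value is assumed to be in the same units as in that example.

```shell
#!/bin/sh
# Illustrative sketch only: PREFETCH, kb_to_gib and fetch_runs are
# hypothetical names, not part of the SRA Toolkit.

# Command to run; defaults to the prefetch binary found on PATH.
PREFETCH="${PREFETCH:-prefetch}"

# Convert a size limit given in KB (as printed in the prefetch log)
# to whole GiB, using integer arithmetic.
kb_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Download each accession in turn with a raised size limit.
#   $1   value passed to --max-size
#   $2+  accessions to download
# Continues after a failed download; returns non-zero if any failed.
fetch_runs() {
    max_size=$1
    shift
    failed=0
    for acc in "$@"; do
        if ! "$PREFETCH" --max-size "$max_size" "$acc"; then
            echo "prefetch failed for $acc" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

For example, kb_to_gib 20971520 prints 20, which is the default limit reported in the log above expressed in GiB, and fetch_runs 100000000 SRR1482462 runs the same command as the --max-size example. Because prefetch also operates on previously downloaded files, rerunning the function on the same accessions should only fetch whatever is still missing.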