
Nov 9, 2015

How to Download All SRA Samples at Once

Using SQLite3
https://edwards.sdsu.edu/research/getting-data-from-the-sra/



Using Linux wget
details in http://seqanswers.com/forums/archive/index.php/t-30625.html

wget -r -nd -nH ftp://file_address

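For example (-r downloads recursively, -nd saves every file into the current directory instead of recreating the remote tree, -nH leaves out the host-name directory):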
wget -r -nd -nH ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByStudy/sra/SRP/SRP103/SRP103124

Edit 2021-07-20: the address has changed. Run files are now served from URLs of this form:
https://sra-pub-run-odp.s3.amazonaws.com/sra/SRR11780909/SRR11780909
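
The new address appears to follow a fixed pattern in which the run accession occurs twice. As a minimal sketch, a small shell function can build the URL for any run, assuming the pattern above holds for that run. The name sra_url is hypothetical and not part of any toolkit.

```shell
# Hypothetical helper: build the download URL for one run accession,
# assuming the URL pattern shown above applies to that run.
sra_url() {
    printf 'https://sra-pub-run-odp.s3.amazonaws.com/sra/%s/%s\n' "$1" "$1"
}

# Print the URL for the example run.
sra_url SRR11780909
```

The result can be handed straight to wget, for example wget "$(sra_url SRR11780909)". wget saves the file under the last path component, which here is the bare accession name.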


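To download specific SRA files from a list

Store the run accessions that you want to download (here, runs from the range SRR096001-SRR096999) in a file, one per line.
For example (type the accessions, then press Ctrl-D to finish):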
cat > SRR_2_download
SRR096023
SRR096072
SRR096074
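
When the wanted runs are consecutive, the list does not have to be typed by hand. A minimal sketch, assuming six-digit SRR numbers as in the example above; make_srr_range is a hypothetical helper name.

```shell
# Hypothetical helper: print SRR accessions for an inclusive numeric range,
# zero-padded to six digits (SRR096023 style).
# Pass plain decimal numbers without leading zeros (96023, not 096023),
# because a leading zero would be read as octal.
make_srr_range() {
    n=$1
    while [ "$n" -le "$2" ]; do
        printf 'SRR%06d\n' "$n"
        n=$((n + 1))
    done
}

# Example: three consecutive runs.
make_srr_range 96023 96025
```

Redirecting the output, as in make_srr_range 96001 96999 > SRR_2_download, writes the whole range to the file. Some numbers in a range may not exist as runs, so a few failed downloads would not be surprising.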

Old FTP address:
for i in $(cat SRR_2_download);do wget -r -nd -nH "ftp://ftp-trace.ncbi.nlm.nih.gov/sra/sra-instant/reads/ByRun/litesra/SRR/SRR096/$i/*"; done

New address:
for i in $(cat SRR_2_download);do wget "https://sra-pub-run-odp.s3.amazonaws.com/sra/$i/$i"; done

 
This downloads the SRA files that are listed in the file SRR_2_download.
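
A slightly more defensive variant of the loop above, as a sketch only. It reads the list line by line, skips blank lines and lines starting with #, ignores entries that do not look like run accessions, and resumes partial downloads with wget -c. The function name download_sra_list and the DRY_RUN switch are hypothetical, and the URL pattern is assumed to be the new address given earlier.

```shell
# Hypothetical helper: download every run accession listed in a file.
# Usage: download_sra_list LIST_FILE
# Set DRY_RUN=1 to print the wget commands instead of running them.
download_sra_list() {
    list_file=$1
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Remove stray whitespace, including carriage returns from Windows files.
        acc=$(printf '%s' "$acc" | tr -d '[:space:]')
        case $acc in
            ''|'#'*) continue ;;          # blank line or comment
            [SED]RR[0-9]*) ;;             # SRR/ERR/DRR followed by a digit
            *) echo "skipping unrecognized accession: $acc" >&2; continue ;;
        esac
        url="https://sra-pub-run-odp.s3.amazonaws.com/sra/$acc/$acc"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "wget -c $url"
        else
            # -c resumes a partially downloaded file instead of starting over.
            wget -c "$url" || echo "download failed: $acc" >&2
        fi
    done < "$list_file"
}

# Example (prints the commands only):
#   DRY_RUN=1 download_sra_list SRR_2_download
```

Reporting a failed download and moving on, rather than stopping, means one missing run does not block the rest of the list.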

Using R
details in www.biostars.org/p/93494/

Run R
# biocLite() has been retired; current Bioconductor releases install packages through BiocManager
install.packages('BiocManager')
BiocManager::install(c('SRAdb', 'DBI'))
library(SRAdb)
library(DBI)

srafile = getSRAdbFile() # download and unzip the latest SRAmetadb.sqlite.gz from the server into the working directory

# Once SRAmetadb.sqlite.gz has been downloaded and unzipped, set srafile to the existing SRAmetadb.sqlite file instead of calling getSRAdbFile() again.
# SRAmetadb.sqlite.gz is big. Re-download it only when you need an updated version.

srafile <- 'SRAmetadb.sqlite' 

con = dbConnect(RSQLite::SQLite(), srafile)
listSRAfile('SRP026197',con)
getSRAfile('SRP026197',con,fileType='sra')
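
The same metadata file can also be queried without R, in the spirit of the SQLite3 link at the top of this page. A sketch, assuming the sqlite3 command-line client is installed and that SRAmetadb.sqlite contains a table named sra with the columns run_accession and study_accession. That schema is an assumption, so check it with sqlite3 SRAmetadb.sqlite .schema before relying on it. The name runs_for_study_sql is hypothetical.

```shell
# Hypothetical helper: build the SQL that lists the run accessions of one study.
# Assumes a table "sra" with columns run_accession and study_accession.
# The accession is pasted into the query as-is, so pass only plain accessions.
runs_for_study_sql() {
    printf "SELECT run_accession FROM sra WHERE study_accession = '%s';\n" "$1"
}

# Print the query for the study used in the R example.
runs_for_study_sql SRP026197

# To run it against the database and save the run list:
#   sqlite3 SRAmetadb.sqlite "$(runs_for_study_sql SRP026197)" > SRR_2_download
```

The resulting file has one run accession per line, which is the format the wget loop for SRR_2_download expects.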




## Dump an SRA file to fastq.gz (requires the SRA Toolkit)
Run in Linux:
fastq-dump -O /output/dir --gzip ./SRR2047462.sra # dumps the SRA file to /output/dir/SRR2047462.fastq.gz
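
To convert several downloaded files in one go, the command above can be wrapped in a loop. A minimal sketch using only the -O and --gzip options already shown; the function name dump_to_fastq and the DRY_RUN switch are hypothetical.

```shell
# Hypothetical helper: convert several downloaded SRA files to fastq.gz.
# Usage: dump_to_fastq OUTPUT_DIR FILE...
# Set DRY_RUN=1 to print the fastq-dump commands instead of running them.
dump_to_fastq() {
    out_dir=$1
    shift
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "skipping missing file: $f" >&2
            continue
        fi
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "fastq-dump -O $out_dir --gzip $f"
        else
            fastq-dump -O "$out_dir" --gzip "$f" || echo "conversion failed: $f" >&2
        fi
    done
}

# Example:
#   dump_to_fastq /output/dir ./*.sra
```

For paired-end runs, the --split-files option of fastq-dump writes the two mates to separate files. Whether it is needed depends on the library layout of the run.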


Using GEOquery in R
library(GEOquery)
gse <- getGEO('GSE48138') # retrieves a list of ExpressionSet objects for the GEO series ID
## see what is in there:
show(gse)
# There are 2 sets of samples for that ID
## what you want is a table with the SRR accessions to download and some sample information
## let's see what the first set contains:
df <- as.data.frame(gse[[1]])
head(df)