The Biostar Handbook Data Site

Book: http://read.biostarhandbook.com

Index: http://data.biostarhandbook.com

wonderfetch

The Entrez Direct service is sometimes unavailable. We wrote a very simple replacement script that fetches data when efetch does not work.

Install wonderfetch

mkdir -p ~/bin
curl http://data.biostarhandbook.com/scripts/wonderfetch.sh > ~/bin/wonderfetch
chmod +x ~/bin/wonderfetch
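The scripts on this page are installed into ~/bin, so that directory has to be on your PATH before the commands can be called by name. The sketch below is one way to check this for the current session. The function name ensure_bin_on_path is hypothetical and not part of the handbook scripts.

```shell
# Hypothetical helper (not part of the handbook scripts):
# prepend ~/bin to PATH for this session if it is not already listed.
ensure_bin_on_path() {
    case ":$PATH:" in
        *":$HOME/bin:"*) ;;                        # already present, nothing to do
        *) PATH="$HOME/bin:$PATH"; export PATH ;;  # add it in front
    esac
}

ensure_bin_on_path
```

To make the change permanent you would add the equivalent export line to your shell startup file.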

Using wonderfetch

wonderfetch database accession format

For example:

wonderfetch nucleotide AF086833 fasta
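Since wonderfetch is meant as a fallback, the two tools can be combined so that efetch is tried first. The following is a minimal sketch, assuming both efetch and wonderfetch are installed. The function name fetch_record is hypothetical, and treating empty output as a failure is an assumption about how an outage shows up.

```shell
# Hypothetical wrapper: try efetch first, fall back to wonderfetch.
# Arguments: $1 = database, $2 = accession, $3 = format
fetch_record() (
    # The ( ) body runs in a subshell, so "out" does not leak to the caller.
    out=$(efetch -db "$1" -id "$2" -format "$3" 2>/dev/null)
    # Assumption: a non-zero exit status or empty output means efetch failed.
    if [ $? -eq 0 ] && [ -n "$out" ]; then
        printf '%s\n' "$out"
    else
        wonderfetch "$1" "$2" "$3"
    fi
)
```

It takes the same three arguments as wonderfetch, for instance: fetch_record nucleotide AF086833 fasta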

wonderdump

As it happens, the internal implementation of fastq-dump prevents it from working on Bash for Windows. To bypass this limitation, use a helper script that we have created. We call it wonderdump.

Install wonderdump

mkdir -p ~/bin
curl http://data.biostarhandbook.com/scripts/wonderdump.sh > ~/bin/wonderdump
chmod +x ~/bin/wonderdump

Using wonderdump

You can then access SRA files. Instead of writing:

fastq-dump -X 10000 --split-files SRR1972739

you'll need to write:

wonderdump SRR1972739 -X 10000 --split-files

There is a catch, though: the SRR number must be the first parameter instead of the last.
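If you have existing commands written in the fastq-dump order, a small wrapper can move the accession to the front. This is only a sketch: the function name wonderdump_last is hypothetical, and it assumes the accession is always the final argument.

```shell
# Hypothetical wrapper: accept fastq-dump style arguments (accession last)
# and call wonderdump with the accession first.
wonderdump_last() (
    # The ( ) body runs in a subshell, so "exit" only leaves this function.
    [ "$#" -gt 0 ] || { echo "usage: wonderdump_last [options] SRR" >&2; exit 1; }
    n=$#
    i=1
    for arg do
        if [ "$i" -eq "$n" ]; then
            acc=$arg              # the last argument is the accession
        else
            set -- "$@" "$arg"    # re-append every other argument
        fi
        i=$((i + 1))
    done
    shift "$n"                    # drop the original arguments
    wonderdump "$acc" "$@"
)
```

With this in place, wonderdump_last -X 10000 --split-files SRR1972739 runs wonderdump SRR1972739 -X 10000 --split-files.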

To download a batch of SRA IDs, list them in a file, one per line, then run:

cat samples.txt | xargs -n 1 echo wonderdump | bash
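The one-liner above passes no options to wonderdump. If you want to pass options or see which accessions failed, a loop is easier to adapt. The sketch below assumes one accession per line. The function name download_batch is hypothetical, and the -X 10000 --split-files options are simply copied from the earlier example.

```shell
# Hypothetical helper: run wonderdump for every accession listed in a file.
# Argument: $1 = file with one SRR accession per line
download_batch() {
    # "|| [ -n "$srr" ]" also handles a last line without a trailing newline.
    while read -r srr || [ -n "$srr" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$srr" in ''|'#'*) continue ;; esac
        # </dev/null keeps wonderdump from consuming the rest of the list.
        wonderdump "$srr" -X 10000 --split-files </dev/null \
            || echo "failed: $srr" >&2
    done < "$1"
}
```

You would call it as: download_batch samples.txt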

geodata

A simple script to fetch information from the Gene Expression Omnibus.

Install geodata

curl http://data.biostarhandbook.com/scripts/geodata.sh > ~/bin/geodata
chmod +x ~/bin/geodata

Using geodata

geodata GSE78711