Fasterq-dump and Snakemake

A note on how to automate fetching public datasets from NCBI using the SRA Toolkit and Snakemake. Here we use a config file, a rule file and a conda environment file. First, the Snakefile:

    configfile: "include/rules/config.yaml"
    include: "include/rules/fasterqdump.rule"

    rule all:
        input:
            expand("01_raw/done__{srr}_dump", srr=config['srr'])

the config file (include/rules/config.yaml):

    srr:
      - SRR12345678

and the rule file (include/rules/fasterqdump.rule):

    rule prefetch:
        output:
            "01_raw/.prefetch/sra/{srr}.sra"
        …
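
To make the target list in "rule all" concrete, here is a minimal plain-Python sketch of what the expand() call evaluates to for the config shown above. The helper expand_targets is hypothetical: it only mimics the single-wildcard case and is not Snakemake's own implementation.

```python
# Hypothetical stand-in for the single-wildcard use of Snakemake's expand().
# It does not import Snakemake; it only illustrates the resulting file names.
config = {"srr": ["SRR12345678"]}  # mirrors include/rules/config.yaml


def expand_targets(pattern, **wildcards):
    """Fill one named wildcard with every value supplied for it."""
    (name, values), = wildcards.items()  # exactly one wildcard is expected
    return [pattern.format(**{name: value}) for value in values]


targets = expand_targets("01_raw/done__{srr}_dump", srr=config["srr"])
print(targets)
```

Adding more accessions under srr in the config would simply add one "done" target per accession, which is what lets a single "rule all" drive every download.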

InterProScan and Snakemake

Following up on a previous post, "InterProScan and Docker", here is a quick note on running InterProScan with Snakemake. The commented Snakefile:

    # Input FASTA files with protein sequences should be at lib/foobar.pep
    PEPS, = glob_wildcards("lib/{pep}.pep")

    # configfile path
    configfile: "include/rules/config.yaml"

    rule all:
        input:
            expand("02_interproscan/{pep}.tsv", pep=PEPS)

    # get/install interproscan
    rule install_interproscan:
        # https://interproscan-docs.readthedocs.io/en/latest/HowToDownload.html
        input:
        output:
            touch("02_interproscan/done__install_interproscan")
        params:
            "temp/"
        …
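
The way the Snakefile turns files found in lib/ into output targets can be sketched in plain Python. This is only an illustration under stated assumptions: collect_wildcard is a hypothetical helper that handles a flat directory, whereas Snakemake's glob_wildcards is more general, and the file names are made up for the demo.

```python
import tempfile
from pathlib import Path


def collect_wildcard(directory, suffix):
    """Return the sorted name stems of files in `directory` ending in `suffix`."""
    return sorted(p.name[: -len(suffix)] for p in Path(directory).glob(f"*{suffix}"))


with tempfile.TemporaryDirectory() as tmp:
    lib = Path(tmp) / "lib"
    lib.mkdir()
    # Made-up inputs: two protein FASTA files and one unrelated file.
    for name in ("foobar.pep", "other.pep", "notes.txt"):
        (lib / name).touch()

    peps = collect_wildcard(lib, ".pep")
    # One InterProScan result table is requested per input file.
    targets = [f"02_interproscan/{pep}.tsv" for pep in peps]

print(peps)
print(targets)
```

Files that do not match the pattern (notes.txt here) are ignored, so dropping a new .pep file into lib/ is enough to add a new target.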

Download Project from Basespace

Quick note on how to download data from BaseSpace. On Linux, you cannot bulk download files through the web interface. The alternative is to use the BaseSpace Sequence Hub CLI. First, fetch the "bs" application:

    wget "https://api.bintray.com/content/basespace/BaseSpaceCLI-EarlyAccess-BIN/latest/\$latest/amd64-linux/bs?bt_package=latest" -O bs

Fix permissions:

    chmod 755 ./bs

Then, authenticate at BaseSpace:

    ./bs auth

This command …
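
The permission step can be checked with a small Python sketch, assuming a POSIX system. It uses an empty placeholder file named bs in a temporary directory, not the real binary, and assumes the download arrives with a typical non-executable mode (0644).

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bs")   # placeholder file, not the real CLI binary
    open(path, "w").close()

    os.chmod(path, 0o644)            # assumed mode of a freshly downloaded file
    before = stat.filemode(os.stat(path).st_mode)

    os.chmod(path, 0o755)            # same effect as `chmod 755 ./bs`
    after = stat.filemode(os.stat(path).st_mode)
    runnable = os.access(path, os.X_OK)

print(before, after, runnable)
```

Mode 755 grants read, write and execute to the owner and read and execute to everyone else, which is what allows ./bs to be run directly.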