SPAdes

SPAdes is an assembly toolkit containing various assembly pipelines.

SPAdes is available as a module on Apocrita.

Usage

To run the default installed version of SPAdes, simply load the spades module:

$ module load spades
$ spades.py --help

SPAdes genome assembler <VERSION>

Usage: spades.py [options] -o <output_dir>

Basic options:
  -o <output_dir>             directory to store all the resulting files (required)
  --isolate                   this flag is highly recommended for high-coverage isolate and multi-cell data
  --sc                        this flag is required for MDA (single-cell) data
  --meta                      this flag is required for metagenomic data
  --bio                       this flag is required for biosyntheticSPAdes mode
  --corona                    this flag is required for coronaSPAdes mode
  --rna                       this flag is required for RNA-Seq data
  --plasmid                   runs plasmidSPAdes pipeline for plasmid detection
  --metaviral                 runs metaviralSPAdes pipeline for virus detection
  --metaplasmid               runs metaplasmidSPAdes pipeline for plasmid detection in metagenomic datasets (equivalent for --meta --plasmid)
  --rnaviral                  this flag enables virus assembly module from RNA-Seq data
  --iontorrent                this flag is required for IonTorrent data
  --test                      runs SPAdes on toy dataset
  -h, --help                  prints this usage message
  -v, --version               prints version

Advanced options:
  --dataset       <filename>  file with dataset description in YAML format
  -t/--threads    <int>       number of threads [default: 16]
  -m/--memory     <int>       RAM limit for SPAdes in Gb (terminates if exceeded) [default: 250]
  --tmp-dir       <dirname>   directory for temporary files [default: <output_dir>/tmp]
  -k              <int,int..> comma-separated list of k-mer sizes (must be odd and less than 128) [default: 'auto']
  --cov-cutoff <float>        coverage cutoff value (a positive float number, or 'auto', or 'off') [default: 'off']
  --phred-offset <33 or 64>   PHRED quality offset in the input reads (33 or 64) [default: auto-detect]
  --custom-hmms <dirname>     directory with custom hmms that replace default ones [default: None]

For full usage documentation, run spades.py --help.

Example job

Selecting the number of threads and memory

By default, SPAdes runs multi-threaded on 16 cores with a memory limit of 250GB (or all available memory on nodes with less than 250GB). These defaults ignore the resources requested for your job, so to avoid overloading a compute node, override them by passing ${NSLOTS} to the --threads parameter and ${SGE_HGR_m_mem_free%.*} to the --memory parameter.
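The expansion ${SGE_HGR_m_mem_free%.*} is there because --memory expects a whole number. The short sketch below shows what it does to a sample value. The value "1.000G" is an assumption for illustration only; inside a real job the scheduler sets the variable, and its exact format may differ.

```shell
#!/bin/bash
# Hypothetical value for illustration; inside a job the scheduler sets this.
SGE_HGR_m_mem_free="1.000G"

# ${var%.*} removes the shortest suffix matching ".*", that is, the last dot
# and everything after it, leaving only the integer part.
mem_gb="${SGE_HGR_m_mem_free%.*}"

echo "--memory ${mem_gb}"
```

With the assumed value, this prints --memory 1, matching the integer form given as <int> for -m/--memory in the help output.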

Serial job

Here is an example job running on 1 core and 1GB of memory:

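#!/bin/bash
# Run from the current working directory and merge stderr into stdout
#$ -cwd
#$ -j y
# Request 1 core, 1 hour of runtime and 1GB of memory
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load spades

# Replace <output_dir> with the path to your output directory
spades.py -o <output_dir> \
          -1 example1.fastq \
          -2 example2.fastq \
          --threads ${NSLOTS} \
          --memory ${SGE_HGR_m_mem_free%.*}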
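If the same commands are tried outside a scheduler job, NSLOTS and SGE_HGR_m_mem_free are not set and the two options would be passed empty values. One possible guard, sketched below, is to fall back to small defaults. The variable names threads, mem_free and memory, and the fallback values of 1 thread and 1G, are assumptions for illustration and not part of the site configuration.

```shell
#!/bin/bash
# Simulate a shell outside the scheduler (remove this line in a real script).
unset NSLOTS SGE_HGR_m_mem_free

# Use the scheduler's values when present, otherwise assumed defaults.
threads="${NSLOTS:-1}"
mem_free="${SGE_HGR_m_mem_free:-1G}"

# Reduce the memory value to a whole number.
memory="${mem_free%.*}"   # drop a fractional part such as ".000G", if present
memory="${memory%G}"      # drop a trailing "G" when there was no fractional part

echo "spades.py --threads ${threads} --memory ${memory}"
```

The echo only prints the options that would be used; in a job script the same two variables could be passed to spades.py in place of the raw scheduler variables.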

References