SPAdes¶

SPAdes is a genome assembly toolkit that provides several assembly pipelines.

SPAdes is available as a module on Apocrita.

Usage¶

To run the default installed version of SPAdes, simply load the spades module:

$ module load spades
$ spades.py --help

SPAdes genome assembler <VERSION>

Usage: spades.py [options] -o <output_dir>

Basic options:
  -o <output_dir>             directory to store all the resulting files (required)
  --isolate                   this flag is highly recommended for high-coverage isolate and multi-cell data
  --sc                        this flag is required for MDA (single-cell) data
  --meta                      this flag is required for metagenomic data
  --bio                       this flag is required for biosyntheticSPAdes mode
  --corona                    this flag is required for coronaSPAdes mode
  --rna                       this flag is required for RNA-Seq data
  --plasmid                   runs plasmidSPAdes pipeline for plasmid detection
  --metaviral                 runs metaviralSPAdes pipeline for virus detection
  --metaplasmid               runs metaplasmidSPAdes pipeline for plasmid detection in metagenomic datasets (equivalent for --meta --plasmid)
  --rnaviral                  this flag enables virus assembly module from RNA-Seq data
  --iontorrent                this flag is required for IonTorrent data
  --test                      runs SPAdes on toy dataset
  -h, --help                  prints this usage message
  -v, --version               prints version

Advanced options:
  --dataset       <filename>  file with dataset description in YAML format
  -t/--threads    <int>       number of threads [default: 16]
  -m/--memory     <int>       RAM limit for SPAdes in Gb (terminates if exceeded) [default: 250]
  --tmp-dir       <dirname>   directory for temporary files [default: <output_dir>/tmp]
  -k              <int,int..> comma-separated list of k-mer sizes (must be odd and less than 128) [default: 'auto']
  --cov-cutoff <float>        coverage cutoff value (a positive float number, or 'auto', or 'off') [default: 'off']
  --phred-offset <33 or 64>   PHRED quality offset in the input reads (33 or 64) [default: auto-detect]
  --custom-hmms <dirname>     directory with custom hmms that replace default ones [default: None]

For full usage documentation, run spades.py --help.

Example job¶

Selecting the number of threads and memory

By default, SPAdes runs multi-threaded on 16 cores and uses up to 250GB of memory (or all available memory on nodes with less than 250GB). To avoid overloading a compute node, override both defaults: pass the value of ${SLURM_NTASKS} to the --threads parameter, and pass the total memory requested for the job, in GB, to the --memory parameter.
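
Rather than hard-coding the memory value, it can be derived from the job's own Slurm environment. The following is a minimal sketch, not a site-supported tool: the function name spades_mem_gb is hypothetical, and it assumes one core per task (as in the example on this page) and that Slurm has set SLURM_MEM_PER_CPU or SLURM_MEM_PER_NODE in MB. It falls back to 1GB if neither variable is present.

```shell
#!/bin/bash
# Hypothetical helper: print the job's total memory in whole GB,
# suitable for passing to spades.py --memory.
spades_mem_gb() {
    local ntasks="${SLURM_NTASKS:-1}"
    local mem_mb

    if [ -n "${SLURM_MEM_PER_CPU:-}" ]; then
        # Job used --mem-per-cpu: multiply the per-core MB by the core count
        # (assumes one core per task)
        mem_mb=$(( SLURM_MEM_PER_CPU * ntasks ))
    elif [ -n "${SLURM_MEM_PER_NODE:-}" ]; then
        # Job used --mem: the value is already the total in MB
        mem_mb="${SLURM_MEM_PER_NODE}"
    else
        # Assumed fallback when run outside a Slurm job
        mem_mb=1024
    fi

    # Round down to whole GB, but never report less than 1
    local mem_gb=$(( mem_mb / 1024 ))
    if [ "${mem_gb}" -lt 1 ]; then
        mem_gb=1
    fi
    echo "${mem_gb}"
}

# Example use inside a job script (not executed here):
#   spades.py ... --threads "${SLURM_NTASKS}" --memory "$(spades_mem_gb)"
```

Rounding down keeps the SPAdes limit at or below the job's allocation, so SPAdes stops itself before the scheduler terminates the job for exceeding its memory request.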

Serial job¶

Here is an example job running on 1 core with 1GB of memory (replace <output_dir> with the directory in which to store your results):

#!/bin/bash
#SBATCH --ntasks=1        # (or -n 1) Request 1 core
#SBATCH --mem-per-cpu=1G  # Request 1GB RAM per core (1GB total)
#SBATCH --time=1:0:0      # (or -t 1:0:0) Request 1 hour runtime

module load spades

spades.py -o <output_dir> \
          -1 example1.fastq \
          -2 example2.fastq \
          --threads ${SLURM_NTASKS} \
          --memory 1
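
Once the job has finished, a quick check can confirm that an assembly was produced. This is a minimal sketch under stated assumptions: the function name count_contigs is hypothetical, and it assumes a genome assembly mode that writes contigs.fasta into the output directory (other modes, such as --rna, name their output files differently).

```shell
#!/bin/bash
# Hypothetical helper: print the number of contigs in a finished SPAdes run.
# Assumes the run wrote contigs.fasta into the given output directory.
count_contigs() {
    local outdir="$1"
    local contigs="${outdir}/contigs.fasta"

    # Fail if the file is missing or empty
    if [ ! -s "${contigs}" ]; then
        echo "No contigs found in ${outdir}" >&2
        return 1
    fi

    # Each FASTA record starts with a '>' header line
    grep -c '^>' "${contigs}"
}

# Example use (not executed here):
#   count_contigs <output_dir>
```

The function returns a non-zero exit status when no contigs file exists, so it can also be used at the end of a job script to flag a failed assembly.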

Reference¶

SPAdes website