Stacks

Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography.

Conda installation

Stacks can be installed from the Bioconda channel by loading the Miniforge module, creating a Conda environment, and installing Stacks into it (output below truncated):

$ module load miniforge
$ mamba create --quiet --yes --name stacks_env
$ mamba activate stacks_env
(stacks_env) $ mamba install bioconda::stacks

Looking for: ['bioconda::stacks']
...
  Updating specs:

   - bioconda::stacks
...
Confirm changes: [Y/n] Y
...
Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Usage

To run Stacks, load the Miniforge module and activate the Conda environment you installed it into:

module load miniforge
mamba activate stacks_env

After activating the Conda environment, the following commands are available:

process_radtags      Examines raw reads from an Illumina sequencing run, checks
                     that the barcode and the RAD cutsite are intact, and
                     demultiplexes the data.

process_shortreads   Performs the same task as process_radtags, but for fast
                     cleaning of randomly sheared genomic or transcriptomic
                     data rather than RAD data.

clone_filter         Designed to identify PCR clones.

kmer_filter          Allows paired or single-end reads to be filtered according
                     to the number of rare or abundant kmers they contain.

ustacks              Takes as input a set of short-read sequences and aligns
                     them into exactly-matching stacks (or putative alleles).

cstacks              Builds a catalog from any set of samples processed by the
                     ustacks or pstacks programs.

sstacks              Searches sets of stacks (i.e. putative loci) constructed
                     by the ustacks program against a catalog produced by
                     cstacks.

tsv2bam              Transposes data so that it is oriented by locus, instead
                     of by sample.

gstacks              Examines a RAD data set one locus at a time, looking at
                     all individuals in the metapopulation for that locus.

populations          Analyzes a population of individual samples, computing a
                     number of population genetics statistics and exporting
                     a variety of standard output formats.

The following scripts are included in the Stacks package and allow preset pipelines to be run:

denovo_map.pl
ref_map.pl

For usage documentation, run <command> -h, and see the extensive online documentation.
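
To confirm that the environment is active and the programs are on your PATH, a quick check along the following lines may help. This is a minimal sketch: the function name check_stacks_commands is hypothetical, and the list of names is taken from this page rather than from the installed package, so a given Stacks version may not ship every one of them.

```shell
#!/bin/bash
# Report whether each Stacks program named on this page is on the PATH.
# check_stacks_commands is a hypothetical helper, not part of Stacks.
check_stacks_commands() {
    local cmd
    for cmd in process_radtags process_shortreads clone_filter kmer_filter \
               ustacks cstacks sstacks tsv2bam gstacks populations \
               denovo_map.pl ref_map.pl; do
        # command -v succeeds only if the shell can resolve the name.
        if command -v "$cmd" > /dev/null 2>&1; then
            echo "found:   $cmd"
        else
            echo "missing: $cmd"
        fi
    done
}

check_stacks_commands
```

If every line reports "missing", the most likely cause is that the Conda environment has not been activated in the current shell.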

Example job

Serial job

The simplest way to execute the entire Stacks pipeline is to run it via the denovo_map.pl program.

Here is an example job running on 2 cores with 8GB of total memory (4GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=4G

module load miniforge
mamba activate stacks_env

denovo_map.pl -T ${NSLOTS} -M 4 -n 4 -o ./stacks/ \
              --samples ./samples --popmap ./popmaps/popmap
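
The job above expects a population map at ./popmaps/popmap. A Stacks population map is a tab-separated text file with the sample name in the first column and the population in the second, and the sample names need to match the file name prefixes in the samples directory. The sketch below shows one way such a file could be generated. It rests on assumptions that may not hold for your data: the helper name make_popmap is hypothetical, reads are assumed to be single-end files named <sample>.fq.gz, and every sample is placed in a single population.

```shell
#!/bin/bash
# Sketch: write a Stacks population map (tab-separated: sample, population).
# Assumptions (hypothetical): single-end reads named <sample>.fq.gz, and all
# samples belong to the same population.
make_popmap() {
    local samples_dir=$1 popmap=$2 population=${3:-pop1}
    local f count=0

    mkdir -p "$(dirname "$popmap")"
    : > "$popmap"                      # start from an empty file

    for f in "$samples_dir"/*.fq.gz; do
        [ -e "$f" ] || continue        # the glob matched nothing
        # Strip the directory and the .fq.gz suffix to get the sample name.
        printf '%s\t%s\n' "$(basename "$f" .fq.gz)" "$population" >> "$popmap"
        count=$((count + 1))
    done

    if [ "$count" -eq 0 ]; then
        echo "No .fq.gz files found in $samples_dir" >&2
        return 1
    fi
}
```

For the directory layout used in the example job, this could be called as make_popmap ./samples ./popmaps/popmap pop1 before submitting. Samples from several populations would need the second column edited by hand or assigned from a separate lookup.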

References