
Wtdbg2

Wtdbg2 is a de novo sequence assembler for long, noisy reads. It assembles raw reads without error correction and then builds the consensus from the intermediate assembly output.

Conda installation

Wtdbg2 can be installed from the Bioconda channel. Load the Miniforge module, create a Conda environment, and install Wtdbg2 into it (output below truncated):

$ module load miniforge
$ mamba create --quiet --yes --name wtdbg2_env
$ mamba activate wtdbg2_env
(wtdbg2_env) $ mamba install bioconda::wtdbg

Looking for: ['bioconda::wtdbg']
...
  Updating specs:

   - bioconda::wtdbg
...
Confirm changes: [Y/n] Y
...
Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Usage

To run Wtdbg2, load the Miniforge module and activate the Conda environment you installed it into:

$ module load miniforge
$ mamba activate wtdbg2_env
(wtdbg2_env) $ wtdbg2 --help
Usage: wtdbg2 [options] -i <reads.fa> -o <prefix> [reads.fa ...]

For usage documentation, run wtdbg2 --help.

Core Usage

To ensure that Wtdbg2 uses the correct number of cores, the -t ${NSLOTS} option must be used. This sets the number of threads to the number of cores allocated to the job.
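As a minimal sketch of the point above, the thread count can be read from NSLOTS with a fallback for sessions where the scheduler has not set it. The fallback value of 1, the THREADS variable, and the reads.fa.gz and asm names are assumptions made for illustration only.

```shell
#!/bin/bash
# NSLOTS is set by the scheduler inside a job; fall back to 1 thread
# elsewhere (the fallback value is an assumption for this sketch).
THREADS="${NSLOTS:-1}"

# reads.fa.gz and the "asm" prefix are hypothetical names.
if command -v wtdbg2 >/dev/null 2>&1 && [ -f reads.fa.gz ]; then
    wtdbg2 -i reads.fa.gz -g 300M -o asm -t "${THREADS}"
else
    # Tool or input not available: only show the command that would be run.
    echo "Would run: wtdbg2 -i reads.fa.gz -g 300M -o asm -t ${THREADS}"
fi
```

Quoting "${THREADS}" keeps the option well formed even if the variable were ever empty, and the guard lets the snippet be tried outside a job without starting an assembly.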

Example job

Serial job

Here is an example job running on 4 cores with 16GB of memory in total (4GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=4G

module load miniforge
mamba activate wtdbg2_env

# Overlap and layout the reads
wtdbg2 -i assembly.fa.gz \
       -g 300M \
       -o outdir \
       -t ${NSLOTS}

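# Derive consensus sequence for the contigs
# wtdbg2 treats -o as a file prefix, so the layout is outdir.ctg.lay.gz;
# wtpoa-cns takes -o as the output FASTA file name
wtpoa-cns -i outdir.ctg.lay.gz \
          -o outdir.ctg.fa \
          -t ${NSLOTS}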
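Because wtdbg2 treats the -o value as a file name prefix, the input for the consensus step can be derived from that prefix. The following is a hedged sketch rather than a required pattern: the PREFIX, LAYOUT and CONSENSUS variables and the outdir.ctg.fa output name are illustrative assumptions.

```shell
#!/bin/bash
PREFIX="outdir"                    # same value as passed to wtdbg2 -o
LAYOUT="${PREFIX}.ctg.lay.gz"      # contig layout written by wtdbg2
CONSENSUS="${PREFIX}.ctg.fa"       # hypothetical name for the consensus FASTA

# Only start the consensus step if the layout file exists and is not empty.
if [ -s "${LAYOUT}" ]; then
    wtpoa-cns -i "${LAYOUT}" -o "${CONSENSUS}" -t "${NSLOTS:-1}"
else
    echo "Layout file ${LAYOUT} not found; check the wtdbg2 step" >&2
fi
```

Checking for the layout file first avoids spending job time on a consensus step that cannot succeed when the assembly step failed or wrote its output under a different prefix.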

Reference