MaSuRCA

MaSuRCA is a genome assembler for both PacBio and Illumina data that combines the benefits of the de Bruijn graph and Overlap-Layout-Consensus assembly approaches.

MaSuRCA is available as a module on Apocrita.

Usage

To run the default installed version of MaSuRCA, load the masurca module. Running masurca -h then lists the available options:

$ module load masurca
$ masurca -h
Options:
 -t, --threads          ONLY to use with -i option, number of threads
 -i, --illumina         Run assembly without creating configuration file,
                        argument can be illumina_paired_end_forward_reads or
                        illumina_paired_end_forward_reads,
                        illumina_paired_end_reverse_reads.
                        Illumina read file names must be comma-separated,
                        without a space in the middle.
                        Illumina read files must be fastq, with valid quality
                        values, can be gzipped.
 -r, --reads            ONLY to use with -i option, single long reads file for
                        hybrid assembly, can be Nanopore or PacBio, fasta or
                        fastq, can be gzipped

 -v, --version          Report version
 -o, --output           Assembly script (assemble.sh)
 -g, --generate         Generate example configuration file
 -p, --path             Prepend to PATH in assembly script
 -l, --ld-library-path  Prepend to LD_LIBRARY_PATH in assembly script
     --skip-checking    Skip checking availability of other executables
 -h, --help             This message

For usage documentation, run masurca --help.
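The -i option described in the help text takes the Illumina read files as a single comma-separated argument with no spaces. The sketch below shows one way to build such a command line; the function name build_masurca_cmd and the read file names are hypothetical, and the function only prints the command so that it can be checked before it is run.

```shell
#!/bin/bash
# build_masurca_cmd THREADS FORWARD_READS [REVERSE_READS]
# Hypothetical helper: prints a masurca command line that uses the
# -t and -i options listed in the help output above.
build_masurca_cmd() {
    local threads="$1"
    local fwd="$2"
    local rev="${3:-}"
    local reads="$fwd"

    # Paired-end files are joined with a comma and no space,
    # as the help text for -i requires.
    if [ -n "$rev" ]; then
        reads="${fwd},${rev}"
    fi

    echo "masurca -t ${threads} -i ${reads}"
}
```

For example, build_masurca_cmd 2 reads_1.fastq.gz reads_2.fastq.gz prints masurca -t 2 -i reads_1.fastq.gz,reads_2.fastq.gz, which can then be placed in a job script.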

Example job

Selecting the number of threads

To prevent overloading a compute node, set NUM_THREADS=X in the PARAMETERS section of your configuration file, where X is equal to the number of cores requested in your job script.
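One way to keep the two values in step is to rewrite the NUM_THREADS line from within the job script. The following is a minimal sketch, assuming GNU sed and assuming the scheduler exports the number of requested cores as NSLOTS (as Grid Engine does inside a job); the function name set_num_threads is hypothetical.

```shell
#!/bin/bash
# set_num_threads CONFIG [THREADS]
# Hypothetical helper: rewrites the NUM_THREADS line of a MaSuRCA
# configuration file in place. If THREADS is not given, it falls back
# to NSLOTS (assumed to be set by the scheduler), and then to 1.
set_num_threads() {
    local cfg="$1"
    local threads="${2:-${NSLOTS:-1}}"

    # Replace the whole NUM_THREADS=... line; other lines are untouched.
    # The -i (in-place) form used here assumes GNU sed.
    sed -i "s/^NUM_THREADS=.*/NUM_THREADS=${threads}/" "$cfg"
}
```

In a job script, calling set_num_threads example.cfg before masurca example.cfg would make the thread count follow the core request, so only the -pe smp line needs editing when the request changes.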

Serial job

Here is an example job requesting 2 cores and 4GB of memory in total (2GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G

module load masurca

masurca example.cfg
./assemble.sh

Here is the supporting example.cfg file:

DATA
PE= pe 180 20 /path/to/example.fastq
END

PARAMETERS
GRAPH_KMER_SIZE=auto
NUM_THREADS=2
JF_SIZE=200000000
END

Reference