ABySS¶
ABySS is a parallel, paired-end sequence assembler that is designed for short reads. It is implemented using MPI and is capable of assembling large genomes.
Installation¶
Conda installation (recommended)¶
ABySS can be installed from the Bioconda channel. Load the Miniforge module, create a Conda environment, activate it, and install ABySS into it (output below truncated):
$ module load miniforge
$ mamba create --quiet --yes --name abyss-env
$ mamba activate abyss-env
(abyss-env) $ mamba install -c bioconda -c conda-forge abyss
Looking for: ['abyss']
...
Updating specs:
- abyss
...
Confirm changes: [Y/n] Y
...
Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
(abyss-env) $ ABYSS --help
Usage: ABYSS -k<kmer> -o<output.fa> [OPTION]... FILE...
Assemble the input files, FILE, which may be in FASTA, FASTQ,
qseq, export, SAM or BAM format and compressed with gz, bz2 or xz.
Usage¶
To run the Conda-installed version of ABySS, load the miniforge module and activate the Conda environment you installed it into:
$ module load miniforge
$ mamba activate abyss-env
(abyss-env) $ ABYSS --help
Usage: ABYSS -k<kmer> -o<output.fa> [OPTION]... FILE...
Assemble the input files, FILE, which may be in FASTA, FASTQ,
qseq, export, SAM or BAM format and compressed with gz, bz2 or xz.
For full usage documentation, run ABYSS --help.
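After activating the environment, it can be useful to confirm that the ABySS executables are actually on your PATH before submitting a job. The following is a minimal sketch: require_cmds is a hypothetical helper written for illustration, not something shipped with ABySS or Conda, and the command names passed to it are the two that appear on this page.

```shell
#!/bin/bash
# require_cmds is a hypothetical helper, not part of ABySS or Conda.
# It reports every named command that is not found on PATH and
# returns non-zero if at least one is missing.
require_cmds() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "not found on PATH: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, after `mamba activate abyss-env`:
#   require_cmds ABYSS abyss-pe || echo "check that abyss-env is active"
```

The helper checks all names before returning, so a single run lists every missing command rather than stopping at the first.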
Example job¶
Parallel job¶
Here is an example job running on 96 cores across 2 ddy nodes using MPI:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe parallel 96
#$ -l infiniband=ddy-i
#$ -l h_rt=240:0:0
module load miniforge
mamba activate abyss-env
abyss-pe --directory=output k=25 np=${NSLOTS} \
name=test B=1G in='reads1.fastq reads2.fastq'
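A long job that fails immediately because of a missing file wastes queue time, so a short pre-flight check before the abyss-pe line may help. The sketch below rests on two assumptions that you should verify on your system: that abyss-pe hands --directory to GNU make, which changes into that directory before running and stops if it does not exist, and that relative paths given to in= are therefore resolved against that directory. The preflight function is a hypothetical helper, not part of ABySS.

```shell
#!/bin/bash
# preflight is a hypothetical helper, not part of ABySS.
# It creates the output directory if needed and checks that every
# read file is readable, returning non-zero if any is not.
preflight() {
    local outdir="$1"
    shift
    mkdir -p "$outdir" || return 1
    local f missing=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage, before the abyss-pe line in the job script
# (absolute paths avoid any ambiguity about the working directory):
#   preflight output "$PWD/reads1.fastq" "$PWD/reads2.fastq" || exit 1
```

If the assumption about relative paths holds for your installation, passing the same absolute paths to in= would keep the check and the assembly pointing at the same files.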