
FreeBayes

FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically single-nucleotide polymorphisms, indels (insertions and deletions), multi-nucleotide polymorphisms, and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.

Conda installation

FreeBayes can be installed from the Bioconda channel. Load the Miniforge module, create a Conda environment, activate it, and install FreeBayes into it (the output below is truncated):

$ module load miniforge
$ mamba create --quiet --yes --name freebayes_env
$ mamba activate freebayes_env
(freebayes_env) $ mamba install bioconda::freebayes

Looking for: ['bioconda::freebayes']
...
  Updating specs:

   - bioconda::freebayes
...
Confirm changes: [Y/n] Y
...
Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Usage

To run FreeBayes, load the Miniforge module and activate the Conda environment it was installed into:

$ module load miniforge
$ mamba activate freebayes_env
(freebayes_env) $ freebayes -h
usage: freebayes -f [REFERENCE] [OPTIONS] [BAM FILES] >[OUTPUT]

For full usage documentation, run freebayes -h (or freebayes-parallel -h for the multi-core version).

Example jobs

Serial job

Here is an example job running on 1 core and 1GB of memory:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load miniforge
mamba activate freebayes_env

freebayes --fasta-reference <fastafile> <bamfile> > <output>
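
If the output is redirected to a file (the `>[OUTPUT]` part of the usage line), a quick sanity check is to count the variant records it contains. The following is a minimal sketch and not part of FreeBayes: `count_variants` is a hypothetical helper, and the demo file is made-up data that only mimics the VCF layout (header lines start with `#`, one record per remaining line).

```shell
# Hypothetical helper (not part of FreeBayes): count variant records in a
# VCF file by counting the lines that do not start with '#'.
count_variants() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so "|| true" keeps the function usable under "set -e"
    grep -vc '^#' "$1" || true
}

# Demo on a made-up, minimal VCF (illustrative data only)
demo_vcf="$(mktemp)"
printf '##fileformat=VCFv4.2\n' > "$demo_vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$demo_vcf"
printf 'chrA\t10\t.\tA\tG\t50\t.\t.\n' >> "$demo_vcf"
printf 'chrA\t42\t.\tT\tC\t60\t.\t.\n' >> "$demo_vcf"

count_variants "$demo_vcf"
```

In a real job, the argument would be the file that the FreeBayes output was redirected to.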

Serial job (multi-core)

FreeBayes Parallel Usage

Although the freebayes-parallel command runs FreeBayes on multiple cores, it has no MPI support and therefore cannot run a job across multiple nodes.

Here is an example job running on 12 cores and 12GB of memory (1GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 12
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load miniforge
mamba activate freebayes_env

freebayes-parallel <(fasta_generate_regions.py <fastafile> <region size>) \
${NSLOTS} -f <fastafile> <bamfile> > <output>
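
In the command above, freebayes-parallel receives a list of genomic regions as its first argument and works through them on the requested number of cores. To illustrate how a reference can be cut into fixed-size regions, here is a minimal sketch. `make_regions` is a hypothetical stand-in, not the `fasta_generate_regions.py` script itself. It assumes a `.fai`-style index (sequence name in column 1, length in column 2) and a `name:start-end` region format with 0-based starts. The demo index is made-up data.

```shell
# Hypothetical stand-in for a region generator: split every sequence in a
# .fai-style index (column 1 = name, column 2 = length) into regions of at
# most <size> bases, printed as name:start-end.
make_regions() {
    awk -v size="$2" '{
        for (start = 0; start < $2; start += size) {
            end = start + size
            if (end > $2) end = $2    # clip the last region to the sequence length
            printf "%s:%d-%d\n", $1, start, end
        }
    }' "$1"
}

# Demo with a made-up two-sequence index (illustrative data only)
demo_fai="$(mktemp)"
printf 'chrA\t250\nchrB\t100\n' > "$demo_fai"

make_regions "$demo_fai" 100
```

With a region size of 100, the 250-base sequence is split into three regions and the 100-base sequence into one. As a rule of thumb, smaller regions give more tasks to spread across the cores, at the cost of more start-up overhead per task.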

References