ANGSD

ANGSD can calculate various summary statistics and perform association mapping and population genetic analyses, utilising the full information in next-generation sequencing data.

Conda installation

ANGSD can be installed from the Bioconda channel. Load the Miniforge module, create a Conda environment, and install ANGSD into it (output below truncated):

$ module load miniforge
$ mamba create --quiet --yes --name angsd_env
$ mamba activate angsd_env
(angsd_env) $ mamba install bioconda::angsd

Looking for: ['bioconda::angsd']
...
  Updating specs:

   - bioconda::angsd
...
Confirm changes: [Y/n] Y
...
Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Usage

Use Conda for R and R packages

ANGSD is often combined with R. In these cases, do not load the Apocrita R module; use Conda for R and all R packages instead.
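As a minimal sketch of how to confirm that R comes from Conda rather than from a module, the helper below checks that Rscript resolves inside the active environment. The function name r_from_conda is hypothetical, and the sketch assumes R has already been installed into the environment (for example from the conda-forge r-base package) and that CONDA_PREFIX is set by environment activation.

```bash
# Check that Rscript resolves inside the active Conda environment.
# Assumes R was installed into the environment beforehand, e.g. with:
#   mamba install conda-forge::r-base
# r_from_conda is a hypothetical helper name used for illustration.
r_from_conda() {
    # CONDA_PREFIX is set when a Conda environment is activated
    [ -n "${CONDA_PREFIX:-}" ] || return 1

    local rscript_path
    # Fail if no Rscript is on the PATH at all
    rscript_path="$(command -v Rscript)" || return 1

    # Succeed only if Rscript lives under the environment prefix
    case "${rscript_path}" in
        "${CONDA_PREFIX}"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After activating the environment, a call such as r_from_conda || echo "Rscript is not provided by the active Conda environment" >&2 would flag a session where an R module is shadowing the Conda installation.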

To run the Conda-installed version of ANGSD, load the Miniforge module and activate the Conda environment you installed it into:

$ module load miniforge
$ mamba activate angsd_env
(angsd_env) $ angsd
    -> angsd version: 0.940-dirty (htslib: 1.21) build(Apr  3 2024 04:57:10)
    -> angsd
    -> No '-out' argument given, output files will be called 'angsdput'

    -> angsd version: 0.940-dirty (htslib: 1.21) build(Apr  3 2024 04:57:09)
    -> Please use the website "http://www.popgen.dk/angsd" as reference
    -> Use -nThreads or -P for number of threads allocated to the program
    Overview of methods:
    -GL         Estimate genotype likelihoods
    -doCounts   Calculate various counts statistics
    -doAsso     Perform association study
    -doMaf      Estimate allele frequencies
...

Example job

Serial job

Here is an example job running with 2 cores and 10GB of memory (5GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=5G

module load miniforge
mamba activate angsd_env

angsd -out outFile \
      -bam bam.list \
      -GL 1 \
      -doMaf 1 \
      -doMajorMinor 1 \
      -nThreads ${NSLOTS}

In this example, allele frequencies are estimated from genotype likelihoods, with BAM files as input. The scheduler sets ${NSLOTS} to the number of cores requested, so ANGSD runs with 2 threads.
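The job script passes bam.list to the -bam option, which expects a text file listing one BAM file path per line. One possible way to build such a list is sketched below. The function name make_bam_list and the ./bams directory in the usage comment are hypothetical and only serve as illustration.

```bash
# Write the paths of all BAM files in a directory to a list file,
# one path per line, sorted so that the order is reproducible.
# make_bam_list is a hypothetical helper name used for illustration.
make_bam_list() {
    local bam_dir="$1"   # directory containing the *.bam files
    local out_file="$2"  # list file to create, e.g. bam.list

    # Only look at regular files directly inside bam_dir
    find "${bam_dir}" -maxdepth 1 -type f -name '*.bam' | sort > "${out_file}"
}

# Example call (assumes a ./bams directory exists):
#   make_bam_list ./bams bam.list
```

Paths are written as given, so a relative directory produces relative paths. Because the job script runs with -cwd, these would be resolved against the directory the job is submitted from.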

References