
Trim Galore

Trim Galore is a wrapper script to automate quality and adaptor trimming as well as quality control, with functionality to remove biased methylation positions for Reduced Representation Bisulfite-Seq sequence files.

Trim Galore is available as a module on Apocrita.

Usage

To run the default installed version of Trim Galore, simply load the trimgalore module:

module load trimgalore

For usage documentation, run trim_galore --help.

Example job

Serial job

Here is an example job requesting 9 cores and 9GB of memory in total (1GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 9
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load trimgalore

# --cores 2 can occupy up to 9 cores in total (see the notice below), hence -pe smp 9
trim_galore input.fastq \
            --fastqc \
            --cores 2 \
            --fastqc_args '--nogroup --outdir outdir' \
            --output_dir outdir

Notice from the manual regarding optimal core requests

Actual core usage: It should be mentioned that the actual number of cores used is a little convoluted.

Assuming that Python 3 is used and pigz is installed, --cores 2 would use 2 cores to read the input (probably not at a high usage though), 2 cores to write to the output (at moderately high usage), and 2 cores for Cutadapt itself + 2 additional cores for Cutadapt (not sure what they are used for) + 1 core for Trim Galore itself.

So this can be up to 9 cores, even though most of them won't be used at 100% for most of the time.

Paired-end processing uses twice as many cores for the validation step. --cores 4 would then be: 4 (read) + 4 (write) + 4 (Cutadapt) + 2 (extra Cutadapt) + 1 (Trim Galore) = 15. It seems that --cores 4 could be a sweet spot; anything above has diminishing returns.
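The arithmetic in the notice can be sketched as a small shell helper. This is only an illustration: estimate_total_cores is a hypothetical name, and it assumes the manual's breakdown holds (Python 3 in use, pigz installed, and a fixed overhead of 2 extra Cutadapt cores plus 1 core for Trim Galore). It gives an upper bound, not a measurement of actual usage.

```shell
#!/bin/bash
# Hypothetical helper: upper-bound estimate of the cores Trim Galore may occupy
# for a given --cores value n, following the manual's breakdown:
#   n (read) + n (write) + n (Cutadapt) + 2 (extra Cutadapt) + 1 (Trim Galore)
estimate_total_cores() {
    local n="$1"
    echo $(( 3 * n + 3 ))
}

estimate_total_cores 2   # prints 9
estimate_total_cores 4   # prints 15
```

Both results match the figures quoted from the manual, so the value passed to -pe smp in the job script can be derived from the chosen --cores value in the same way.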

Based on this information, if you run Trim Galore repeatedly, we recommend checking the efficiency of your completed jobs with the jobstats command and adjusting your requested core count accordingly.
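One way to keep --cores in step with the core request as you adjust it is to derive it from the number of slots granted to the job. The sketch below is an assumption-laden illustration, not part of Trim Galore: cores_for_slots is a hypothetical helper that inverts the manual's estimate (total = 3 x cores + 3), and it relies on the scheduler setting the NSLOTS environment variable inside the job.

```shell
#!/bin/bash
# Hypothetical helper: pick a --cores value that fits within the slots granted
# to the job, by inverting the manual's estimate: total = 3 * cores + 3.
cores_for_slots() {
    local slots="$1"
    local cores=$(( (slots - 3) / 3 ))
    # Fall back to a single core when the request is too small for the estimate.
    if (( cores < 1 )); then
        cores=1
    fi
    echo "$cores"
}

# NSLOTS is assumed to be set by the scheduler; 9 is used here only as an
# illustrative default when the script is run outside a job.
cores_for_slots "${NSLOTS:-9}"
```

Inside a job script this could be used as trim_galore --cores "$(cores_for_slots "$NSLOTS")" followed by the usual arguments. Since the manual suggests diminishing returns above --cores 4, requesting more than 15 cores is unlikely to help.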

References