Trim Galore
Trim Galore is a wrapper script to automate quality and adaptor trimming as well as quality control, with functionality to remove biased methylation positions for Reduced Representation Bisulfite-Seq sequence files.
Trim Galore is available as a module on Apocrita.
Usage
To run the default installed version of Trim Galore, simply load the trimgalore module:
module load trimgalore
For usage documentation, run trim_galore -help.
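Example job
Serial job
Here is an example job requesting 9 cores and 9GB of memory (1GB per core). Trim Galore is passed --cores 2, which can occupy up to 9 cores, as explained in the notice from the manual further down: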
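#!/bin/bash
#$ -cwd                # Run the job from the current working directory
#$ -j y                # Merge standard error into standard output
#$ -pe smp 9           # Request 9 cores
#$ -l h_rt=1:0:0       # Request 1 hour of runtime
#$ -l h_vmem=1G        # Request 1GB of memory per core

module load trimgalore

# Trim input.fastq, then run FastQC on the trimmed output
trim_galore input.fastq \
    --fastqc \
    --cores 2 \
    --fastqc_args '--nogroup --outdir outdir' \
    --output_dir outdir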
Notice from the manual regarding optimal core requests
Actual core usage: It should be mentioned that the actual number of cores used is a little convoluted.
Assuming that Python 3 is used and pigz is installed, --cores 2 would use 2 cores to read the input (probably not at a high usage though), 2 cores to write to the output (at moderately high usage), and 2 cores for Cutadapt itself + 2 additional cores for Cutadapt (not sure what they are used for) + 1 core for Trim Galore itself.
So this can be up to 9 cores, even though most of them won't be used at 100% for most of the time.
Paired-end processing uses twice as many cores for the validation step.
--cores 4 would then be: 4 (read) + 4 (write) + 4 (Cutadapt) + 2 (extra Cutadapt) + 1 (Trim Galore) = 15. It seems that --cores 4 could be a sweet spot, anything above has diminishing returns.
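The arithmetic in the notice can be sketched as a small shell function, which may help when choosing a value for -pe smp. This is only an illustration of the single-end breakdown quoted above (Python 3 with pigz installed, --cores of 2 or more); the function name estimate_trim_galore_cores is hypothetical and is not part of Trim Galore.

```shell
#!/bin/bash
# Hypothetical helper (not part of Trim Galore): estimate the peak number of
# cores for a single-end run, following the breakdown quoted from the manual:
#   N (read) + N (write) + N (Cutadapt) + 2 (extra Cutadapt) + 1 (Trim Galore)
estimate_trim_galore_cores() {
    local cores="$1"
    echo $(( 3 * cores + 3 ))
}

estimate_trim_galore_cores 2   # prints 9
estimate_trim_galore_cores 4   # prints 15
```

The result is an upper bound rather than a measurement, since the notice itself says most of these cores are not fully used most of the time.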
Based on this information, we recommend that if you are running repeat tasks with Trim Galore, you check the efficiency of your completed jobs with the jobstats command and adjust your requested core count appropriately.
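As a rough guide to what efficiency means in this context, CPU efficiency can be taken as the CPU time consumed divided by the wallclock time multiplied by the number of cores requested. The sketch below assumes you have already read those three figures from your job accounting output; the function name cpu_efficiency and the sample numbers are hypothetical, and the fields actually reported by jobstats may differ.

```shell
#!/bin/bash
# Hypothetical helper: percentage CPU efficiency of a finished job.
# Arguments: CPU seconds used, wallclock seconds, cores requested.
cpu_efficiency() {
    local cpu_seconds="$1" wall_seconds="$2" cores="$3"
    # Integer arithmetic; the result is rounded down to a whole percent.
    echo $(( 100 * cpu_seconds / (wall_seconds * cores) ))
}

# Illustrative numbers only: 5400 CPU-seconds over 1200 s wallclock on 9 cores.
cpu_efficiency 5400 1200 9   # prints 50
```

A persistently low figure suggests that the job would run about as well with fewer requested cores.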