Using $TMPDIR
There is temporary space available on the nodes that can be used when you submit a job to the cluster.
As this storage is physically located on the nodes, it is not shared between nodes, but it will provide better performance for read/write (I/O) intensive tasks on a single node than networked storage. However, to use the temporary scratch space, you will need to copy files from networked storage to the temporary scratch space. In addition, if a job fails then any intermediate files created may be lost.
If your job does a lot of I/O operations to large files, it may therefore improve performance to:
- copy files from your home directory into the temporary folder
- run your job in the temporary folder
- copy files back from the temporary folder to your home directory if needed
- delete them from the temporary folder as soon as they're no longer needed
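The four steps above can be sketched as a short script that also runs outside the scheduler. This is a minimal illustration rather than a recommended job script: PROJECT and SCRATCH are hypothetical stand-ins for your project folder and the temporary folder, and tr stands in for the real workload.

```shell
#!/bin/bash
# Stand-in for the project folder on networked storage (hypothetical;
# a real job would use something like $HOME/project).
PROJECT=$(mktemp -d)
echo "1 2 3" > "$PROJECT/data.file"

# Use the scheduler's $TMPDIR when it is set, otherwise fall back to /tmp
# so the sketch can be tried outside a job.
SCRATCH=$(mktemp -d "${TMPDIR:-/tmp}/scratch.XXXXXX")

# 1. Copy the input into the temporary folder
cp "$PROJECT/data.file" "$SCRATCH/"

# 2. Run the work inside the temporary folder (tr stands in for the real job)
( cd "$SCRATCH" && tr ' ' '\n' < data.file > results.data )

# 3. Copy the results back to the project folder
cp "$SCRATCH/results.data" "$PROJECT/"

# 4. Delete the scratch copies as soon as they are no longer needed
rm -rf "$SCRATCH"
```

Creating a subfolder inside the temporary area keeps the final rm -rf limited to files this script created.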
Basic example
The following job runs the shell script ./runcode.sh in the project folder
beneath the user's home directory. The data is held on networked storage at this point.
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
cd $HOME/project
./runcode.sh
On any node the temporary scratch directory is accessed using the variable
$TMPDIR. If specific, known files are needed in your processing, you can copy
your data to that space before working on it.
The following job:
- copies data.file from the project directory to the temporary area
- sets the current working directory to the temporary area
- runs the appropriate code
- copies the output file results.data back to the project directory
This is the equivalent of the previous example, but using the temporary storage.
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
# Copy data.file from the project directory to the temporary scratch space
cp $HOME/project/data.file $TMPDIR
# Move into the temporary scratch space where your data now is
cd $TMPDIR
# Do processing - as this is a small shell script, it is run from the network storage
$HOME/project/runcode.sh
# Copy results.data back to the project directory from the temporary scratch space
cp $TMPDIR/results.data $HOME/project/
If you do not know, or cannot list, all the output files that you would
like to move back to your home directory, you can use rsync to copy only
the changed and new files back at the end of the job. This saves time and avoids
unnecessary copying.
The following job:
- copies files to the temporary scratch area
- runs the shell script ./runcode.sh on the local copy
- copies the results back to networked storage
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G
# Source folder for data
DATADIR=$HOME/project
# Copy data (inc. subfolders) to temporary storage
rsync -rltv $DATADIR/ $TMPDIR/
# Run job from temporary folder
cd $TMPDIR
./runcode.sh
# Copy changed files back
rsync -rltv $TMPDIR/ $DATADIR/
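To check which files rsync would copy back before relying on it in a job, a dry run can help. The sketch below assumes rsync is installed locally and uses two throwaway directories, DATADIR and SCRATCH, as hypothetical stand-ins for $HOME/project and $TMPDIR; the file names mirror the earlier example.

```shell
#!/bin/bash
# Stand-ins for $HOME/project and $TMPDIR so the sketch runs anywhere.
DATADIR=$(mktemp -d)
SCRATCH=$(mktemp -d)
echo "input" > "$DATADIR/data.file"

# Stage in, as in the job above. -t preserves modification times, which is
# what lets rsync recognise unchanged files later.
rsync -rlt "$DATADIR/" "$SCRATCH/"

# Pretend the job wrote one new file and left the input untouched.
echo "output" > "$SCRATCH/results.data"

# -n (--dry-run) lists what would be copied back without copying anything.
PREVIEW=$(rsync -rltvn "$SCRATCH/" "$DATADIR/")
echo "$PREVIEW"
```

The listing should name results.data but not data.file, because the staged copy of data.file still has the same size and modification time as the original.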
Viewing temporary files
To view temporary files while the job is running (to ensure the job is correct)
you can ssh to the node.
The path to the files is made up of the job id, task id and queue name, in the form /tmp/<job id>.<task id>.<queue name>.
$ qstat
3672630 5.00638 tempFilejob abc123 r 04/08/2016 14:20:57 all.q@sdx2 4
$ ssh sdx2
$ ls /tmp/3672630.1.all.q
temp_file1  temp_file2
SSH Connections
As per the Usage Policy, SSH sessions on nodes should be limited to monitoring jobs.
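Since the path is built from the job id, task id and queue name, it can be reconstructed from the qstat output. The helper below is a hypothetical sketch: scratch_path is not a cluster command, the /tmp/<job id>.<task id>.<queue name> layout is taken from the listing above and may differ elsewhere, and treating a missing or "undefined" task id as 1 is an assumption based on that same listing.

```shell
#!/bin/bash
# Hypothetical helper: rebuild a job's scratch path from its three parts.
scratch_path() {
    local job_id=$1 task_id=$2 queue=$3
    # Assumption: jobs without an array task id use 1, as in the listing above.
    if [ -z "$task_id" ] || [ "$task_id" = "undefined" ]; then
        task_id=1
    fi
    printf '/tmp/%s.%s.%s\n' "$job_id" "$task_id" "$queue"
}

# qstat reports the queue and node together, e.g. all.q@sdx2; split them.
instance="all.q@sdx2"
queue=${instance%@*}   # all.q
node=${instance#*@}    # sdx2

scratch_path 3672630 1 "$queue"
```

Inside a running job the same helper could be fed the scheduler-provided variables, for example scratch_path "$JOB_ID" "$SGE_TASK_ID" "$QUEUE", although $TMPDIR already holds the path there.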
Advanced example
This advanced example uses rsync for speed and ensures that cleanup happens at the end of the job or when the job hits the soft runtime limit (s_rt), at which point the scheduler sends the job a SIGUSR1 signal.
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_vmem=2G
#$ -l h_rt=1:0:0   # Request 1 hour runtime
#$ -l s_rt=0:55:0  # Clean up after 55 minutes
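# Copy results back to the submission directory; runs on SIGUSR1 or on exit
function Cleanup ()
{
    trap "" SIGUSR1 EXIT # Disable trap now we're in it
    # Clean up task
    rsync -rltv "$TMPDIR/" "$DATADIR/"
    exit 0
}
# Remember the submission directory (the job starts there because of -cwd)
DATADIR=$(pwd)
trap Cleanup SIGUSR1 EXIT # Enable trap
# Stage the data into the temporary scratch space and work there
cd "$TMPDIR"
rsync -rltv "$DATADIR/" "$TMPDIR/"
# Job
./runcode.sh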
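The trap pattern can be tried without submitting a job. The sketch below is an illustration under assumptions, not the cluster's behaviour: DATADIR and SCRATCH are throwaway stand-ins for the submission directory and $TMPDIR, cp replaces rsync so it runs on any machine, and kill simulates the signal sent at the soft limit. It needs bash 4 or later for $BASHPID.

```shell
#!/bin/bash
# Stand-ins for the submission directory and $TMPDIR (hypothetical).
DATADIR=$(mktemp -d)
SCRATCH=$(mktemp -d)

# The subshell plays the part of the job script.
(
    Cleanup() {
        trap "" USR1 EXIT               # Disable the trap so Cleanup runs once
        cp -R "$SCRATCH"/. "$DATADIR"/  # Stand-in for the rsync copy-back
        exit 0
    }
    trap Cleanup USR1 EXIT

    cd "$SCRATCH" || exit 1
    echo "partial result" > results.data

    kill -USR1 "$BASHPID"               # Simulate the soft-limit signal
    echo "never written" > late.data    # Not reached: Cleanup has exited
)

ls "$DATADIR"
```

Only results.data should appear in DATADIR: the handler copies what exists at the moment the signal arrives and then ends the script, which is why the soft limit needs to leave enough time for the copy to finish before the hard limit.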