Disk Space Selection for SLURM Jobs

From TAMUQ Research Computing User Documentation Wiki


Overview

Raad2 users can draw on several kinds of disk space to perform I/O within their SLURM jobs. Some of these are listed below:

Option A: Use the space within your home directory (/ddn/home/username or /ddn/gfxhome/username), or within your /ddn/warehouse/username directory if you have one.

Option B: Use the space within your /ddn/scratch/username directory.

Option C: Use the space under the /tmp directory on an individual compute node itself.

Use Your Home Directory

Use the Shared Scratch Directory

Use /tmp on the Compute Node(s)

On the GPU Cluster Nodes

On the XC40 Compute Nodes

These are the nodes allocated to your job if you submit your workload from the raad2a or raad2b login nodes. These nodes are "diskless": they have no local hard drives attached. Most of the directory structure visible on such a compute node lives on a "simulated" disk that is carved out of the node's memory (hence the name "ramdisk") but is presented to the node as a local hard drive. The main exception is the /ddn directory, which is a network-mounted storage system shared with all other compute nodes.
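Because the ramdisk is carved out of the node's memory, the space under /tmp is bounded by that memory, so it can be worth checking how much is free before a job writes large files there. The following is a minimal sketch using Python's standard shutil.disk_usage; the helper name describe_space is hypothetical, and the sizes it reports will depend on how the node is configured.

```python
import os
import shutil


def describe_space(path):
    """Return (total, used, free) in GiB for the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.used / gib, usage.free / gib


if __name__ == "__main__":
    # /tmp is the node-local ramdisk described above; skip it if it is absent.
    if os.path.isdir("/tmp"):
        total, used, free = describe_space("/tmp")
        print(f"/tmp: {free:.1f} GiB free of {total:.1f} GiB")
```

Running this inside a job (rather than on a login node) reports the figures for the compute node that was actually allocated.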

Any data read from or written to a file that does not reside under the /ddn directory therefore never leaves the compute node. In other words, such data does not have to traverse any external network to reach the program that requested it, or to reach the file into which it is to be written. Everything stays within the confines of the compute node, which makes such read and write operations relatively fast. This is the case for anything read from or written to any file under the /tmp directory of a compute node.

In contrast, a program running on the compute node that needs to fetch data from a file such as "/ddn/home/fachaud74/specialProj/dataset1/file99.txt" has to wait longer: the data must first be read from a remote storage system and then travel over an external network (often more than one) before it reaches the requesting program on the compute node. Writing data follows a similar process in the opposite direction.
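The difference between node-local and network-mounted storage can be measured rather than assumed. The sketch below times a sequential write and read-back of a small probe file in a given directory; the function name time_write_read and the 16 MiB default are illustrative choices, not part of any Raad2 tooling. Note that the read-back may be served from the operating system's cache, so the write timing is usually the more telling number.

```python
import os
import time
import uuid


def time_write_read(directory, size_mb=16):
    """Write, then read back, `size_mb` MiB in `directory`.

    Returns (write_seconds, read_seconds). The probe file is always removed.
    """
    chunk = os.urandom(1024 * 1024)  # 1 MiB of random bytes
    path = os.path.join(directory, f"io_probe_{uuid.uuid4().hex}.bin")
    try:
        start = time.perf_counter()
        with open(path, "wb") as f:
            for _ in range(size_mb):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to storage
        write_s = time.perf_counter() - start

        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(1024 * 1024):
                pass
        read_s = time.perf_counter() - start
    finally:
        if os.path.exists(path):
            os.remove(path)
    return write_s, read_s


if __name__ == "__main__":
    # Replace "username" with your own account name; missing paths are skipped.
    for candidate in ("/tmp", "/ddn/scratch/username"):
        if os.path.isdir(candidate):
            w, r = time_write_read(candidate)
            print(f"{candidate}: write {w:.3f} s, read {r:.3f} s")
```

Timings on a shared file system vary with the load from other jobs, so a single run is only a rough indication.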

Consequently, doing I/O to a file under /tmp should be much faster than doing I/O to a file under /ddn.
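One common way to exploit this is to copy the input from /ddn to /tmp once, do all intermediate I/O locally, and copy only the results back to /ddn at the end. The sketch below illustrates that pattern under some assumptions: run_staged and its work callback are hypothetical names, the inputs and outputs are assumed to fit within the ramdisk, and results left under /tmp are assumed not to be kept once the job ends.

```python
import shutil
import tempfile
from pathlib import Path


def run_staged(input_path, output_dir, work, scratch_root="/tmp"):
    """Stage `input_path` into a private directory under `scratch_root`,
    call work(local_input, local_output_dir), then copy the result files
    into `output_dir`. The private directory is deleted afterwards.
    """
    input_path = Path(input_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    with tempfile.TemporaryDirectory(dir=scratch_root) as tmp:
        local_in = Path(tmp) / input_path.name
        local_out = Path(tmp) / "out"
        local_out.mkdir()

        shutil.copy2(input_path, local_in)  # single read from shared storage
        work(local_in, local_out)           # intermediate I/O stays on the node
        for item in local_out.iterdir():    # single write back to shared storage
            if item.is_file():
                shutil.copy2(item, output_dir / item.name)
    return output_dir
```

A job would call it with a function that performs the actual computation, for example run_staged("/ddn/scratch/username/input.dat", "/ddn/scratch/username/results", my_work), where the paths and my_work are placeholders. Since the ramdisk uses the node's memory, this approach suits data sets that are small relative to the memory available on the node.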

[Figure: Tmp-ram-disk.jpg]

Under Construction