Storage on Klone
Klone has a number of storage options available to users. This page describes the options and best practices for using them.
Overview
The Klone cluster provides several locations for storing data: some are shared across the login and compute nodes, and some are available only on the node where a job is running.
Shared storage
The following storage locations are shared across the login nodes and compute nodes:
Home directory
Every user on Klone has a home directory, which is the default location for storing files. The home directory is located at /mmfs1/home/<username>. It is where you are when you log in to the login node, and it is also accessible from any compute node where you have a job running.
None of the storage locations on Klone are backed up. If you need to store data that you cannot afford to lose, you should use a different storage location – see our guide to storage for more information.
Currently, there is a 10 GiB quota on the home directory. This quota is enforced by the system, and you will not be able to write to your home directory if you exceed it. You can check your quota with the hyakstorage command when you are logged in to any Klone node:
hyakstorage --home
The --home flag tells hyakstorage to check your home directory quota.
The result will look something like this:
Usage report for /mmfs1/home/altan
╭───────────────────┬────────────────────────────┬─────────────────────────────╮
│ │ Disk Usage │ Files Usage │
├───────────────────┼────────────────────────────┼─────────────────────────────┤
│ Total: │ 1GB / 10GB │ 11579 / 256000 files │
│ │ 10% │ 5% │
╰───────────────────┴────────────────────────────┴─────────────────────────────╯
It is important to note that the home directory is not intended for storing large amounts of data. If you need to store more than a few GiB of data, you should use one of the other storage options described below.
A lot of software will save files to your home directory by default. Python, for example, installs packages to ~/.local/lib/python3.6 when they are installed with pip install --user, and Apptainer caches images in ~/.apptainer/cache. Fortunately, both of these locations can be changed by setting environment variables or by using solutions like virtual environments.
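For example, you could point both tools at your group's gscratch directory instead. This is only a sketch: the directory /gscratch/mylab is a hypothetical path, and you should substitute your own group's location.
# Redirect pip install --user packages away from your home directory
export PYTHONUSERBASE=/gscratch/mylab/$USER/python
# Redirect Apptainer's image cache
export APPTAINER_CACHEDIR=/gscratch/mylab/$USER/apptainer-cache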
If you are installing software, you should always check to see where it is being installed, and make sure that it is not being installed to your home directory unless you have a good reason for doing so.
gscratch
Every group on Klone has a directory under /mmfs1/gscratch (also aliased as /gscratch), which is a good place to store data that you are using for a job, as well as any data and libraries you install. The gscratch directory is accessible from any node.
Each group has a quota on its gscratch directory, and you will not be able to write to your group's gscratch directory if you exceed it.
You can find the location of your group's gscratch directory and see how much of your available capacity is used with the hyakstorage command when you are logged in to any Klone node:
hyakstorage --gscratch
The --gscratch flag tells hyakstorage to check your group's gscratch quota.
The result will look something like this:
Usage report for /mmfs1/gscratch/escience
╭───────────────────┬────────────────────────────┬─────────────────────────────╮
│ │ Disk Usage │ Files Usage │
├───────────────────┼────────────────────────────┼─────────────────────────────┤
│ Total: │ 3305GB / 4096GB │ 710698 / 4000000 files │
│ │ 81% │ 18% │
│ My usage: │ 41GB │ 11776 files │
╰───────────────────┴────────────────────────────┴─────────────────────────────╯
The line at the top shows the location of your group's gscratch directory.
Every lab or group has its own quota for /gscratch storage. By default, the quota is 1 TiB per slice (or 4 TiB for a GPU slice). Additional storage is available at $10/TiB/month as of 2023-12-08.
As reported by the hyakstorage utility, the /gscratch filesystem and your home directory are configured to limit the number of files that can be stored by a user or group. If you exceed this limit, you will not be able to write to the filesystem. Furthermore, because of the way the software underlying the filesystem (GPFS) works, having or accessing a large number of files can slow down the filesystem for everyone.
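If you want a quick estimate of how many files a particular directory contributes toward your count, a standard find pipeline works; the path here is only a placeholder.
find /gscratch/mylab/my_project -type f | wc -l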
Some software is prone to creating a large number of files – pip and conda are common culprits. If you are using software that creates a large number of files, you should consider using Apptainer to run your software in a container. The files installed by the software will be compressed and stored in a single file (the container image), which will reduce the number of files on the filesystem and potentially speed up your software.
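As a rough sketch (the image and script names are just examples), you could pull a prebuilt container once and run your code from it instead of installing packages directly on the filesystem:
# Pull an image; it is stored as a single .sif file
apptainer pull python.sif docker://python:3.11
# Run a script with the Python inside the container
apptainer exec python.sif python3 my_script.py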
/gscratch/scrubbed
There is also a directory /gscratch/scrubbed, a scratch area where any data not accessed in 21 days is automatically deleted. This directory is slower than other directories under /gscratch. There is no quota on this directory, but you should not store data here that you cannot afford to lose.
You can browse through the largest files and directories in any directory using the gdu command, which you can load with module load escience/gdu when logged in to a compute node.
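For example, assuming a hypothetical group directory:
module load escience/gdu
gdu /gscratch/mylab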
If you want to look up files nobody has accessed in a while, you can use the find command. The following command will find all files in the current directory that have not been accessed in the last 14 days (-atime +14) and are larger than 1 MiB (-size +1M):
find . -atime +14 -size +1M -printf '%AFT%AH:%AM\t%s\t%p\n' | \
awk '{printf "%s\t%dM\t%s\n", $1, $2/(1024*1024), $3}'
We use awk to convert the file size that find reports in bytes to MiB.