ARC Storage Changes - Spring 2022

During spring break in March 2022, ARC brought online a new IBM ESS storage system running IBM's GPFS software to take over hosting of /projects storage for the Tinkercliffs and Infer clusters and to synchronize the data from the old system. /projects was previously hosted by a Dell storage system running BeeGFS, which had been plagued by frequent outages since it came online in fall 2020.

Outline of Impacts

Impact on /projects on Tinkercliffs and Infer

  • GPFS replaced BeeGFS

  • remains available but is now hosted on a new platform

  • aggregate space available on the hosting system is reduced, but is sufficient to host current needs

  • quotas are tallied differently, but more intuitively

  • beegfs-ctl --getquota ... commands are no longer valid because they're specific to BeeGFS; see the sketch after this list for checking usage under GPFS

  • a few groups will find they are now over the quota and will need to address this
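
With GPFS, quota reporting uses IBM Spectrum Scale tooling instead. The sketch below assumes the standard GPFS client commands are available on the login nodes and that each /projects allocation is implemented as a GPFS fileset; the fileset and filesystem names shown are placeholders, not ARC's actual names.

    # Report quota for the fileset backing your project directory (names are illustrative)
    mmlsquota -j myproject projects

    # Since usage is now the aggregate size of all files, du gives an equivalent tally
    du -sh /projects/myproject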

Impact on /work on Tinkercliffs and Infer

  • Previously hosted by BeeGFS as 1TB personal scratch storage. This mount is being discontinued immediately.

  • the /work mount point is empty and the $WORK environment variable points to an invalid path, so job scripts that rely on either need to be changed

  • Data from /work will remain temporarily available with read-only access at /beegfs/work, on the login nodes only, until Monday, March 28, 2022.

  • Users must migrate any data they need to keep from there to other appropriate file systems prior to March 28, 2022.

ARC Begins a “Use Scratch for Jobs” campaign

  • /fastscratch and Local Scratch storage provide superior performance and scale for in-job storage operations (input/output or I/O).

  • Regular cleanup of files will commence in /fastscratch. Files older than 90 days will be purged starting on ______ (date TBA)
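
To preview which of your files would be subject to a purge like that, you can list anything under your /fastscratch directory that has not been modified in 90 days. This is a minimal sketch; it assumes a per-user directory layout and that the purge is keyed on modification time, both of which may differ from the final policy.

    # List files not modified in the last 90 days
    find /fastscratch/$USER -type f -mtime +90 -ls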

Cascades, Dragonstooth, Huckleberry unaffected at this time

  • /work and /groups on these clusters are hosted by a pre-existing GPFS storage system which remains in production

  • That GPFS filesystem is, however, reaching its end-of-life and groups should begin the work of archiving old data and migrating active projects to new systems.

  • Huckleberry will be decommissioned at the end of the Spring 2022 semester.

  • Cascades and Dragonstooth clusters do not yet have decommission dates established, but are quickly approaching their end-of-life as well.

Actions you may need to take

Get within quota on /projects

If you are over quota on your /projects directory, everyone in your group will be unable to write to that directory. While the quota limits have not changed, the method of calculating usage has. Quota usage on /projects is now computed as the aggregate size of all files in the directory. Previously, some files in the directory were not counted toward usage if their group ownership was inconsistent with the directory's; this is no longer the case, and that change may put some groups over their quota. To get under the quota, you need to reduce the total usage in that directory. Here are some approaches to consider:

  • delete unneeded data

  • archive data which is not being actively used

  • use /fastscratch on Tinkercliffs to hold short-term working data (90 day limit)

  • store large datasets in compressed archive files and only extract the data when it's needed; this can also provide a big performance boost when paired with use of /localscratch (see the sketch after this list)
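
As a sketch of that last approach, assuming a dataset directory named my_dataset inside a project space named /projects/myproject (both names are placeholders):

    # Pack the dataset into a single compressed archive
    cd /projects/myproject
    tar -czf my_dataset.tar.gz my_dataset/

    # Remove the originals only after the archive lists cleanly
    tar -tzf my_dataset.tar.gz > /dev/null && rm -rf my_dataset/

    # In a job, extract to node-local scratch for fast I/O
    # (the local-scratch path is illustrative; use your cluster's actual location)
    tar -xzf /projects/myproject/my_dataset.tar.gz -C /localscratch/$SLURM_JOB_ID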

Retrieve data that was previously in /work

If you were actively using /work or need to retain files/data from /work, then you have a short window of time to retrieve that data before the hosting BeeGFS system is taken offline. Those files are available through March 28, 2022 on Tinkercliffs and Infer login nodes at /beegfs/work.
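
For example, the following copies a retained directory from the read-only mount into project storage when run on a Tinkercliffs or Infer login node; the user and project directory names are placeholders.

    # Copy data you need to keep from the read-only BeeGFS mount to /projects
    rsync -av /beegfs/work/myusername/keep_this/ /projects/myproject/keep_this/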

Change jobscripts so that /work path and $WORK environment variable are not used

Job scripts for Tinkercliffs or Infer clusters which reference or otherwise rely on /work/ paths or the $WORK environment variable need to be adjusted to use different filesystems.
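
A quick way to find affected scripts is to search for references to the old path or variable. This sketch assumes your job scripts live under your home directory and end in .sh:

    # Find job scripts that mention /work paths or the $WORK variable
    grep -rnE --include='*.sh' '(/work/|\$WORK)' ~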

Note

The /work mount point is no longer available and the $WORK environment variable now expands to the invalid path /notavailable.

Note

For a limited time, /beegfs/work is available read-only so that data can be retrieved to other locations before BeeGFS is taken fully offline.

We recommend using /fastscratch on Tinkercliffs as a working space for staging jobs, but note that files which are unmodified for 90 days will be automatically deleted from this location. For longer-term storage, use a /projects storage location, or archive files and move them off of ARC's active storage filesystems.
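
As a minimal jobscript sketch of this pattern on Tinkercliffs (the account, partition, program, and directory names are placeholders; adapt them to your allocation):

    #!/bin/bash
    #SBATCH --account=myaccount    # placeholder allocation
    #SBATCH --partition=normal_q   # placeholder partition
    #SBATCH --nodes=1
    #SBATCH --time=01:00:00

    # Stage working files on /fastscratch instead of the retired $WORK
    STAGE=/fastscratch/$USER/$SLURM_JOB_ID
    mkdir -p "$STAGE"
    cp /projects/myproject/input.dat "$STAGE"/

    cd "$STAGE"
    ./my_program input.dat

    # Copy results back to longer-term /projects storage before the job ends
    cp results.out /projects/myproject/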

Get Help