Video Tutorials

ARC provides a number of video tutorials on our channel on video.vt.edu. In particular, the following sequence walks a user through the fundamentals of ARC usage in less than an hour:

Accessing Software

The following videos will walk the user through accessing software that ARC has installed or through setting up your own packages:

Note that these videos require a VT Login to access. Also, each video has a table of contents that can be used to skip between sections; this can be accessed by clicking the “hamburger” (three horizontal bars) button at the top left of the video.

We are actively updating this page.

We try to make very few assumptions about the experience of the user so that videos are of value to novices and those with experience. All users of all types are welcome on ARC resources and are encouraged to use them.

We try to keep things modular so that you can pinpoint the video(s) that are useful to you.

General Overview

These videos provide a general overview of using ARC systems.

Overview of ARC systems.
- Overview of ARC systems and jobs
How to get help with and how to learn about ARC resources.
- Help and learn
A brief description of the relationship between ARC (text) documentation and ARC videos.
- Complementing ARC docs

Data on ARC Clusters: Your Data Are Not Backed Up by ARC

Data backups on ARC clusters, should you need them, are YOUR RESPONSIBILITY as a user. ARC does not back up any data on any cluster. However, there is an option to back up data (statically) for a cost.
- All data backups are a user’s responsibility; there is a cost-based option

Connecting to ARC Clusters

There are multiple ways to connect to ARC clusters.

How to access ARC clusters using a terminal window on your laptop and ssh (and streamlining access for subsequent logins). This approach uses the command line for working on clusters.
- Login with SSH
- Using SSH Keys and Agent to simplify logins
How to connect to ARC clusters using VS Code. This approach uses an IDE for working on clusters.
- Connect to ARC Systems with VS Code
How to access ARC clusters using Open OnDemand (OOD). This approach uses an assortment of ways for working on clusters (e.g., command line, UIs).
- Open OnDemand
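
As a rough illustration of how subsequent SSH logins can be streamlined, the sketch below generates an OpenSSH client configuration stanza. It is not taken from the videos. The alias, host name, user name, and key path are hypothetical placeholders; only the `Host`, `HostName`, `User`, and `IdentityFile` keywords are standard OpenSSH options.

```python
def ssh_config_entry(alias, hostname, user, key_path):
    """Return an OpenSSH client config stanza for one login host."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        f"    User {user}",
        f"    IdentityFile {key_path}",
    ]
    return "\n".join(lines) + "\n"


# All values below are hypothetical; substitute your own host, user, and key.
entry = ssh_config_entry(
    alias="mycluster",
    hostname="login.cluster.example.edu",
    user="your_username",
    key_path="~/.ssh/id_ed25519",
)
print(entry)
```

If a stanza like this were appended to `~/.ssh/config`, `ssh mycluster` would expand to the full host, user, and key settings.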

Directory Structures and Mounts

This video describes the various mounts/directories for doing different types of work and for storing files.
- Mounts and Directories

How to Run Codes—Your Own or Commercial/Open Software

Whether you run your own code (e.g., built using some programming language) or a commercial/open-source code such as Ansys, you will interact with the scheduler. The scheduler on ARC clusters is Slurm. The first video is a concise introduction to running interactive and batch jobs. The subsequent videos are motivated by cluster configurations that increase job throughput on the clusters and significantly reduce wait time. Specifying job resources more precisely requires concepts, such as constraints, that are not covered in the first video.

How to run both interactive and batch jobs.
- Interactive and batch jobs
Learn the different types of compute nodes on clusters and how to specify them for your jobs. Watch this before the later videos of this section.
- Compute node partitions on clusters
A more detailed example of an interactive job that uses the material in the second video—that is, how to use constraints to specify compute node types.
- Interactive jobs with constraints
A more detailed example of a batch job that uses the material in the second video—that is, how to use constraints to specify compute node types.
- Batch jobs with constraints
Examples of Slurm batch jobs where files (input files, code files, output files) use volatile resources to speed up file input/output.
- Batch jobs using volatile resources
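
To make the idea of a batch job with a constraint more concrete, here is a minimal sketch that assembles the text of a Slurm batch script. It is an illustration under stated assumptions, not the script used in the videos. The `#SBATCH` options shown (`--job-name`, `--account`, `--partition`, `--nodes`, `--ntasks`, `--time`, `--constraint`) are standard `sbatch` options, but the account, partition, constraint, and command values are hypothetical; check the cluster documentation for the real names.

```python
def make_batch_script(job_name, account, partition, command,
                      nodes=1, ntasks=1, time_limit="01:00:00",
                      constraint=None):
    """Return the text of a Slurm batch script as a string."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
    ]
    # A constraint narrows the job to nodes that advertise a given feature.
    if constraint is not None:
        lines.append(f"#SBATCH --constraint={constraint}")
    lines += ["", command]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
script = make_batch_script(
    job_name="demo",
    account="example_account",
    partition="example_q",
    command="python my_script.py",
    constraint="example_feature",
)
print(script)
```

The generated text could be saved to a file and submitted with `sbatch`.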

Self-Monitoring Your Activities and Code Execution to Understand How Your Code Is Performing

ARC clusters are communal resources. They work best when everyone knows the purposes of the different types of nodes (i.e., computers) on a cluster. These videos describe appropriate (and inappropriate) uses and, just as importantly, how to monitor your own use of the computing resources. Monitoring your (computational) jobs can also give you insights into your codes and their performance. If you find that your code in a batch job is not performing as expected/desired, you can terminate the job. Similarly, if you find that you are (inadvertently) running prohibited types of processes on head nodes, you can terminate those.

Appropriate use of login (head) nodes of ARC clusters.
- Appropriate uses of login and compute nodes
How you can self-monitor your use of login (head) nodes and kill/terminate processes that should not be run on head nodes.
- How to self-monitor login nodes
Working with Slurm batch jobs.
- How you can self-monitor your Slurm batch jobs and understand your code’s performance.
  - How to self-monitor compute nodes and your Slurm jobs
- How to terminate Slurm batch jobs.
  - How to kill or cancel a Slurm batch job
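
The monitoring and cancellation steps above usually come down to a few Slurm commands. The sketch below builds those command lines without running them, so it works on a machine without Slurm installed. `squeue`, `sacct`, and `scancel` are standard Slurm tools; the user name and job ID are hypothetical, and the chosen `sacct` fields are just one reasonable selection.

```python
import shlex


def queue_command(user):
    """List a user's pending and running jobs."""
    return ["squeue", f"--user={user}"]


def accounting_command(job_id):
    """Report state, run time, and peak memory for one job."""
    return ["sacct", f"--jobs={job_id}",
            "--format=JobID,JobName,State,Elapsed,MaxRSS"]


def cancel_command(job_id):
    """Cancel (terminate) one job."""
    return ["scancel", str(job_id)]


# Hypothetical user name and job ID, printed as shell-ready strings.
for cmd in (queue_command("your_username"),
            accounting_command(12345),
            cancel_command(12345)):
    print(shlex.join(cmd))
```

On a cluster, each list could be passed to `subprocess.run` or typed directly at the shell prompt.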

File Compression and Archiving

File compression and file archiving are two separate activities. However, they can also be used in combination. These videos describe how to compress and archive files.

How to compress files.
- Using file compression tools
How to create an archive (a single file) that contains many files, and how to compress this resulting file.
- Creating a single file (archive) containing many other files
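
The distinction between compressing and archiving can be sketched with Python's standard library. This is an illustration only; the videos may well use command-line tools instead. The directory and file names are made up, and everything runs inside a temporary directory that is deleted afterwards.

```python
import gzip
import shutil
import tarfile
import tempfile
from pathlib import Path


def compress_file(src):
    """Compress one file with gzip, writing '<name>.gz' next to it."""
    dest = src.with_name(src.name + ".gz")
    with open(src, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dest


def archive_directory(src_dir, archive, compress=True):
    """Bundle a directory into one tar file, optionally gzip-compressed."""
    mode = "w:gz" if compress else "w"
    with tarfile.open(archive, mode) as tar:
        tar.add(src_dir, arcname=src_dir.name)
    return archive


def list_archive(archive):
    """Return the sorted member names stored in a tar archive."""
    with tarfile.open(archive, "r:*") as tar:
        return sorted(tar.getnames())


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    data = work / "results"  # hypothetical directory of output files
    data.mkdir()
    for i in range(3):
        (data / f"run{i}.txt").write_text("sample output\n" * 100)

    # Archiving: many files become one file (here also compressed).
    tgz = archive_directory(data, work / "results.tar.gz")
    names = list_archive(tgz)

    # Compression: one file becomes one smaller file.
    original = data / "run0.txt"
    gz = compress_file(original)
    gz_is_smaller = gz.stat().st_size < original.stat().st_size

print(names)
print(gz_is_smaller)
```

Archiving alone (`compress=False`) produces a single file of roughly the combined size of its members; adding compression is what reduces the size.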

File Transfer: Transferring Files onto and off of Clusters

One can use these tools to transfer files between and within machines/computers; here, our focus is ARC clusters.

Overview of utilities/tools for copying files onto and off of ARC clusters.
- Overview of file copying/movement tools
How to use scp (secure copy protocol) for copying files onto and off of ARC clusters.
- Using scp to transfer files
How to use sftp (secure file transfer protocol) for copying files onto and off of ARC clusters.
- Using sftp to transfer files
How to use rsync for copying files onto and off of ARC clusters, and copying files within a cluster’s storage.
- Using rsync to transfer files
How to use Globus for copying (very large) files between computers (e.g., onto and off of ARC clusters) with fault tolerance and a UI.
- Synopsis: Globus is a very powerful tool. While the other file transfer tools can complete their work with a single command, Globus is more of a file transfer environment. Hence it is more complicated than the other tools and has a higher start-up cost for you. But do not be put off: it is not that complicated and it is VERY powerful. There is a reason why Globus is the de facto standard tool for large file transfers. Here we present a sequence of videos, which should be watched in order because some videos depend on content from previous videos in the series. We have broken the information into shorter videos so that users can “enter” this sequence where it is appropriate for them.
  - Globus overview. For those who do not know Globus or want a refresher in what it does.
    - Globus overview
  - Globus prerequisites. Complete these steps (which are required regardless of Globus use) before working with Globus; it will make your life much easier. (Still to be added: how to make a project area visible to Globus using ColdFront.)
    - Globus prerequisites
  - Roadmap of Globus-specific videos. This short video demonstrates the unified view used to present the primary Globus features in the remaining videos.
    - Globus videos roadmap
  - Unlinking previous Globus identity. This video is relevant only for people coming to VT from another institution where they had a Globus account. Because your Globus identity is, in general, tied to your institution, you must update your Globus account so that it is affiliated with your new institution (presumably Virginia Tech). This video shows how to “unlink” from your previous institution. Then, in the videos below, you can follow the instructions to establish your new VT-affiliated Globus identity; these are the same steps used to create a VT-affiliated Globus account for someone who has never had a Globus account.
    - Unlinking previous Globus identity
  - Create a Globus account and log in. This is used by new Globus users and those Globus users who have come from another institution and have unlinked their previous Globus identity. We also ensure that you can see the main VT collection.
    - Create a Globus account and log in
  - File transfer demonstration using Globus. This demo is a file transfer within one collection on one cluster. The explanation is generalized to illustrate how to perform file transfers between two clusters using different collections. This demo also shows how the directories Globus displays are the same as those under “/projects” on the ARC clusters.
    - Globus file transfer
  - Install Globus Connect Personal (GCP) on your laptop. GCP is used with Globus to enable file transfers between your local computer and ARC clusters.
    - Install Globus Connect Personal on your local computer
  - File transfer demonstration using Globus and Globus Connect Personal (GCP). GCP is used with Globus to enable file transfers between your local computer and ARC clusters.
    - Transfer files between your local computer and ARC clusters
  - Set up Globus Guest Collections. Globus Guest Collections (GGCs) are areas of storage that are accessible to you and other people that you designate. These “others” can be colleagues at other institutions around the world, making for powerful file sharing.
    - TODO
  - Setting permissions on files within Globus collections. Just because some directory within a collection is visible to Globus does NOT mean that all of the files and directories within the collection are visible. You can still use Linux/Unix permissions to control Globus’ access to your files and directories. This video shows the issues and how you can control file and directory visibility.
    - TODO
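
For the single-command tools listed in this section (scp and rsync), the sketch below shows how typical command lines are put together. It only builds the commands and does not run them. The user name, host name, and paths are hypothetical; `scp -r`, `rsync -av`, and `rsync --dry-run` are standard options of those tools.

```python
import shlex


def remote_path(user, host, path):
    """Format a remote location as user@host:path."""
    return f"{user}@{host}:{path}"


def scp_command(source, destination, recursive=False):
    """Build an scp command; recursive=True copies whole directories."""
    cmd = ["scp"]
    if recursive:
        cmd.append("-r")
    return cmd + [source, destination]


def rsync_command(source, destination, dry_run=False):
    """Build an rsync command in archive (-a) and verbose (-v) mode."""
    cmd = ["rsync", "-av"]
    if dry_run:
        # Report what would be transferred without copying anything.
        cmd.append("--dry-run")
    return cmd + [source, destination]


# Hypothetical user, host, and paths for illustration only.
target = remote_path("your_username", "login.cluster.example.edu",
                     "/path/on/cluster/")
print(shlex.join(scp_command("data.csv", target)))
# A trailing slash on the rsync source copies the directory's contents.
print(shlex.join(rsync_command("results/", target, dry_run=True)))
```

Running rsync with `--dry-run` first is a cheap way to confirm the source and destination before a large transfer.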

Environments

Environments are collections of software that tailor an otherwise “basic” computing ecosystem into one that supports your particular computing needs for a particular type of task. You may need different environments for your different tasks. There is a series of videos here on

motivation,
how to use modules,
how to structure directories to house your virtual environments,
how to construct and use virtual environments (VEs) for
- command-line execution of code and
- use with applications like Jupyter notebooks.

Motivation: why we need environments.
- Motivation for environments
Modules–a backbone of customizing your environments.
- How to use modules
Ways to think about structuring the locations of virtual environments to organize the (cluster, compute node type) for which they are used.
- Organizing your virtual environments
How to create and use Conda virtual environments on Owl (and other) clusters.
- Create a Conda VE on Owl for Python
- Run a python code using a Conda VE on Owl
How to create and use a Python pip-venv virtual environment (VE) on Owl (and other clusters).
- Create a Pip-Venv VE on Owl for Python
- Run a python code using a Pip-Venv VE on Owl
How to create and use Python Conda virtual environments with Jupyter notebooks (through OOD [Open OnDemand]).
- Create a Conda VE on Owl for Jupyter notebooks and Python
- Run a Jupyter notebook with a Conda VE on Owl
How to create and use Julia virtual environments on clusters.
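
As one possible reading of the advice on organizing virtual environments by (cluster, compute node type), the sketch below places each environment under a `<root>/<cluster>/<node type>/<name>` directory and creates it with Python's standard `venv` module. The layout, the node-type label, and the environment name are assumptions for illustration, not the scheme prescribed in the videos. The demo uses a temporary directory and skips installing pip so that it runs quickly.

```python
import tempfile
import venv
from pathlib import Path


def env_path(root, cluster, node_type, name):
    """Return the directory for an environment, grouped by cluster and node type."""
    return Path(root) / cluster / node_type / name


def create_env(path, with_pip=False):
    """Create a virtual environment at the given path and return the path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # with_pip=False keeps this demo fast; real use would normally pass True.
    venv.create(path, with_pip=with_pip)
    return path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "envs"
    # "owl" is a cluster named above; "cpu" and "analysis" are hypothetical.
    env = create_env(env_path(root, "owl", "cpu", "analysis"))
    relative = env.relative_to(root).as_posix()
    cfg_exists = (env / "pyvenv.cfg").exists()

print(relative)
print(cfg_exists)
```

Keeping the cluster and node type in the path makes it harder to activate an environment that was built for different hardware.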