Video Tutorials
We are actively updating this page.
We try to make very few assumptions about the experience of the user so that videos are of value to novices and those with experience (we do not want to leave anyone behind). All users of all types are welcome on ARC resources and are encouraged to use them.
There is a learning curve to use ANY cluster system. There are many ways to get help from ARC. See the “How to Get Help” section below.
We try to keep things modular so that you can pin-point the video(s) that are useful to you.
If you open a video and it does not begin at the title slide, this is an artifact of the video storage system. Simply move the slider to the far left (the beginning of the file) and start the video. Each video file covers one topic, so no tutorial begins partway through a file.
How to Get Help
We are leading with “how to get help” because we want to be clear that ARC is here to help.
ARC wants to help you be productive. We have proactive methods (e.g., workshops and town halls) and reactive approaches (e.g., responding to help requests that you make) for working with you.
How to get help with and how to learn about ARC resources.
You can also watch the videos below where we attempt to make each video focused on one issue.
General Overview
These videos provide a general overview of the use of ARC systems.
Overview of ARC systems.
A brief description of the relationship between ARC (text) documentation and ARC videos.
Connecting to ARC Clusters
There are multiple ways to connect to ARC clusters.
How to access ARC clusters using a terminal window on your laptop and ssh (and streamlining access for subsequent logins). This approach uses the command line for working on clusters.
How to connect to ARC clusters using VS Code. This approach uses an IDE for working on clusters.
How to access ARC clusters using Open OnDemand (OOD). This approach uses an assortment of ways for working on clusters (e.g., command line, UIs).
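As a rough illustration of what "streamlining access for subsequent logins" can look like with ssh, the sketch below renders an entry for the OpenSSH client configuration file (`~/.ssh/config`). It is an assumption that the video uses this mechanism. The alias `arc`, the host name `login.example.edu`, and the user name `jdoe` are hypothetical placeholders, not ARC values.

```python
def ssh_config_entry(alias: str, hostname: str, user: str) -> str:
    """Return an OpenSSH client config entry so that `ssh <alias>` can
    replace the full `ssh <user>@<hostname>` command."""
    lines = [
        f"Host {alias}",
        f"    HostName {hostname}",
        f"    User {user}",
        # Reuse one authenticated connection for later ssh/scp sessions.
        "    ControlMaster auto",
        "    ControlPath ~/.ssh/cm-%r@%h:%p",
        "    ControlPersist 10m",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values; substitute the login host and user name you were given.
    print(ssh_config_entry("arc", "login.example.edu", "jdoe"))
```

With the printed text appended to `~/.ssh/config`, `ssh arc` would open the connection. Whether connection sharing suits a particular cluster's authentication setup is something to confirm with that cluster's documentation.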
Data on ARC Clusters: Your Data Are Not Backed Up by ARC
Data backups on ARC clusters, should you need them, are YOUR RESPONSIBILITY as a user. ARC does not back up any data on any cluster. However, there is an option to back up data (statically) for a cost.
File Systems, and File and Directory Permissions and Ownership
This video describes the various file systems and directories for doing different types of work and for storing files.
File and directory permissions are explained, and demonstrations are given on how to display permissions and how to change them. Default permissions are given.
Starting with given default file and directory permissions, which are based on a system-specified umask, the umask command is described. umask is then used to show how default permissions are specified, and how the umask can be changed by a user to change the default permissions on files and directories. These changed default permissions may be more advantageous for your particular group. This video should be viewed after the preceding one on file and directory permissions because several concepts are introduced in the previous video that are used in this one.
How to display and change file and directory individual and group ownership.
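The relationship between a umask and the resulting default permissions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a transcript of the videos: the umask values below are common examples and are not necessarily the ARC system defaults. On the command line, the corresponding tools are `ls -l`, `chmod`, and `umask`.

```python
import os
import stat
import tempfile


def default_modes(umask: int) -> tuple[str, str]:
    """Return the (file, directory) permission strings that a umask yields.

    New files start from 0o666 and new directories from 0o777; the umask
    clears the bits it names.
    """
    file_mode = 0o666 & ~umask
    dir_mode = 0o777 & ~umask
    return (
        stat.filemode(stat.S_IFREG | file_mode),
        stat.filemode(stat.S_IFDIR | dir_mode),
    )


def show_and_change(path: str, new_mode: int) -> tuple[str, str]:
    """Return the permission string of `path` before and after os.chmod."""
    before = stat.filemode(os.stat(path).st_mode)
    os.chmod(path, new_mode)
    after = stat.filemode(os.stat(path).st_mode)
    return before, after


if __name__ == "__main__":
    for mask in (0o022, 0o027, 0o007):  # illustrative values, not ARC defaults
        print(f"umask {mask:04o} ->", default_modes(mask))

    with tempfile.NamedTemporaryFile() as tmp:
        # Owner read/write, group read, no access for others (like `chmod 640`).
        print(show_and_change(tmp.name, 0o640))
```

A umask of `007`, for example, gives group members the same access as the owner while excluding everyone else, which is the kind of group-friendly default the umask video refers to.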
How to Run Codes—Your Own or Commercial/Open Software
Whether you run your own code (e.g., built using some programming language) or commercial/open-source software such as Ansys, you will be interacting with the scheduler. The scheduler on ARC clusters is Slurm. The first video is an excellent and concise introduction to running interactive and batch jobs. The subsequent videos are motivated by cluster configurations that increase the throughput of jobs on the clusters and significantly reduce wait time. Specifying job resources more precisely requires concepts, such as constraints, that are not covered in the first video.
How to run both interactive and batch jobs.
Learn the different types of compute nodes on clusters and how to specify them for your jobs. Watch this before the later videos of this section.
A more detailed example of an interactive job that uses the material in the second video—that is, how to use constraints to specify compute node types.
A more detailed example of a batch job that uses the material in the second video—that is, how to use constraints to specify compute node types.
Examples of Slurm batch jobs where files (input files, code files, output files) use volatile resources to speed up file input/output.
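To make the idea of a batch job with a constraint concrete, here is a minimal sketch that renders the text of a Slurm batch script. The `#SBATCH` options shown are standard Slurm options, but every value passed in (account, partition, constraint feature, command) is a hypothetical placeholder. The real names for a given ARC cluster come from the videos and the ARC documentation.

```python
def render_batch_script(job_name, account, partition, constraint, command,
                        nodes=1, ntasks_per_node=1, time_limit="00:10:00"):
    """Return the text of a Slurm batch script as a string."""
    directives = {
        "job-name": job_name,
        "account": account,
        "partition": partition,
        "constraint": constraint,  # restrict the job to nodes with this feature
        "nodes": nodes,
        "ntasks-per-node": ntasks_per_node,
        "time": time_limit,
    }
    lines = ["#!/bin/bash"]
    lines += [f"#SBATCH --{key}={value}" for key, value in directives.items()]
    lines += ["", command]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(render_batch_script(
        job_name="demo",
        account="myaccount",
        partition="normal_q",
        constraint="somefeature",
        command="srun ./my_program",
    ))
```

Saving the output as, say, `job.sh` and running `sbatch job.sh` would submit it. The same `--constraint` option is also accepted by `salloc` and `srun` when requesting an interactive job.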
Self-Monitoring Your Activities and Code Execution to Understand How Your Code Is Performing
ARC clusters are communal resources. They work best when everyone knows the purposes of the different types of compute nodes (i.e., computers) on a cluster. These videos describe appropriate (and inappropriate) uses and, just as importantly, how to monitor your own use of the computing resources. Monitoring your (computational) jobs can also give you insights about your codes and their performance. If you find that your code in a batch job is not performing as expected/desired, you can terminate the job. Similarly, if you find that you are (inadvertently) running prohibited types of processes on head nodes, you can terminate those.
Appropriate use of login (head) nodes of ARC clusters.
How you can self-monitor your use of login (head) nodes and kill/terminate processes that should not be run on head nodes.
Working with Slurm batch jobs.
How you can self-monitor your Slurm batch jobs and understand your code’s performance.
How to terminate Slurm batch jobs.
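The monitoring and termination steps map onto a handful of standard Slurm commands. The sketch below only builds those command lines, and runs one if the tool is installed. The job ID and user name are hypothetical, and the videos may use different options or additional tools.

```python
import getpass
import shutil
import subprocess


def squeue_cmd(user):
    """List a user's pending and running jobs."""
    return ["squeue", "-u", user]


def sacct_cmd(job_id):
    """Report state, elapsed time, and peak memory for one job."""
    return ["sacct", "-j", str(job_id),
            "--format=JobID,JobName,State,Elapsed,MaxRSS"]


def scancel_cmd(job_id):
    """Terminate one job."""
    return ["scancel", str(job_id)]


def run_if_available(cmd):
    """Run cmd and return its stdout, or None when the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        return None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    # On a machine without Slurm this prints None.
    print(run_if_available(squeue_cmd(getpass.getuser())))
    print(" ".join(scancel_cmd(12345)))  # 12345 is a hypothetical job ID
```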
File Compression and Archiving
File compression and file archiving are two separate activities. However, they can also be used in combination. These videos describe how to compress and archive files.
How to compress files.
How to create an archive (a single file) that contains many files, and how to compress this resulting file.
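The difference between compressing one file and archiving many files can be shown with Python's standard library. This is a sketch for illustration only: the videos presumably use command-line tools such as `gzip` and `tar`, and the file names below are made up.

```python
import gzip
import shutil
import tarfile
import tempfile
from pathlib import Path


def compress_file(src: Path) -> Path:
    """Write a gzip-compressed copy of `src` next to it (like `gzip -k`)."""
    dest = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


def archive_directory(src_dir: Path, archive: Path) -> list[str]:
    """Bundle `src_dir` into one gzip-compressed tar file and list its members."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src_dir, arcname=src_dir.name)
    with tarfile.open(archive, "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "results"
        data.mkdir()
        for i in range(3):
            (data / f"run{i}.txt").write_text("0.0\n" * 1000)
        # Compression: one file in, one smaller file out.
        print(compress_file(data / "run0.txt").name)
        # Archiving plus compression: many files in, a single .tar.gz out.
        print(archive_directory(data, Path(tmp) / "results.tar.gz"))
```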
File Transfer: Transferring files onto and off of clusters
One can use these tools to transfer files between and within machines/computers; here, our focus is ARC clusters.
Overview of utilities/tools for copying files onto and off of ARC clusters.
How to use scp (secure copy protocol) for copying files onto and off of ARC clusters.
How to use sftp (secure file transfer protocol) for copying files onto and off of ARC clusters.
How to use rsync for copying files onto and off of ARC clusters, and copying files within a cluster’s storage.
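For the command-line tools above, the general shape of a transfer is "source, then destination", where the remote side is written as `user@host:path`. The sketch below assembles such commands. The user name, host name, and paths are hypothetical, and the options shown are one common choice rather than the ones the videos necessarily use.

```python
def remote(user: str, host: str, path: str) -> str:
    """Format a remote location the way scp and rsync expect it."""
    return f"{user}@{host}:{path}"


def scp_upload_cmd(local_path, user, host, remote_path):
    """Copy one local file onto the cluster."""
    return ["scp", local_path, remote(user, host, remote_path)]


def rsync_upload_cmd(local_dir, user, host, remote_dir):
    """Copy a directory tree onto the cluster.

    -a recurses and preserves permissions and timestamps, -v lists files,
    and --partial keeps partly transferred files so a rerun can resume.
    """
    return ["rsync", "-av", "--partial", local_dir,
            remote(user, host, remote_dir)]


if __name__ == "__main__":
    # Hypothetical user, host, and paths.
    print(" ".join(scp_upload_cmd("input.dat", "jdoe", "login.example.edu",
                                  "/home/jdoe/")))
    print(" ".join(rsync_upload_cmd("results/", "jdoe", "login.example.edu",
                                    "/home/jdoe/results/")))
```

Swapping the source and destination arguments turns an upload into a download. With rsync, a trailing slash on the source directory means "copy the contents of this directory" rather than the directory itself.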
How to use Globus for copying (very large) files between computers (e.g., onto and off of ARC clusters) with fault tolerance and a UI.
Synopsis: Globus is a very powerful tool. While the other file transfer tools can complete their work with a single command, Globus is more of a file transfer environment. Hence it is more complicated than the other tools and has a higher start-up cost for you. But do not be put off: it is not that complicated and it is VERY powerful. There is a reason why Globus is the de facto standard tool for large file transfers. Here we present a sequence of videos, which should be watched in order, because some videos depend on content from previous videos in this series. We have broken the information into shorter videos so that users can “enter” this sequence where it is appropriate for them.
Globus overview. For those who do not know Globus or want a refresher in what it does.
Globus prerequisites. Complete these steps (which are required regardless of Globus use) before working with Globus—it will make your life much easier. NEED TO ADD HOW TO MAKE PROJECT AREA VISIBLE TO GLOBUS USING COLDFRONT.
Roadmap of Globus-specific videos. This short video shows the unified view used to present the primary Globus features in the remaining videos.
Unlinking previous Globus identity. This video is only relevant for those people coming to VT from another institution where they had a Globus account. Because in general your Globus identity is intimately tied to your present institution, and you had a Globus account at a previous institution, you must update your Globus account so that it is affiliated with your new institution (which is presumably Virginia Tech). This video shows how to “unlink” from your previous university. Then, in videos below, you can follow those instructions to establish your new VT-affiliated Globus identity, which are the same steps as creating a VT-affiliated Globus account for someone who has never had a Globus account.
Create a Globus account and log in. This is used by new Globus users and those Globus users who have come from another institution and have unlinked their previous Globus identity. We also ensure that you can see the main VT collection.
File transfer demonstration using Globus. This demo is a file transfer within one collection on one cluster. The explanation is generalized to illustrate how to perform file transfers between two clusters using different collections. This demo also shows how the directories Globus displays are the same as those under “/projects” on the ARC clusters.
Install Globus Connect Personal (GCP) on your laptop. GCP is used with Globus to enable file transfers between your local computer and ARC clusters.
File transfer demonstration using Globus and Globus Connect Personal (GCP). GCP is used with Globus to enable file transfers between your local computer and ARC clusters.
Set up Globus Guest Collections. Globus Guest Collections (GGCs) are areas of storage that are accessible to you and other people that you designate. These “others” can be colleagues at other institutions around the world, making for powerful file sharing.
Setting permissions on files within Globus collections. Just because some directory within a collection is visible to Globus does NOT mean that all of the files and directories within the collection are visible. You can still use Linux/Unix permissions to control Globus’ access to your files and directories. This video shows the issues and how you can control file and directory visibility.
Environments
Environments are collections of software that tailor an otherwise “basic” computing ecosystem into one that supports your particular computing needs for a particular type of task. You may need different environments for your different tasks. There is a series of videos here on
motivation,
how to use modules,
how to structure directories to house your virtual environments,
how to construct and use virtual environments (VEs) for command-line execution of code and for use with applications like Jupyter notebooks.
Motivation: why we need environments.
Modules: a backbone of customizing your environments.
Ways to think about structuring the locations of virtual environments so that they are organized by the (cluster, compute node type) pair for which they are used.
How to create and use Conda virtual environments on Owl (and other) clusters.
How to create and use a Python pip-venv virtual environment (VE) on Owl (and other) clusters.
How to create and use Python Conda virtual environments with Jupyter notebooks (through OOD [Open OnDemand]).
How to create and use Julia virtual environments on clusters.
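One way to picture the directory-structuring idea together with a Python venv is the sketch below, which places each environment under a path keyed by cluster and compute node type. The layout `<base>/<cluster>/<node type>/<name>` and the node type label `cpu` are assumptions made for illustration and are not necessarily the layout the videos recommend.

```python
import tempfile
import venv
from pathlib import Path


def env_path(base: Path, cluster: str, node_type: str, name: str) -> Path:
    """Return where an environment for one (cluster, node type) pair lives."""
    return base / cluster / node_type / name


def create_env(path: Path) -> Path:
    """Create a Python virtual environment and return its config file path."""
    # with_pip=False keeps this sketch fast and offline; a real environment
    # would normally be created with pip available.
    venv.create(path, with_pip=False)
    return path / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # "cpu" and "myproject" are hypothetical labels.
        target = env_path(Path(tmp), "owl", "cpu", "myproject")
        print(create_env(target))
```

On the command line, the equivalent is `python -m venv <path>` followed by `source <path>/bin/activate`. Keeping one environment per (cluster, node type) pair avoids reusing packages that were built for different hardware.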