Biomed - Biomedical research
Overview
The biomed cluster has 7 nodes, 448 CPU cores, 5 TB RAM, and 7 NVIDIA A100 GPUs. Biomed hardware is summarized in the table below.
| Node Type | CPU | GPU | Total |
|---|---|---|---|
| Chip | - | - | - |
| Architecture | Zen 2 | Zen 2 | - |
| Slurm features | - | - | - |
| Nodes | 6 | 1 | 7 |
| GPUs | - | 7x NVIDIA A100-80G | 7 |
| Cores/Node | 64 | 64 | - |
| Memory (GB)/Node | 512 | 2,048 | - |
| Maximum Memory for Slurm (GB)/Node | 495 | 2,007 | - |
| Total Cores | 384 | 64 | 448 |
| Total Memory (GB) | 3,072 | 2,048 | 5,120 |
| Local Disk | 240 GB SSD | 240 GB SSD | - |
| Interconnect | HDR-100 IB | HDR-100 IB | - |
Access
The biomed cluster is set up to host projects that require some computational scale but are subject to restrictions such as NIST SP 800-171, as required by NIH or other agencies. Access to the biomed cluster requires approval from the Office of Research’s Division of Scholarly Integrity and Research Compliance, along with consultation with ARC personnel to set up access and provide instructions for use.
Get Started
Biomed can be accessed via the login node using your VT credentials:
biomed1.arc.vt.edu
Access is limited to university-managed devices and to authorized researchers subject to a user agreement. Access from personal devices is not permitted.
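For example, from a university-managed device a connection might look like the following (`pid` is a placeholder for your own VT username, not a real account):

```shell
# Connect to the biomed login node with your VT credentials
# (pid is a placeholder; replace it with your VT username)
ssh pid@biomed1.arc.vt.edu
```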
Partitions
Users submit jobs to partitions of the cluster depending on the type of resources needed (for example, CPUs or GPUs). Features are optional restrictions users can add to their job submission to restrict execution to nodes meeting specific requirements. If a job does not specify an amount of memory, the parameter DefMemPerCPU automatically determines the job’s memory based on the number of CPU cores requested. If a GPU job does not specify a number of CPU cores, the parameter DefCpuPerGPU automatically determines the number of CPU cores based on the number of GPUs requested. Jobs are billed against the user’s allocation according to the CPU cores, memory, and GPU time they use. Consult the Slurm configuration to understand how to specify these parameters for your job.
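As a quick sanity check of how DefMemPerCPU translates into a job’s default memory, the rule is simply requested cores times the partition’s DefMemPerCPU (values taken from the partition table below; the helper function is our own illustration, not a Slurm API):

```python
# Default memory Slurm assigns when a job omits an explicit memory request:
# requested CPU cores x DefMemPerCPU for the partition (values in MB).
DEF_MEM_PER_CPU_MB = {"normal_q": 7920, "a100_normal_q": 32112}

def default_job_memory_mb(partition: str, cores: int) -> int:
    """Default memory (MB) for a job that does not request memory explicitly."""
    return cores * DEF_MEM_PER_CPU_MB[partition]

print(default_job_memory_mb("normal_q", 8))  # 8 cores on a CPU node -> 63360
```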
| Partition | normal_q | a100_normal_q |
|---|---|---|
| Node Type | CPU | GPU |
| Features | - | - |
| Number of Nodes | 6 | 1 |
| DefMemPerCPU (MB) | 7920 | 32112 |
| DefCpuPerGPU | - | 8 |
| PreemptMode | OFF | OFF |
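For illustration, a minimal batch script requesting one A100 on `a100_normal_q` might look like the following sketch (the account name is a placeholder; replace it with your own allocation):

```shell
#!/bin/bash
#SBATCH --partition=a100_normal_q
#SBATCH --account=myallocation   # placeholder: use your own allocation name
#SBATCH --nodes=1
#SBATCH --gres=gpu:1             # one A100; with no core count given,
                                 # DefCpuPerGPU assigns 8 cores per GPU
#SBATCH --time=1:00:00

# Verify the GPU is visible to the job
nvidia-smi
```

Because neither a memory amount nor a core count is specified, Slurm fills them in from DefMemPerCPU and DefCpuPerGPU as described above.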
Optimization
Job performance can be greatly enhanced by applying appropriate optimizations. This not only reduces execution time but also makes more efficient use of the resources for the benefit of all users.
See the tuning guides available at https://developer.amd.com and https://www.intel.com/content/www/us/en/developer/
General principles of optimization:
- Cache locality really matters: process pinning can make a big difference in performance.
- Hybrid programming often pays off: one MPI process per L3 cache with 4 threads is often optimal.
- Use the appropriate `-march` flag to optimize the compiled code and the `-gencode` flag when using the NVCC compiler.
Suggested optimization parameters:
| Node Type | CPU | GPU |
|---|---|---|
| CPU arch | Zen 2 | Zen 2 |
| Compiler flags | `-march=znver2` | `-march=znver2` |
| GPU arch | - | NVIDIA A100 |
| Compute Capability | - | 8.0 |
| NVCC flags | - | `-gencode arch=compute_80,code=sm_80` |
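Putting these settings together, compiling for the biomed nodes might look like the following sketch (source and output file names are illustrative):

```shell
# CPU code: target the Zen 2 architecture explicitly
gcc -O3 -march=znver2 -o mycode mycode.c

# GPU code: generate code for the A100's compute capability 8.0
nvcc -O3 -gencode arch=compute_80,code=sm_80 -o mygpu mygpu.cu
```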