(anaconda)=
# Conda

## Why miniconda/miniforge and not anaconda?

While Anaconda includes the `conda` software and the `conda-forge` channel, which are open-source and freely licensed, the `defaults` channel is subject to paid licenses ([see terms](https://www.anaconda.com/blog/is-conda-free#summarize)). To protect VT's research community, ARC does not provide the Anaconda package. You may install Anaconda into your home directory, but we recommend removing the `defaults` channel, as sketched below.

Use `module spider miniconda` or `module spider miniforge` to search our module system for the most recent Miniconda/Miniforge available. Read the instructions on [how to build a virtual environment using miniforge](conda-virt-envs).
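If you do install Anaconda or Miniconda in your home directory, the following is a minimal sketch of switching your per-user configuration away from the `defaults` channel and onto `conda-forge`. It uses standard `conda config` subcommands, which edit your `~/.condarc`; adjust the channel choices to your own needs.

```
# Prefer conda-forge for package resolution
conda config --add channels conda-forge

# Drop the defaults channel (this may print a warning if "defaults" was
# never written to ~/.condarc explicitly, which is harmless)
conda config --remove channels defaults

# Optional: make channel priority strict so packages are not silently
# pulled from lower-priority channels
conda config --set channel_priority strict

# Verify the resulting channel list
conda config --show channels
```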
## Create a virtual environment specifically for the type of node where it will be used

Each cluster has at least two different node types. Each node type has a different CPU micro-architecture and may run slightly different operating system and kernel versions, system configurations, and packages, all tuned to that node type's hardware. These system differences can make virtual environments non-portable between node types. As a result, you should create and build a virtual environment on a node of the same type as the one where you will use the environment.

### Example

If you want to use a conda environment on Tinkercliffs `a100_normal_q` nodes, then you need to build the environment from a shell on those nodes. The important commands for this are:

|command |purpose|
|-|-|
|`interact`|get an interactive command line shell on a compute node|
|`module spider`|search for available Miniconda/Miniforge modules|
|`module load`|load a module|
|`conda create -p $HOME/envname`|create a new conda environment at the provided path|
|`source activate $HOME/envname`|activate the newly created environment|
|`conda install ...`|install packages into the environment|

```{note}
`$HOME` "expands" in the shell to your home directory, e.g. `/home/jdoe2`, and `envname` from above should be a short but meaningful name for the environment. Since environments are particular to the node type, it is recommended to reference the node type in the name, for example `tca100-science` or `tcnq` for Tinkercliffs `a100_normal_q` nodes or Tinkercliffs `normal_q` nodes respectively.
```

Use `conda env list` to view conda environments and their absolute paths. Use `conda list` to view the packages and versions in the currently activated environment.

```
[jdoe2@tinkercliffs2 ~]$ interact --partition=a100_normal_q --nodes=1 --ntasks-per-node=4 --gres=gpu:1 --account=jdoeacct
srun: job 2920919 queued and waiting for resources
srun: job 2920919 has been allocated resources
[jdoe2@tc-gpu001 ~]$ module spider miniconda

-----------------------------------------------------------------------------------------------------------
  Miniconda3: Miniconda3/24.7.1-0
-----------------------------------------------------------------------------------------------------------
    Description:
      Miniconda is a free minimal installer for conda. It is a small, bootstrap version of Anaconda that
      includes only conda, Python, the packages they depend on, and a small number of other useful packages.

    This module can be loaded directly: module load Miniconda3/24.7.1-0

    Help:
      Description
      ===========
      Miniconda is a free minimal installer for conda. It is a small, bootstrap
      version of Anaconda that includes only conda, Python, the packages they
      depend on, and a small number of other useful packages.

      More information
      ================
      - Homepage: https://docs.conda.io/en/latest/miniconda.html

[jdoe2@tc-gpu001 ~]$ module load Miniconda3/24.7.1-0
[jdoe2@tc-gpu001 ~]$ conda create -p ~/.conda/envs/a100_env python=3.11
...
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate /home/jdoe2/.conda/envs/a100_env
#
# To deactivate an active environment, use
#
#     $ conda deactivate

[jdoe2@tc-gpu001 ~]$ source activate /home/jdoe2/.conda/envs/a100_env/
(a100_env) [jdoe2@tc-gpu001 ~]$ conda install matplotlib
Proceed ([y]/n)? y

Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
```

## Using kernels with an Environment

You can use a Jupyter kernel to use a virtual environment inside a Jupyter notebook. Each kernel can be used to run different cells according to its language/package requirements. For example, if you have a notebook that uses two different sets of packages, where each set is installed in a different conda environment, then you can use Jupyter kernels to switch between those two sets of packages.

To start a kernel that is associated with a specific environment, activate the environment and install `ipykernel` inside it:

```
[jdoe2@tinkercliffs2 ~]$ module load Miniconda3
[jdoe2@tinkercliffs2 ~]$ source activate /home/jdoe2/.conda/envs/a100_env/
(a100_env) [jdoe2@tinkercliffs2 ~]$ conda install ipykernel
(a100_env) [jdoe2@tinkercliffs2 ~]$ python -m ipykernel install --user --name a100_env --display-name "Python (a100_env)"
Installed kernelspec a100_env in /home/jdoe2/.local/share/jupyter/kernels/a100_env
```

Then, when launching the Jupyter interactive app from [Open OnDemand](https://ood.arc.vt.edu/pun/sys/dashboard), you can start a kernel in the environment you created before. From the top menu, select *Kernel -> Change kernel -> Python (a100_env)*, then execute your cell.

## GPU - CUDA compatibility

While `nvidia-smi` will display a CUDA version, this is just the base CUDA installed on the node and can be overridden by:

- loading a different CUDA module: `module spider cuda`
- activating a conda environment which has cudatoolkit installed: check with `conda list cudatoolkit`
- installing a conda package built with a different CUDA version: `conda list tensorflow` -> check the build string

A sketch for checking which of these applies in your current shell follows the framework-specific examples below.

### Check CUDA and cuDNN version in TensorFlow

```
import tensorflow as tf

# Print build and version information for the installed TensorFlow package
print("TensorFlow version:", tf.__version__)
print("CUDA built with:", tf.sysconfig.get_build_info()["cuda_version"])
print("cuDNN built with:", tf.sysconfig.get_build_info()["cudnn_version"])
```

### Check CUDA version in PyTorch

```
import torch

# Print PyTorch version and the CUDA version it was compiled with
print("PyTorch version:", torch.__version__)
print("CUDA version (compiled):", torch.version.cuda)

# Check whether a GPU is visible and report its compute capability
# (get_device_properties reports the GPU's compute capability, not the CUDA runtime version)
print("Is CUDA available?", torch.cuda.is_available())
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"GPU: {props.name}, compute capability: {props.major}.{props.minor}")
```
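As a quick way to see which CUDA stack a given shell and environment will actually use, the sketch below combines the three checks listed above. The environment path and the `grep` patterns are placeholders; substitute your own module and environment names.

```
# 1. Driver-level CUDA reported by the node itself
nvidia-smi | head -n 4

# 2. Any CUDA module currently loaded (and what else is available)
module list 2>&1 | grep -i cuda
module spider cuda

# 3. CUDA-related packages inside the active conda environment,
#    including the build strings of GPU frameworks
source activate $HOME/.conda/envs/a100_env
conda list | grep -iE "cudatoolkit|cudnn|cuda-"
conda list tensorflow
conda list pytorch
```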