(globus)=
# Globus

## Current VT Status

As of Fall 2023, Virginia Tech has an institutional subscription with Globus coordinated through ARC. In addition, ARC has established a Globus Data Transfer Node which provides access to `/projects` storage. Individuals are also able to create personal Globus accounts and use Globus Connect Personal software on ARC systems.

## Make a `/projects` directory visible to Globus

The `/projects` directories can be made visible ("shared") via Globus. The owner (usually the PI) of the `/projects` directory can enable sharing via ARC's [ColdFront](https://coldfront.arc.vt.edu) allocation management site.

1. The PI logs in to [ColdFront](https://coldfront.arc.vt.edu), navigates to the corresponding "Project (free) (Storage)" allocation, checks the box for "Share via Globus", and clicks "update". This takes effect immediately.
2. Any member of the associated group can then log in to Globus.org and access and manage files there. Under "File Manager", search for "Virginia Tech" and select the "Virginia Tech ARC Globus Projects Directories" collection. The shared directory should then be visible.

```{warning}
Filesets with "Lots of Small Files" (LoSF) are the worst-case scenario for most file systems and transfer tools. For stability and performance, it is vital that such LoSF filesets be packaged into archives via tools such as `tar`. Attempting transfers of LoSF filesets via Globus is known to cause very poor performance and faults such as `ENDPOINT_TOO_BUSY`.
```

## Globus Connect Personal (GCP)

GCP can be used to connect a device or storage location you own to the Globus network. For example, you can make your `/home/` or `/projects/` group-shared directory accessible to you when you log into the Globus.org web application. When you do this, it shows up in your "Collections". Then you can browse, upload, download, and coordinate transfers among other collections you have in the Globus web application.
This can be a very powerful and enabling way to manage data among multiple institutions. Detailed information on using GCP is available on [Globus's website](https://docs.globus.org/how-to/globus-connect-personal-linux/).

### Using GCP on ARC Systems

Here is an outline of the steps you'll need to take to use GCP on an ARC cluster. These are derived from the more [complete instructions](https://docs.globus.org/how-to/globus-connect-personal-linux/) provided by Globus.

Connecting GCP to Globus requires that you have an account with Globus and access to their web application, so the first step is to:

1. Log in to https://globus.org in a web browser.

On the Tinkercliffs cluster, a software module for GCP is provided.

```
module load tinkercliffs-rome/GlobusConnectPersonal
```

Loading the module makes the program `globusconnectpersonal` available to you, but it still needs to be configured.

#### Configuring

1. From the command line on an ARC system (e.g. a Tinkercliffs login node), load the module and then run the command `globusconnectpersonal`. If you have not already completed configuration, it will provide you with a URL and walk you through the next two steps.
2. Authenticate the GCP client with the Globus web application by copying the provided URL into your browser. This will prompt you for some setup information and then provide an "auth code".
3. Copy the "auth code" from your browser and paste it into the command-line shell, which should have a prompt waiting for this code.
4. (optional) Edit the file `~/.globusonline/lta/config-paths` to configure which directories GCP should use and whether or not to present them as writable in the Globus system.

```{note}
Any text editor can be used to modify the config file. If you don't already have a preferred command-line text editor, then `nano` may be a good choice.
```

Here is an example `config-paths` file. It is a header-less CSV (comma separated values) file. The three fields are

1.
   the directory (i.e. "folder" or "path") to connect
2. `0` or `1` indicating whether or not "Globus sharing" is enabled. "0" is the only viable option while VT does not have an institutional subscription to Globus.
3. `0` or `1` indicating whether the directory is "not writable" or "writable", respectively, in the Globus interface.

```{note}
Writability in Globus also requires that the writing user actually have write permissions on the filesystem. Indicating that a directory is writable for GCP does not override the file or directory permissions on the ARC system.
```

```
~/,0,1
/projects/proj_name,0,0
```

Here, two directories (`~` and `/projects/proj_name`) are being made available to GCP. The last field marks `~` as writable and `/projects/proj_name` as not writable in the Globus interface.

```{note}
`~` is a shortcut for `/home/`
```

### Installing GCP on Linux

```{note}
These are derived from the more [complete instructions](https://docs.globus.org/how-to/globus-connect-personal-linux/) provided by Globus.
```

1. *Verify that you can log in to https://globus.org*. If you do not already have a Globus account, you will need to create one.
2. *Download and extract the latest GCP, then run the setup.* The `ls` command is needed to determine the version number you have downloaded, which you must specify to `cd` to the correct directory:

   ```
   # Download latest GCP
   wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
   # Extract the compressed tar file
   tar xzf globusconnectpersonal-latest.tgz
   # Determine the name and version of the extracted directory
   ls -ld globusconnect*
   # Change directory to the newly extracted one
   cd globusconnectpersonal-__.__.__
   # This will run the GCP setup if you have not already done so
   ./globusconnectpersonal
   ```

3. *Authenticate the GCP client with the Globus website.* The last step above should have provided a URL for you to copy and paste into a web browser. Navigating to that URL will connect the GCP you have installed with the Globus web app.
   ```
   Login here:
   -----
   https://auth.globus.org/v2/oauth2/authorize?client_id=4d6448ae-8ca0-40e4-aaa9-8ec8e8320621&redirect_uri=https%3A%2F%2Fauth.globus.org%2Fv2%2Fweb%2Fauth-code&scope=openid+profile+urn%3Aglobus%3Aauth%3Ascope%3Aauth.globus.org%3Aview_identity_set+urn%3Aglobus%3Aauth%3Ascope%3Atransfer.api.globus.org%3Agcp_install&state=_default&response_type=code&code_challenge=XXXXXXXXXXXX--YYYYYYYYYYYYYYYYYYYYYYYYYYYYY&code_challenge_method=S256&access_type=online&prefill_named_grant=tinkercliffs2
   -----
   Enter the auth code:
   ```

4. *Complete the authentication.* Review the details on the page loaded by that URL and configure as desired. When complete, you will be provided with an "auth code". Copy it from your browser and paste it into the shell, which is waiting for your input.

   ```
   -----
   Enter the auth code: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
   == starting endpoint setup
   Input a value for the Endpoint Name: tc2
   registered new endpoint, id: 5874dee8-edcf-11ed-9bb3-c9bb788c490e
   setup completed successfully
   Will now start globusconnectpersonal in GUI mode
   Graphical environment not detected
   To launch Globus Connect Personal in CLI mode, use
       globusconnectpersonal -start
   Or, if you want to force the use of the GUI, use
       globusconnectpersonal -gui
   ```

5. *Start the client to make your files available to you in the Globus web app.*

   ```
   globusconnectpersonal -start
   ```

6. (optional) *Edit the configuration to add other directories and set permissions.*
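
As a minimal sketch of that optional last step, the shell function below appends a `directory,sharing,writable` entry to a `config-paths` file unless the directory is already listed. The function name `add_gcp_path` is a hypothetical helper written for this page, not part of Globus Connect Personal, and `/projects/proj_name` is a placeholder. It assumes the three-field CSV layout and the `~/.globusonline/lta/config-paths` location described under "Configuring".

```shell
# Hypothetical helper (not shipped with GCP): add one entry to a config-paths
# file, skipping directories that are already listed.
# Usage: add_gcp_path CONFIG_FILE DIRECTORY SHARING(0|1) WRITABLE(0|1)
add_gcp_path() {
    config_file=$1
    dir=$2
    sharing=$3
    writable=$4

    # Create the file (and its parent directory) if it does not exist yet.
    mkdir -p "$(dirname "$config_file")"
    touch "$config_file"

    # Compare against the first CSV field of every existing line (exact match).
    if ! cut -d, -f1 "$config_file" | grep -qxF -- "$dir"; then
        printf '%s,%s,%s\n' "$dir" "$sharing" "$writable" >> "$config_file"
    fi
}

# Example (placeholder project name; read-only, sharing disabled):
#   add_gcp_path "$HOME/.globusonline/lta/config-paths" /projects/proj_name 0 0
#
# Assumption: a running client needs to be restarted to pick up the change:
#   globusconnectpersonal -stop
#   globusconnectpersonal -start &
```

The helper only edits the text file. As noted earlier, marking a directory writable here does not change filesystem permissions on the ARC system.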