R and RStudio

R is a free software environment for statistical computing and graphics. RStudio is an IDE for R.

Availability / Target HPC systems

For standalone simulations, the following HPC systems are best suited:

  • JupyterHub: best suited for interactive use of RStudio (only for Tier3 HPC accounts)
  • Woody: best suited for smaller calculations
  • TinyFat: for calculations with large memory requirements

Different versions of R are provided as environment modules; the installed versions may differ between clusters. To list all versions available on the current cluster, run module avail r.
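
As a minimal sketch of the module workflow, the following shell snippet lists the R modules and loads one of them. The variable R_MODULE and its default value r are assumptions made for illustration; replace the value with a module name reported by module avail r on the cluster you are using.

```shell
#!/bin/bash
# R_MODULE is a hypothetical variable used only in this sketch. Set it to a
# module name reported by "module avail r"; the plain name "r" is assumed to
# select whatever default version the cluster defines.
R_MODULE="${R_MODULE:-r}"

if command -v module >/dev/null 2>&1; then
    # Show every R version installed on the current cluster.
    module avail r
    # Load the selected module and confirm which R version is now active.
    module load "$R_MODULE" && R --version
else
    # Outside the clusters the module command usually does not exist.
    echo "The module command is not available in this shell." >&2
fi
```

Keeping the module name in a variable makes it easy to reuse the same lines on clusters that provide different R versions.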

Notes

  • As some compute nodes cannot access the internet directly, you might have to configure a proxy server, e.g. for package installation. To do so, run the commands Sys.setenv(http_proxy="http://proxy:80") and Sys.setenv(https_proxy="http://proxy:80") at the R prompt.

  • We used to provide the Microsoft R Open distribution (modules r/xxx-mro) because it internally uses Intel MKL for some compute-intensive routines to improve performance. After Microsoft discontinued its R distribution, we switched to Conda (modules r/xxx-conda).
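
For non-interactive use, the proxy from the first note can also be set in the shell before R is started. This is only a sketch and assumes that R is launched from the same shell or job script, so that it inherits the exported variables. The Rscript call at the end is optional and just prints the values as R sees them.

```shell
#!/bin/bash
# Proxy address as given in the note above.
export http_proxy="http://proxy:80"
export https_proxy="http://proxy:80"

# Any R process started from this shell inherits both variables, which is
# assumed here to have the same effect as calling Sys.setenv() inside R.
if command -v Rscript >/dev/null 2>&1; then
    # Print the proxy settings visible to R.
    Rscript -e 'Sys.getenv(c("http_proxy", "https_proxy"))'
fi
```

Setting the variables in the job script avoids repeating the Sys.setenv calls in every R script that installs packages.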

RStudio on JupyterHub

Interactive usage of RStudio is available as a custom kernel in JupyterHub. The following steps are necessary:

  1. Access JupyterHub according to this guide.
  2. Select Rocker/RStudio from the job profiles. RStudio can either run locally on JupyterHub or as a Slurm job on a compute node.

    [Screenshot: JupyterHub job profile]

  3. Choose RStudio from the available notebook types.

    [Screenshot: RStudio in JupyterHub]

  4. You can now use RStudio.

    [Screenshot: RStudio in JupyterHub]

  5. After you have finished your work, stop your instance manually by going back to the hub control panel (File > Hub Control Panel) and selecting Stop My Server. Closing the browser or logging out of JupyterHub will NOT free the resources!

    [Screenshot: JupyterHub stop server]

RStudio on Woody

The Open Source Edition of RStudio Server is not able to handle multiple users; thus, we cannot provide a central RStudio Server.

However, a single-user instance of RStudio Server can be run through Apptainer on Memoryhog or on a Woody, TinyFat, or TinyGPU compute node. The setup is based on the Rocker Project. It can be used in the following way:

  • Start an interactive job on a Woody, TinyFat or TinyGPU compute node or connect directly to Memoryhog.
  • Execute the script /apps/rstudio/start-rocker-rstudio.sh to start RStudio Server.
  • The script will print the access credentials and tell you how to forward the required port to your local machine.
  • Use RStudio Server interactively.
  • Once you are finished, don't forget to kill the server and job!
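
The steps above can be sketched in shell as follows. The helper rstudio_forward_cmd, the port 8787, and the names compute-node and user@frontend are hypothetical placeholders. The start script prints the actual forwarding command and credentials, and its output takes precedence over this sketch.

```shell
#!/bin/bash
# Path of the start script as given in the steps above.
START_SCRIPT=/apps/rstudio/start-rocker-rstudio.sh

# Hypothetical helper: print a generic SSH port-forwarding command.
# $1 = port, $2 = compute node, $3 = SSH login on the cluster frontend.
rstudio_forward_cmd() {
    printf 'ssh -N -L %s:%s:%s %s\n' "$1" "$2" "$1" "$3"
}

if [ -x "$START_SCRIPT" ]; then
    # Inside an interactive job (for example one requested with Slurm's
    # salloc) or on Memoryhog: start RStudio Server.
    "$START_SCRIPT"
else
    # Anywhere else: only show the general shape of the forwarding command
    # that is run on the local machine. All three values are placeholders.
    rstudio_forward_cmd 8787 compute-node user@frontend
fi
```

With the tunnel open, RStudio Server would be reachable in a local browser under localhost and the forwarded port. Closing the tunnel does not end the job, so the server and the job still have to be stopped explicitly.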