An R reproducibility toolkit for the practical researcher > Day 0
Day 0
By Paola Corrales and Elio Campitelli
Hi! In this workshop we will be introducing in many tools to help make your work as reproducible as possible. That means that you’ll need to have several programs installed and set up before we begin. Please, follow these instructions before the workshop.
R
At the very least you need to have R installed on your computer. If possible, try to have it upgraded to the latest version. We developed the materials using R version 4.4.1 (2024-06-14). To the best of our abilities, we don’t think there’s anything that requires this particular version, but to avoid potential issues, it would be best if you have at least a R 4.something.
Go to https://cloud.r-project.org/ and follow the steps for your Operating System.
RTools (only on Windows)
If you are using Windows, you also will need RTools to compile source packages. Go to https://cran.r-project.org/bin/windows/Rtools/ and download and install the version of RTools for your version of R by following the instructions.
RStudio
We will be using RStudio as IDE. Again, try to update it to the latest version. The new visual editor will be particularly helpful for those without a lot of experience with markdown (and probably for the markdown experts as well).
Go to https://posit.co/download/rstudio-desktop/ and download the latest version.
If you prefer to use a different IDE, you are welcome to do it.
(Optional) Use RStudio Package Manager
If you’re a Linux user, you can add the RStudio Package Manager repository. This can speed up package installation tremendously, as it provides pre-compiled versions of R packages. To enable it, follow the instructions in their setup guide and be sure to select your Linux distribution to get the binary packages.
R Packages
You will need to install the packages required to use knitr. The easiest way to do this is to let RStudio do that for you.
Open RStudio
Go to to File -> New File -> R Markdown.
If you get a pop up saying that you need to install some packages, click “Ok”. RStudio will install the packages and then open a sample document.
Click on the “Knit” button (the blue yarn) to test your installation; everything went well, you should see a new window with an HTML file.
Other important packages are here, renv, rticles, and devtools and roxygen2. These are on CRAN so you can install them with the usual
install.packages(c("here", "renv", "rticles", "devtools", "roxygen2", "gitcreds"))
The package rrtools will also be useful. It’s not on CRAN, so you need so install it from github:
if (!require("remotes")) install.packages("remotes")
remotes::install_github("benmarwick/rrtools")
On Linux, you might need to install these system libraries:
sudo apt install libxml2-dev libcurl4-openssl-dev libfontconfig1-dev libgit2-dev libssh2-1-dev libharfbuzz-dev libfribidi-dev lbfreetype6-dev libpng-dev libtiff5-dev libjpeg-dev
LaTeX
You’ll need a working LaTeX installation to render PDFs. If you didn’t feel chills down your spine at the sight of that sentence, consider yourself lucky!
The easiest (and painless) way to do this is to use the tinytex R package. You should be up and running with these two commands:
install.packages('tinytex')
tinytex::install_tinytex()
Git and GitHub
On the second day of the workshop we will start using Git for version control and GitHub as a remote repository. Details on how to do this are best described in Jenny Bryan’s fantastic Happy Git with R book. The core steps are:
Install git on your machine.
The instructions depend on your Operating System. Follow the instructions at https://happygitwithr.com/install-git.html.Create a GitHub account.
Go to github.com and create your free account. If it asks you to personalise your experience, scroll to the bottom of the page to skip that step.Introduce yourself to git by running this from the R console, replacing the name and email with your name and the email you used to register to GitHub
usethis::use_git_config(user.name = "Jane Doe", user.email = "jane@example.org")
If you are on linux, you’ll need to configure git to save credentials for more than 15 minutes. Follow the instructions in this post by Danielle Navarro. It covers some of the same territory with more detail, and it can be cathartic to read someone who shares your pain.
Set up git so it can work with GitHUb and RStudio.
This can be, unfortunately, a rather involved process so be patient. Again, the excellent Happy Git with R is your friend. Follow the instructions in this chapter to set up your GitHub Personal Access Token.Create a Personal Access Token with
usethis::create_github_token()
This will create a token. Do now close this window, as you will need to copy the token to your clipboard. Then run this other command
gitcreds::gitcreds_set()
This will ask you for your access token. Go to the previous window, copy the token, paste it on your console and press enter.
Finally, to check that everything’s ok with this command.
usethis::gh_token_help()
It should print a list of personal details, including:
Personal access token for 'https://github.com': '<discovered>'
(Optional) In this workshop we will be using RStudio as a simple git GUI, but in the future you might want to use a more full-fledged client. Again, Happy Git with R recommends a few. We like GitKraken, which is free, feature-rich, and multi-platform.
To check whether the git and GitHub configuration is ready you can run the function usethis::git_sitrep()
in the console. You’ll see something similar to this:
── Git global (user)
• Name: 'Your Name'
• Email: 'your.email@gmail.com'
• Global (user-level) gitignore file: <unset>
• Vaccinated: FALSE
ℹ See `?git_vaccinate` to learn more
• Default Git protocol: 'https'
• Default initial branch name: 'master'
── GitHub user
• Default GitHub host: 'https://github.com'
• Personal access token for 'https://github.com': '<discovered>'
• GitHub user: 'yourusername'
• Token scopes: 'admin:org_hook, admin:public_key, admin:repo_hook, gist, repo, user, workflow, write:discussion'
• Email(s): 'your.email@gmail.com (primary)', 'username@users.noreply.github.com'
ℹ No active usethis project
This means that git knows you and that it is connected to your GitHub account.
Docker
On the fourth day we show the basics of docker. Instructions on how to install it depend on your operating system.
If you’re on Windows, you need to enable hardware virtualisation. This needs to be enabled on the BIOS (how to do this varies between manufacturers) and as a Windows’ setting. Search for “Turn Windows Features on and off” and make sure that “Virtual Machine Platform”, “Windows Hypervisor Platform” and “Windows Subsystem for Linux” are enabled. For more details, check this documentation.
Go to https://docs.docker.com/get-docker/ and follow the instructions.
If you’re on Linux, you need sudo
to run docker unless you follow this guide, which is recommended.
Regardless of your OS, run this line to check that your docker installation is working.
You should see a message like this one:
docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
2db29710123e: Pull complete
Digest: sha256:18a657d0cc1c7d0678a3fbea8b7eb4918bba25968d3e1b0adebfa71caddbc346
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
Once all is installed correctly, run this line to download the docker image we will use in the course so that you don’t need to download it during the session:
docker pull rocker/rstudio:4.1.2
You should get a few progress bars and finally a message like this one:
4.1.2: Pulling from rocker/rstudio
Digest: sha256:a8708ac1c1de0814c475f08034b95b8fe17f96e370b0bbfa306304c99010db49
Status: Image is up to date for rocker/rstudio:4.1.2
docker.io/rocker/rstudio:4.1.2