2.4 R startup

Every time R starts a number of things happen. It can be useful to understand this startup process, so you can make R work the way you want it, fast. This section explains how.

2.4.1 R startup arguments

The arguments passed to the R startup command (typically simply R from a shell environment) determine what happens. The following arguments are particularly important from an efficiency perspective:

  • --no-environ tells R not to read the site and user environment files (Renviron.site and .Renviron) at startup. (Do not worry if you don’t understand what this means at present: it will become clear later in the section.)

  • --no-restore tells R not to load any .RData files knocking around in the current working directory.

  • --no-save tells R not to save the workspace, and not to ask the user whether to save objects held in RAM, when the session is ended with q().

Adding each of these will make R load slightly faster, and mean that slightly less user input is needed when you quit. See An Introduction to R, Appendix B, for more startup arguments.
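To see which of these arguments the current session was started with, one option is to inspect the output of commandArgs(). The snippet below is a minimal sketch; the object names flags and flags_used are illustrative only.

```r
# Startup arguments discussed above
flags = c("--no-environ", "--no-restore", "--no-save")

# commandArgs() returns the command line used to start this R process
flags_used = flags %in% commandArgs()
names(flags_used) = flags

# TRUE for each argument that was passed when R was launched
flags_used
```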

Some of R’s startup arguments can be controlled interactively in RStudio. See the online help file Customizing RStudio for more on this.

2.4.2 An overview of R’s startup files

There are two special files, .Rprofile and .Renviron, which determine how R performs for the duration of the session. These are summarised in the bullet points below, before we go into more detail on each in the subsequent sections.

  • .Rprofile is a plain text file (which is always called .Rprofile, hence its name) that simply runs lines of R code every time R starts. If you want R to check for package updates each time it starts (as explained in the previous section), you simply add the relevant line somewhere in this file.

  • The primary purpose of .Renviron is to set environment variables. These are settings that relate to the operating system: they tell R where to find external programs, and they can hold user-specific values that other users should not have access to, such as API keys (small text strings used to verify the user when interacting with web services).

2.4.3 The location of startup files

Confusingly, multiple versions of these files can exist on the same computer, only one of which will be used per session. Note also that these files should only be changed with caution and if you know what you are doing. This is because they can make your R version behave differently to other R installations, potentially reducing the reproducibility of your code.

Files in three folders are important in this process:

  • R_HOME, the directory in which R is installed. The etc sub-directory can contain start-up files read early on in the start-up process. Find out where your R_HOME is with the R.home() command.

  • HOME, the user’s home directory. Typically this is /home/username on Unix machines or C:\Users\username on Windows (since Windows 7). Ask R where your home directory is with Sys.getenv("HOME").

  • R’s current working directory. This is reported by getwd().
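The three locations can be collected in one place with the functions named in the list above. This is a small sketch; the vector name startup_dirs is illustrative only.

```r
# The three folders that matter during startup
startup_dirs = c(
  R_HOME = R.home(),          # where R is installed
  HOME   = Sys.getenv("HOME"), # the user's home directory
  cwd    = getwd()            # the current working directory
)
startup_dirs
```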

It is important to know the location of the .Rprofile and .Renviron set-up files that are being used out of these three options. R only uses one user-level .Rprofile and one user-level .Renviron in any session: if you have a .Rprofile file in your current project, R will ignore .Rprofile in HOME. The site-wide files in the etc sub-directory of R_HOME (Rprofile.site and Renviron.site) are handled separately: according to help("Startup"), they are read first, before the user-level file. The same applies to .Renviron: you should remember that adding project-specific environment variables with .Renviron will de-activate the .Renviron file in HOME.
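The search order for the user-level .Rprofile can be sketched as a small function. This is an approximation based on the description in help("Startup"), which also mentions that the R_PROFILE_USER environment variable, if set, takes precedence; the function name find_user_rprofile and its arguments are hypothetical and not part of R.

```r
# Hypothetical helper: report which user-level .Rprofile R would pick
find_user_rprofile = function(wd = getwd(), home = path.expand("~")) {
  # An explicit path in R_PROFILE_USER takes precedence
  from_env = Sys.getenv("R_PROFILE_USER", unset = "")
  if (nzchar(from_env))
    return(path.expand(from_env))
  # Otherwise the working directory is checked before the home directory
  candidates = file.path(c(wd, home), ".Rprofile")
  found = candidates[file.exists(candidates)]
  if (length(found) > 0) found[1] else NA_character_
}

find_user_rprofile()
```

If the function returns NA, no user-level .Rprofile was found in either location.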

To create a project-specific start-up script, simply create a .Rprofile file in the project’s root directory and start adding R code, e.g. via file.edit(".Rprofile"). Remember that this will cause the .Rprofile in the home directory to be ignored. The following commands will open your .Rprofile in an editor from within R:

file.edit(file.path("~", ".Rprofile")) # edit .Rprofile in HOME
file.edit(".Rprofile") # edit project specific .Rprofile

Note that editing the .Renviron file in the same locations will have the same effect. The following code will create a user specific .Renviron file (where API keys and other cross-project environment variables can be stored), without overwriting any existing file.

user_renviron = path.expand(file.path("~", ".Renviron"))
if(!file.exists(user_renviron)) # check to see if the file already exists
  file.create(user_renviron)
file.edit(user_renviron) # open with another text editor if this fails

The pathological package can help find where .Rprofile and .Renviron files are located on your system, thanks to the os_path() function. The output of example(Startup) is also instructive.

The location, contents and uses of each is outlined in more detail below.

2.4.4 The .Rprofile file

By default, R looks for and runs .Rprofile files in the three locations described above, in a specific order. .Rprofile files are simply R scripts that run each time R runs and they can be found within R_HOME, HOME and the project’s home directory, found with getwd(). To check if you have a site-wide .Rprofile, which will run for all users on start-up, run:

site_path = R.home(component = "home")
fname = file.path(site_path, "etc", "Rprofile.site")
file.exists(fname)

The above code checks for the presence of Rprofile.site in that directory. As outlined above, the .Rprofile located in your home directory is user-specific. Again, we can test whether this file exists using

file.exists("~/.Rprofile")

We can use R to create and edit .Rprofile (warning: do not overwrite your previous .Rprofile - we suggest you try project-specific .Rprofile first):

if(!file.exists("~/.Rprofile")) # only create if not already there
  file.create("~/.Rprofile")    # (don't overwrite it)
file.edit("~/.Rprofile")
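The individual checks above can be combined into a single overview of which startup files exist. The sketch below assumes the standard file names described in help("Startup"); the object name startup_files is illustrative only.

```r
# Candidate startup files in R_HOME/etc, HOME and the working directory
startup_files = c(
  site_profile    = file.path(R.home("etc"), "Rprofile.site"),
  site_environ    = file.path(R.home("etc"), "Renviron.site"),
  user_profile    = path.expand("~/.Rprofile"),
  user_environ    = path.expand("~/.Renviron"),
  project_profile = file.path(getwd(), ".Rprofile"),
  project_environ = file.path(getwd(), ".Renviron")
)

# One row per file, with a logical column showing whether it is present
startup_status = data.frame(
  path   = startup_files,
  exists = file.exists(startup_files)
)
startup_status
```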

2.4.5 Example .Rprofile settings

Example contents of a short and simple .Rprofile are illustrated below, with comments explaining what each line does. More details on these, and other potentially useful .Rprofile options, are described subsequently.

# A fun welcome message
message("Hi Robin, welcome to R")
# Customise the R prompt that prefixes every command
# (use " " for a blank prompt)
options(prompt = "R4geo> ")
# Don't convert text strings to factors with base read functions
options(stringsAsFactors = FALSE)

For more suggestions of useful startup settings, see Examples in help("Startup") and online resources such as those at statmethods.net.

Ever been frustrated by unwanted + symbols that prevent copied and pasted multi-line functions from working? These potentially annoying +s can be eradicated by adding options(continue = " ") to your .Rprofile.

2.4.5.1 Setting options

The function options, used above, gets and sets a number of global settings, each of which has a default value. See help("options") or simply type options() to get an idea of what we can configure. Because options are often a matter of personal preference (with few implications for reproducibility) that you will want in all your R sessions, .Rprofile is a good place to set them. Other illustrative options are shown below:

options(prompt="R> ", digits=4, show.signif.stars=FALSE)

This changes three features.

  • The R prompt, from the boring > to the exciting R>.
  • The number of digits displayed.
  • Removing the stars after significant \(p\)-values.

Try to avoid adding options to the start-up file that make your code non-portable. Adding the stringsAsFactors = FALSE option used above to your start-up script, for example, has knock-on effects for read.table and related functions including read.csv, making them convert text strings into characters rather than into factors (which was the default before R 4.0.0). This may be useful for you, but can make your code less portable, so be warned.
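When an option is only needed for part of a script, one portable alternative to putting it in .Rprofile is to set it temporarily and restore the previous value afterwards. The following is a minimal sketch using the digits option; the object name old is illustrative only.

```r
# options() invisibly returns the previous values of the options it changes
old = options(digits = 4)
print(pi)
#> [1] 3.142

# Restore the previous setting so that later code is unaffected
options(old)
```

Because the change is made and undone inside the script itself, the result does not depend on the contents of anyone’s start-up files.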

2.4.5.2 Setting the CRAN mirror

To avoid setting the CRAN mirror each time you run install.packages you can permanently set the mirror in your .Rprofile.

# `local` creates a new, empty environment
# This avoids polluting .GlobalEnv with the object r
local({
  r = getOption("repos")             
  r["CRAN"] = "https://cran.rstudio.com/"
  options(repos = r)
})

The RStudio mirror is a virtual machine run by Amazon’s EC2 service, and it syncs with the main CRAN mirror in Austria once per day. Since RStudio is using Amazon’s CloudFront, the repository is automatically distributed around the world, so no matter where you are in the world, the data doesn’t need to travel very far, and is therefore fast to download.

2.4.5.3 The fortunes package

This section illustrates what .Rprofile does with reference to a package that was developed for fun. The code below could easily be altered to automatically connect to a database, or ensure that the latest packages have been downloaded.

The fortunes package contains a number of memorable quotes that the community has collected over many years, called R fortunes. Each fortune has a number. To get fortune number \(50\), for example, enter

fortunes::fortune(50)

It is easy to make R print out one of these nuggets of truth each time you start a session, by adding the following to ~/.Rprofile:

if(interactive()) 
  try(fortunes::fortune(), silent=TRUE)

The interactive function tests whether R is being used interactively in a terminal. The fortune function is called within try. If the fortunes package is not available, we avoid raising an error and move on. By using :: we avoid adding the fortunes package to our list of attached packages.

Typing search() gives the list of attached packages; because the function is called as fortunes::fortune(), the fortunes package does not appear in that list.
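The effect of :: on the search path can be checked with any installed package. The sketch below uses tools::file_ext() from the tools package, which ships with R, purely as a stand-in for fortunes; the object names before and ext are illustrative only.

```r
# Record the search path, call a function via ::, then compare
before = search()
ext = tools::file_ext("analysis.R")
ext
#> [1] "R"

# The search path is unchanged: :: does not attach the package
identical(before, search())
#> [1] TRUE
```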

The function .Last, if it exists in the .Rprofile, is always run at the end of the session. We can use it to install the fortunes package if needed. To load the package, we use require, since if the package isn’t installed, the require function returns FALSE and raises a warning.

.Last = function() {
  cond = suppressWarnings(!require(fortunes, quietly=TRUE))
  if(cond) 
    try(install.packages("fortunes"), silent=TRUE)
  message("Goodbye at ", date(), "\n")
}

2.4.5.4 Useful functions

You can use .Rprofile to define new ‘helper’ functions or redefine existing ones so they’re faster to type. For example, we could load the following two functions for examining data frames:

# ht == headtail
ht = function(d, n=6) rbind(head(d, n), tail(d, n))
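# Show the first 5 rows & first 5 columns of a data frame
# (or fewer, if the data frame is smaller than that)
hh = function(d) {
  d[seq_len(min(5, nrow(d))), seq_len(min(5, ncol(d))), drop = FALSE]
}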

and a function for setting a nice plotting window:

setnicepar = function(mar = c(3, 3, 2, 1), mgp = c(2, 0.4, 0), 
                      tck = -0.01, cex.axis = 0.9, 
                      las = 1, mfrow = c(1, 1), ...) {
    par(mar = mar, mgp = mgp, tck = tck, cex.axis = cex.axis, 
        las = las, mfrow = mfrow, ...)
}
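As a quick usage example, the ht helper defined above can be applied to a small data frame. The data frame df below is made up for illustration.

```r
# ht == headtail, as defined above
ht = function(d, n = 6) rbind(head(d, n), tail(d, n))

# A toy data frame with 20 rows
df = data.frame(id = 1:20, sq = (1:20)^2)

# Returns rows 1, 2, 19 and 20
top_and_tail = ht(df, n = 2)
top_and_tail
```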

Note that these functions are for personal use and are unlikely to interfere with code from other people. Packages are a different matter: even if you use a certain package every day, we don’t recommend loading it in your .Rprofile, because that can make your code less portable. Shortening long function names is useful for interactive (but not reproducible) code writing. If you frequently use View(), for example, you may be able to save time by referring to it in abbreviated form. This is illustrated below to make it faster to view datasets (although with IDE-driven autocompletion, outlined in the next section, the time saving is smaller).

v = utils::View

Also beware the dangers of loading many functions by default: it may make your code less portable. Another downside of putting functions in your .Rprofile is that it can clutter up your workspace: when you run the ls() command, your .Rprofile functions will appear. Also, if you run rm(list=ls()), your functions will be deleted.

One neat trick to overcome this issue is to use hidden objects and environments. When an object name starts with ., by default it doesn’t appear in the output of the ls() function:

.obj = 1
".obj" %in% ls()
#> [1] FALSE

This concept also works with environments. In the .Rprofile file we can create a hidden environment

.env = new.env()

and then add functions to this environment

.env$ht = function(d, n = 6) rbind(head(d, n), tail(d, n))

At the end of the .Rprofile file, we use attach, which makes it possible to refer to objects in the environment by their names alone.

attach(.env)
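Putting the three steps together gives the following self-contained sketch, which also confirms that the helper can be found by name even though it is not stored in the global workspace.

```r
# Hidden environment: the leading dot keeps it out of ls()
.env = new.env()
.env$ht = function(d, n = 6) rbind(head(d, n), tail(d, n))

# attach() places a copy of the environment on the search path
attach(.env)

# ht is found via the search path rather than the workspace
exists("ht")
#> [1] TRUE
```

Because ht lives on the search path and .env is hidden, running rm(list=ls()) leaves the helper available.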

2.4.6 The .Renviron file

The .Renviron file is used to store system variables. It follows a similar start-up routine to the .Rprofile file: R first looks for a global .Renviron file, then for local versions. A typical use of the .Renviron file is to specify the R_LIBS path, which determines where new packages are installed:

# Linux
R_LIBS=~/R/library
# Windows
R_LIBS=C:/R/library

After setting this, install.packages saves packages in the directory specified by R_LIBS. The location of this directory can be referred back to subsequently as follows:

Sys.getenv("R_LIBS")

All currently stored environment variables can be seen by calling Sys.getenv() with no arguments. Note that many environment variables are already pre-set and do not need to be specified in .Renviron. HOME, for example, which can be seen with Sys.getenv('HOME'), is taken from the operating system’s list of environment variables. A list of the most important environment variables that can affect R’s behaviour is documented in the little known help page help("environment variables").
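To see how library-related environment variables translate into the directories R actually searches, the variables can be compared with the output of .libPaths(). This is a small sketch; the object names lib_vars and lib_paths are illustrative only.

```r
# Library-related environment variables (empty string if not set)
lib_vars = Sys.getenv(c("R_LIBS", "R_LIBS_USER", "R_LIBS_SITE"))
lib_vars

# The library directories R is using in this session
lib_paths = .libPaths()
lib_paths
```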

To set or unset an environment variable for the duration of a session, use the following commands:

Sys.setenv("TEST" = "test-string") # set an environment variable for the session
Sys.unsetenv("TEST") # unset it
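The round trip can be verified by reading the variable back at each step. In the sketch below MY_TEST_VAR is a made-up variable name, and the unset = NA argument makes a missing variable show up as NA rather than as an empty string.

```r
# Set a throwaway variable and read it back
Sys.setenv(MY_TEST_VAR = "test-string")
while_set = Sys.getenv("MY_TEST_VAR")
while_set
#> [1] "test-string"

# Unset it again; with unset = NA a missing variable is reported as NA
Sys.unsetenv("MY_TEST_VAR")
after_unset = Sys.getenv("MY_TEST_VAR", unset = NA)
after_unset
#> [1] NA
```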

Another common use of .Renviron is to store API keys and authentication tokens that will be available from one session to another (see note 1 at the end of this section). A common use case is setting the ‘envvar’ GITHUB_PAT, which will be detected by the devtools package via the function github_pat(). To take another example, the following line in .Renviron sets the ZEIT_KEY environment variable which is used in the diezeit package:

ZEIT_KEY=PUT_YOUR_KEY_HERE

You will need to sign-in and start a new R session for the environment variable (accessed by Sys.getenv) to be visible. To test if the example API key has been successfully added as an environment variable, run the following:

Sys.getenv("ZEIT_KEY")
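Base R also provides readRenviron(), which reads a file in .Renviron format into the current session. The sketch below uses it on a temporary file so that nothing in your real .Renviron is touched, and adds a small wrapper that fails loudly when a key is missing. The function name get_api_key and the file tmp_renviron are hypothetical and used for illustration only.

```r
# Write a throwaway file in .Renviron format (NAME=value, one per line)
tmp_renviron = tempfile(fileext = ".Renviron")
writeLines("ZEIT_KEY=PUT_YOUR_KEY_HERE", tmp_renviron)

# Load its contents into the environment of the current session
readRenviron(tmp_renviron)

# Hypothetical helper: return the key, or stop with a clear message
get_api_key = function(name) {
  key = Sys.getenv(name, unset = "")
  if (!nzchar(key))
    stop("Environment variable ", name, " is not set", call. = FALSE)
  key
}

key = get_api_key("ZEIT_KEY")
key
#> [1] "PUT_YOUR_KEY_HERE"
```

Checking for an empty string up front gives a clearer error than passing a blank key on to a web service.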

Use of the .Renviron file for storing settings such as library paths and API keys is efficient because it reduces the need to update your settings for every R session. Furthermore, the same .Renviron file will work across different platforms so keep it stored safely.

2.4.6.1 Example .Renviron file

My .Renviron file has grown over the years. I often switch between my desktop and laptop computers, so to maintain a consistent working environment, I have the same .Renviron file on all of my machines. As well as containing an R_LIBS entry and some API keys, my .Renviron has a few other lines:

  • TMPDIR=/data/R_tmp/. When R is running, it creates temporary copies. On my work machine, the default directory is a network drive.

  • R_COMPILE_PKGS=3. Byte compile all packages (covered in Chapter 3).

  • R_LIBS_SITE=/usr/lib/R/site-library:/usr/lib/R/library. I explicitly state where to look for packages. My University has a site-wide directory that contains out of date packages. I want to avoid using this directory.

  • R_DEFAULT_PACKAGES=utils,grDevices,graphics,stats,methods. Explicitly state the packages to load. Note I don’t load the datasets package, but I ensure that methods is always loaded. Due to historical reasons, the methods package isn’t loaded by default in certain applications, e.g. Rscript.

Exercises

  1. What are the three locations where the startup files are stored? Where are these locations on your computer?

  2. For each location, does a .Rprofile or .Renviron file exist?
  3. Create a .Rprofile file in your current working directory that prints the message Happy efficient R programming each time you start R at this location.
  4. What happens to the startup files in R_HOME if you create them in HOME or local project directories?


  1. See vignette("api-packages") from the httr package for more on this.