Chapter 1 R, RStudio, and packages

Please download the most recent version of R (R Core Team 2018). RStudio will make your life much easier, so please download that too. This website was made in RStudio via R Markdown (Allaire et al. 2018), knitr (Xie 2015), and bookdown (Xie 2018).

1.1 R

R is "a freely available language and environment for statistical computing and graphics which provides a wide variety of statistical and graphical techniques". R is thus a programming language but also an environment in which you can code and do some fancy statistical and graphical things (similar to, for instance, SPSS, Excel, or Stata).

There are major advantages to using R over other statistical packages and programs: a) it is free, b) it is extremely flexible, c) it typically includes the most up-to-date and novel statistical techniques, and d) it has amazing visualisation capabilities. The downside is that R is reasonably hard to learn. R is made for statistical programming and reproducible "behaviour", not for ease of use for beginners. Particularly in the past, getting your data into R and calculating some group means or correlations was a significant task compared to the few clicks needed in SPSS. Luckily, the times, they are changing, and many developments have made R much easier to use and friendlier for beginners. There is no doubt in our minds that the costs of learning R relative to other programs are well worth it. Even more so in this day and age, where there is a major push for every part of the scientific process to be open and reproducible by others.

1.1.1 The R-environment

The R-environment looks very different from most other statistical software packages. This is mainly due to the fact that R is "command-based" rather than "point-and-click": you have to tell R what to do with written commands, and the language that you use for that is the R-language. In essence, you are programming when using R. The things you can do in R are near infinite. Below are two simple examples: 1) we can tell R to add 2 and 2 (which we can do in a way that is similar to 'natural' language), and 2) we can generate 100 numbers from a normal distribution with a mean of 175 and a standard deviation of 12 (more or less the distribution of female heights, in cm, in the Netherlands) using the "rnorm" function. This may not seem particularly useful, but the ability to generate numbers can come in handy during statistical analyses.
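The two examples described above can be typed directly into R. This is a minimal sketch: the variable name `heights` and the call to `set.seed()` are additions for illustration (the seed only makes the random draw repeatable).

```r
# Example 1: add 2 and 2
2 + 2  # prints [1] 4

# Example 2: draw 100 numbers from a normal distribution
# with a mean of 175 and a standard deviation of 12
set.seed(1)  # assumption: a fixed seed so the same numbers are drawn each time
heights <- rnorm(100, mean = 175, sd = 12)

length(heights)  # how many numbers were generated
mean(heights)    # close to, but not exactly, 175
```

Because the numbers are random, the sample mean and standard deviation will differ a little from the values passed to `rnorm()`.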

The R-environment


1.2 RStudio

Rather than working directly in the R-environment, we'll be working in the RStudio environment. RStudio is a free environment for working with R that makes programming in R much, much easier. It takes away many frustrations that one might have when working in the R-environment. I myself haven't worked in the R-environment since I discovered RStudio. An added bonus is that RStudio's developers make many great, easy-to-use packages for R. In summary, download RStudio (do note that you also need to have R downloaded and installed).

The RStudio-environment


1.2.1 Rprojects

Starting an Rproject for each project you do is handy. An Rproject is, in essence, a folder with all the relevant R files and datasets. Keeping everything in one folder is an easy way to reduce frustration with file paths and with finding the plots or scripts you have saved.
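To illustrate why one folder helps with paths: when an Rproject is opened in RStudio, the working directory is set to the project folder, so files can be referred to with short relative paths. The sketch below is only an illustration under assumptions. The folder name `my_project`, the file `heights.csv`, and its contents are made up, and `setwd()` merely stands in for what opening the project would do.

```r
# Hypothetical project folder, created in a temporary location for this sketch
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Stand-in for opening the Rproject: make the project folder the working directory
old_wd <- setwd(project_dir)

# Save and read a small made-up dataset using paths relative to the project folder
write.csv(data.frame(height = c(170, 182, 165)),
          file.path("data", "heights.csv"), row.names = FALSE)
heights <- read.csv(file.path("data", "heights.csv"))

setwd(old_wd)  # restore the previous working directory
```

The relative path `data/heights.csv` keeps working if the whole project folder is moved or shared, which is not true of a path that starts at the root of your own computer.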

Starting an R-project


1.3 Installing packages

R works with "packages". That means that for some functionality to work, you need to install those packages first. If you want to make use of them during your R-session, you will also need to tell R that you will be using them. You can copy all code in this document to your own R/RStudio console. All text after the hashtag "#" is a comment, and will not be run by R as code.

We'll be installing the tidyverse-package, which is actually a collection of very useful packages (https://www.tidyverse.org/packages/). This may take some time.

install.packages("tidyverse") # installs the tidyverse-package
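A package only needs to be installed once, not in every session. As a hedged sketch, the helper below (the name `install_if_missing` is hypothetical, not part of R) installs a package only when it is not yet available, using base R's `requireNamespace()` to check.

```r
# Hypothetical helper: install a package only when it is not installed yet
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package is available afterwards
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call installs nothing;
# replace it with "tidyverse" to use the helper for this chapter
available <- install_if_missing("stats")
```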

We also need to load the package, to let R know that we will use it in this session.

library("tidyverse") # tell R you will use the functionality
## ── Attaching packages ─────────────────────────────────── tidyverse 1.3.2 ──
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ✔ purrr   0.3.5      
## ── Conflicts ────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
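The "Conflicts" lines in the output above report that two dplyr functions now hide functions of the same name from the stats package. As a small sketch, the `package::function` notation states explicitly which version is meant. The example uses only the stats package, which ships with R, and the input vector is made up.

```r
# After library("tidyverse"), a bare filter() refers to dplyr::filter().
# The stats version stays reachable through the package prefix.
x <- c(1, 2, 3, 4, 5)  # made-up input

# Three-point moving average using the stats version of filter()
moving_avg <- stats::filter(x, rep(1 / 3, 3))
as.numeric(moving_avg)
```

The first and last values are `NA` because a centred three-point average has no neighbour on one side there.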

1.3.1 References

R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, and Winston Chang. 2018. Rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.

Xie, Yihui. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. http://yihui.name/knitr/.

Xie, Yihui. 2018. Bookdown: Authoring Books and Technical Documents with R Markdown. https://CRAN.R-project.org/package=bookdown.