Chapter 18 Class assignment 4
Let’s create a visualization of the association between two continuous variables. Along the way we’ll add some colours, transform axes, and change some scales.
- Load data set & packages
For this visualization we’ll make use of the movies dataset. We’ll have to install and load the package ggplot2movies first. See here for an explanation of the dataset.
# install.packages("ggplot2movies")
library(tidyverse)
library(ggplot2movies)
movies <- movies %>% filter(budget > 100) # remove movies with implausibly low budgets (this also drops movies with a missing budget)
- Get a quick overview of the dataset by running:
head(movies)
## # A tibble: 6 × 24
## title year length budget rating votes r1 r2 r3 r4 r5 r6
## <chr> <int> <int> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 'G' Men 1935 85 4.50e5 7.2 281 0 4.5 4.5 4.5 4.5 14.5
## 2 'Manos' … 1966 74 1.9 e4 1.6 7996 74.5 4.5 4.5 4.5 4.5 4.5
## 3 'Til The… 1997 113 2.3 e7 4.8 799 4.5 4.5 4.5 14.5 14.5 14.5
## 4 .com for… 2002 96 5 e6 3.7 271 64.5 4.5 4.5 4.5 4.5 4.5
## 5 10 Thing… 1999 97 1.6 e7 6.7 19095 4.5 4.5 4.5 4.5 4.5 14.5
## 6 100 Mile… 2002 98 1.10e6 5.6 181 4.5 4.5 4.5 4.5 14.5 24.5
## # ℹ 12 more variables: r7 <dbl>, r8 <dbl>, r9 <dbl>, r10 <dbl>, mpaa <chr>,
## # Action <int>, Animation <int>, Comedy <int>, Drama <int>,
## # Documentary <int>, Romance <int>, Short <int>
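The first rows only show part of the picture. As a complement, the sketch below lists all columns and summarises the variables used in the rest of this assignment. It is one possible way to inspect the data, not a required step. The name movies_sub is introduced here for illustration only; the text above overwrites movies instead.

```r
library(tidyverse)
library(ggplot2movies)

# Same filter as in the text, stored under a hypothetical name
# so the original movies data frame is left untouched.
movies_sub <- movies %>% filter(budget > 100)

# Compact listing of every column with its type and first values
glimpse(movies_sub)

# Range of the two variables that will go on the axes
summary(movies_sub$budget)
summary(movies_sub$rating)

# Number of movies per value of the Action indicator (0 or 1)
count(movies_sub, Action)
```

The budget summary is worth a look before plotting: when values span several orders of magnitude, a linear axis squeezes most points into one corner.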
- Create a scatter plot showing the relationship between the budget of the movie (x-axis) and its rating (y-axis).
- Recreate the graph, but apply a log10 transformation to the x-axis.
- Add sensible breaks and labels to the x- and y-axes.
- Recreate the above graph, but now using geom_hex.
- Under the hood, geom_hex creates a fill colour, which we can change with scale_fill_gradient. Do so.
- The variable Action indicates whether the movie is categorized as an action movie. Create facets using this variable.
- Add a linear regression line.
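The steps in this assignment can be built up one plot at a time. What follows is a rough sketch of one possible approach, not a model answer. The object names (movies_sub, budget_axis, p_scatter, p_log, p_hex), the break positions, the axis labels and the colours are all choices made here for illustration. Note that geom_hex needs the hexbin package to be installed.

```r
library(tidyverse)
library(ggplot2movies)

# Same filter as in the text, stored under a hypothetical name
movies_sub <- movies %>% filter(budget > 100)

# Helper (hypothetical) returning a log10 x-axis with readable breaks
budget_axis <- function() {
  scale_x_log10(
    breaks = 10^(3:8),
    labels = c("1k", "10k", "100k", "1M", "10M", "100M")
  )
}

# Step 1: plain scatter plot of rating against budget
p_scatter <- ggplot(movies_sub, aes(x = budget, y = rating)) +
  geom_point(alpha = 0.3)

# Steps 2 and 3: log10 x-axis plus explicit breaks and labels on both axes
p_log <- p_scatter +
  budget_axis() +
  scale_y_continuous(breaks = 1:10) +
  labs(x = "Budget (US dollars, log10 scale)", y = "Average IMDB rating")

# Steps 4 to 7: hexagonal bins, custom fill gradient, facets, regression line
p_hex <- ggplot(movies_sub, aes(x = budget, y = rating)) +
  geom_hex() +
  budget_axis() +
  scale_y_continuous(breaks = 1:10) +
  scale_fill_gradient(low = "lightblue", high = "darkblue") +
  facet_wrap(
    ~ Action,
    labeller = as_labeller(c("0" = "Not action", "1" = "Action"))
  ) +
  # The scale transformation is applied before the fit, so this line
  # is a regression of rating on log10(budget) within each facet.
  geom_smooth(method = "lm", formula = y ~ x, colour = "red", se = FALSE) +
  labs(x = "Budget (US dollars, log10 scale)", y = "Average IMDB rating")

p_hex
```

Using scale_x_log10 rather than transforming the variable inside aes() keeps the axis labels in the original units, which makes the breaks easier to read.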