Homework 2

Psychology 2812B FW22

Weekly homework assignments consist of two components: a Lab Component that your TA will guide you through in the weekly lab session, and a Home Component that you are to complete on your own. You must hand in both components. Both will count towards your grade.

Submit homework on OWL by 5:00 pm London ON time on the date shown in the Class Schedule.

Submit your homework assignment as a single R Markdown file, named with your last name and the homework assignment number in the following format: gribble_n.Rmd, where n is the homework assignment number.

Here is the R Markdown template file for this assignment: lastname_2.Rmd.


Lab Component

1. Install/load the penguins dataset

In RStudio, install the palmerpenguins package, which contains the penguins dataset (you will only have to do this once):

install.packages("palmerpenguins")

Load the palmerpenguins package into memory (to access the penguins dataset you will have to do this each time you start RStudio):

library("palmerpenguins")
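If you would rather not re-run the install step by accident, the two steps above can be combined. This is a minimal sketch, not a required part of the assignment; it assumes you have an internet connection the first time it runs.

```r
# Install palmerpenguins only if it is not already available on this computer
if (!requireNamespace("palmerpenguins", quietly = TRUE)) {
  install.packages("palmerpenguins")
}

# Load the package so that the penguins dataset is accessible in this session
library(palmerpenguins)
```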

Have a look at the first few rows of the data table:

penguins
# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <fct>, year <int>

Also have a read through the documentation for the dataset by typing ?penguins in RStudio (it will show the documentation in the Help panel).
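A few other ways to inspect the data table are sketched below. These are standard R functions; none of them is required for the assignment, but they make it easier to spot the missing values (NA) visible in row 4 above.

```r
library(palmerpenguins)

head(penguins)            # first six rows of the table
str(penguins)             # column names and types
summary(penguins)         # per-column summaries, including NA counts
colSums(is.na(penguins))  # number of missing values in each column
```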

2. Scatterplot

Make a plot of flipper length as a function of body mass. Use the theme_bw() theme. Make the labels on the x-axis and y-axis more human-friendly than the default column names in the data tibble: use the labs() instruction with x = " " and y = " " arguments to specify labels for the x- and y-axes.

Don’t forget to load the tidyverse into memory each time you start up RStudio, using library(tidyverse).
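As an illustration of the pieces involved, here is a sketch of the same kind of plot built on two different columns (bill length and bill depth), so it is an analogue rather than the answer. The axis label text is my own choice.

```r
library(ggplot2)         # also loaded by library(tidyverse)
library(palmerpenguins)

# Scatterplot of bill depth as a function of bill length
p <- ggplot(data = penguins,
            mapping = aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +                                         # one point per penguin
  labs(x = "Bill length (mm)", y = "Bill depth (mm)") +  # human-friendly axis labels
  theme_bw()                                             # black-and-white theme

p  # display the plot
```

R will warn that a couple of rows were removed because they contain missing values. That is expected with this dataset.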

3. Colour by Species

Re-plot using different colours to code the three different penguin species.
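Colour coding is done by mapping a column onto the colour aesthetic inside aes(). The sketch below illustrates the idea on a different plot and a different grouping column (island), so you will need to adapt it.

```r
library(ggplot2)
library(palmerpenguins)

# Mapping a column to colour inside aes() gives each group its own colour
# and adds a legend automatically
p <- ggplot(data = penguins,
            mapping = aes(x = bill_length_mm, y = bill_depth_mm,
                          colour = island)) +
  geom_point() +
  labs(x = "Bill length (mm)", y = "Bill depth (mm)", colour = "Island") +
  theme_bw()

p
```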

Home Component

4. Smoothing lines

Re-plot and add a smoothing line using geom_smooth().

5. Linear fit

Re-plot using a linear fit smoothing line. Hint: the geom_smooth() function takes a method argument and if you give it method = "lm" it will use a linear fit.
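To see the difference between the default smoother and a linear fit, here is a sketch on the bill measurements rather than the flipper and body mass columns. The object names p_base, p_default and p_linear are my own.

```r
library(ggplot2)
library(palmerpenguins)

# A base scatterplot that both versions build on
p_base <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  theme_bw()

# Default smoother: for a dataset this small, geom_smooth() picks a loess curve
p_default <- p_base + geom_smooth()

# Linear fit: a straight regression line with a shaded confidence band
p_linear <- p_base + geom_smooth(method = "lm")

p_default
p_linear
```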

6. Facet by island

Re-plot using different facets for each island, with the facets arranged along 1 row. Hint: use the facet_wrap() function with nrow = 1.
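Faceting splits one plot into panels, one per value of a column. The sketch below facets by year instead of island, purely to show how facet_wrap() and nrow fit together.

```r
library(ggplot2)
library(palmerpenguins)

# One panel per study year, all panels laid out in a single row
p <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +
  facet_wrap(~ year, nrow = 1) +
  theme_bw()

p
```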

7. Histogram

Generate a histogram of flipper lengths. Hint: there is a geometric object called geom_histogram(). Set the x label to Flipper Length (mm) (hint: there is a layer instruction for labels called labs()).
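A histogram only needs an x aesthetic, because the bar heights are counts that ggplot computes for you. Here is a sketch using body mass instead of flipper length; the binwidth of 200 g is an arbitrary choice of mine, and leaving it out simply gives the default of 30 bins.

```r
library(ggplot2)
library(palmerpenguins)

# Histogram of body mass; only x is mapped, the counts are computed by ggplot
p <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(binwidth = 200) +   # each bar covers 200 g
  labs(x = "Body mass (g)") +
  theme_bw()

p
```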

8. Colour by species

Re-plot the histogram above but specify that the fill color of the bars should be coded by species (fill = species).

Hint: this is an aesthetic mapping (a mapping of variables in the dataset onto different aesthetic qualities of the plot) and so you will have to put the fill = species instruction within an aes() function call. Where should this aesthetic mapping live? Play around with different possibilities. Can you simply add fill = species to an aes() call that lives within the call to ggplot()? Could you instead add a new aes() call within the geom_histogram() instruction? Decide on one solution and write a sentence explaining your choice.

Specify the alpha transparency of the bars to be alpha = 0.8 so that you can see the different colors overlapping.
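To make the two placements concrete, the sketch below shows both on a body mass histogram filled by island, not by species. One thing to be aware of: by default geom_histogram() stacks the groups on top of each other rather than overlapping them. The third version uses position = "identity", which is my addition and not something the assignment asks for, to draw the groups over one another so the transparency becomes visible.

```r
library(ggplot2)
library(palmerpenguins)

# Placement A: fill is mapped in ggplot(), so every layer added later inherits it
p_global <- ggplot(penguins, aes(x = body_mass_g, fill = island)) +
  geom_histogram(alpha = 0.8, binwidth = 200)

# Placement B: fill is mapped inside the geom, so it applies to that layer only
p_local <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(aes(fill = island), alpha = 0.8, binwidth = 200)

# Optional variant: draw the groups over each other instead of stacking them
p_overlap <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(aes(fill = island), alpha = 0.8, binwidth = 200,
                 position = "identity")

p_global
p_local
p_overlap
```

Note that alpha is set outside aes() because it is a fixed value, not a mapping from a column in the data.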

9. Boxplot

Here is some code that layers a box plot of flipper lengths across species together with jittered points of each penguin. Study each component and play with it to understand how it works.

ggplot(data = penguins,
       mapping = aes(x = species, y = flipper_length_mm)) + 
  geom_boxplot(aes(color = species), width = 0.3, show.legend = FALSE) + 
  geom_jitter(aes(color = species), alpha = 0.4, show.legend = FALSE, width = 0.2) + 
  scale_color_manual(values = c("darkorange","purple","cyan4")) +
  labs(x = "",
       y = "Flipper length (mm)") + 
  theme_bw()    

Re-plot this figure using a violin plot instead of a box plot. Hint: there is a geom object for violin plots. Set the width of the violins to 0.4, add a title and a caption to the plot, and change the base_size of the fonts to 16.

Note: since we are using a random jitter for the points, the location of the points on your plot may not look exactly like they do on mine. That is OK.
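If you want your own plot to look the same every time you knit your document, one option is to give the jitter a fixed random seed. This is a sketch and not required for the assignment; the seed value 1 is arbitrary, and your points will still differ from mine.

```r
library(ggplot2)
library(palmerpenguins)

# position_jitter() accepts a seed, which makes the random offsets repeatable
p <- ggplot(data = penguins,
            mapping = aes(x = species, y = flipper_length_mm)) +
  geom_point(aes(color = species), alpha = 0.4, show.legend = FALSE,
             position = position_jitter(width = 0.2, seed = 1)) +
  labs(x = "", y = "Flipper length (mm)") +
  theme_bw()

p
```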

Help!

How do you look things up when you don’t know what to do?
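A few built-in ways to find help from the RStudio console are sketched below. These are standard R tools; which one is most useful depends on whether you already know the name of the function you need.

```r
library(ggplot2)

?geom_smooth          # open the help page for a function you know by name
help("facet_wrap")    # the same thing, written as a function call
??violin              # search all installed help pages for a keyword

# List the loaded functions whose names start with "geom_"
geoms <- apropos("^geom_")
head(geoms)

# Show the arguments a function accepts, without opening the help page
args(geom_violin)
```

The Examples section at the bottom of each help page is often the quickest way to see how a function is meant to be used.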