Data wrangling & visualization II — dplyr

Week 2


Overview

This week we will work through some concepts in data wrangling using dplyr, a package in the tidyverse collection of R packages, working in RStudio.

  1. Work through the required readings below.
  2. Review the Week 2 slides.
  3. Attend the lab session to go through the Lab Component of Homework 2.
  4. Complete the Home Component of Homework 2.

Required Readings

Data Frames

Open RStudio on your computer and watch this video: Data Frames by Matt Crump. After watching it, you should be able to answer the following questions:

  1. What is a data frame?
  2. How do you make a data frame?
  3. How do you change the column names of a data frame?
  4. How do you select the data in one column of a data frame?
  5. How do you convert strings to factors?
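
As a quick self-check on the questions above, here is a minimal sketch in base R. The data frame `scores` and its contents are made up for illustration and do not come from the video; there are other ways to do each step.

```r
# Make a data frame from two vectors (hypothetical example data).
# stringsAsFactors = FALSE keeps the names as plain strings in any R version.
scores <- data.frame(
  a = c("ann", "bob", "cy"),
  b = c(90, 85, 77),
  stringsAsFactors = FALSE
)

# Change the column names
names(scores) <- c("student", "score")

# Select the data in one column (two equivalent ways)
scores$score
scores[["score"]]

# Convert a column of strings to a factor
scores$student <- factor(scores$student)
levels(scores$student)
```

Running `str(scores)` after each step is a handy way to see how the structure of the data frame changes.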

Data Transformations using dplyr

Open RStudio and work through the following sections in R for Data Science by Hadley Wickham & Garrett Grolemund. In lecture we will focus primarily on Chapter 4 (Data transformations), so make sure you understand it well, but you should also familiarize yourself with the material in the other readings. Chapter 7 will be useful as well.
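
To give a feel for what the data transformation reading covers, here is a small sketch that chains the main dplyr verbs together. The `flights_toy` data and its column names are invented for this example and are not taken from the book.

```r
library(dplyr)

# Hypothetical toy data: one row per flight
flights_toy <- tibble(
  carrier   = c("AA", "AA", "UA", "UA", "DL"),
  dep_delay = c(5, 20, -3, 42, 0),       # departure delay in minutes
  distance  = c(500, 1200, 800, 2400, 300)
)

summary_by_carrier <- flights_toy %>%
  filter(distance >= 500) %>%                 # keep rows that meet a condition
  mutate(delay_hours = dep_delay / 60) %>%    # add a new column
  group_by(carrier) %>%                       # work carrier by carrier
  summarise(
    mean_delay_hours = mean(delay_hours),     # one summary row per carrier
    flights = n()
  ) %>%
  arrange(desc(mean_delay_hours))             # sort, largest delay first

summary_by_carrier
```

Each verb takes a data frame and returns a new one, which is why the steps can be piped together with `%>%`. Try removing one step at a time to see what each verb contributes.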

Additional supporting materials

For a more complete understanding of data formats and the concept of “tidy data”, read through the following chapters as well. Chapter 8 on data import may also be useful to you in the future.

There are additional chapters in R for Data Science that may be useful to you in the future, as you get more familiar with R and with the functions in the tidyverse packages.

Here are some readings on programming in the R language, for those of you who have previous coding experience or who want to learn the language in more detail.