Data wrangling & visualization II — dplyr
Week 2
Overview
This week we will work through some data wrangling concepts using dplyr, a package in the tidyverse collection of R packages, which we will run in RStudio.
- Work through the mandatory readings below.
- Review the week 2 slides.
- Attend the lab session to go through the Lab Component of Homework 2.
- Complete the Home Component of Homework 2.
Required Readings
Data Frames
Open up RStudio on your computer and watch this video: Data Frames by Matt Crump. You should be able to answer the following questions:
- What is a data frame?
- How do you make a data frame?
- How do you change the column names of a data frame?
- How do you select the data in one column of a data frame?
- How do you convert strings to factors?
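As a quick check on the questions above, here is a minimal base R sketch. The data and the names `scores`, `participant`, `condition`, and `score` are made up for illustration and do not come from the video.

```r
# Make a small data frame (hypothetical example data).
# stringsAsFactors = FALSE keeps text columns as character strings;
# this is already the default in R 4.0 and later.
scores <- data.frame(
  name  = c("Ana", "Ben", "Cy"),
  group = c("control", "treatment", "control"),
  score = c(12, 15, 9),
  stringsAsFactors = FALSE
)

# Inspect the structure: rows are observations, columns are variables
str(scores)

# Change the column names
names(scores) <- c("participant", "condition", "score")

# Select the data in one column (both forms return a plain vector)
scores$score
scores[["score"]]

# Convert a string (character) column to a factor
scores$condition <- factor(scores$condition)
levels(scores$condition)
```

Try running each line on its own in the RStudio console and looking at `scores` in the Environment pane after each change.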
Data Transformations using dplyr
Open RStudio and work through the following sections in R for Data Science by Hadley Wickham & Garrett Grolemund. In lecture we will focus primarily on Chapter 3 (Data transformation), which is important to understand, but you should also familiarize yourself with the material in the other readings. Chapter 6 (Workflow: scripts) will be useful as well.
- Read 3 Data transformation
- Read 6 Workflow: scripts
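To give a feel for what the data transformation chapter covers, here is a short sketch of the main dplyr verbs. The `scores` data and the assumption that scores are out of 20 points are hypothetical, chosen only for illustration. The native pipe `|>` needs R 4.1 or later; the `%>%` pipe would work the same way here.

```r
library(dplyr)

# Hypothetical example data
scores <- data.frame(
  participant = c("Ana", "Ben", "Cy", "Dee"),
  condition   = c("control", "treatment", "control", "treatment"),
  score       = c(12, 15, 9, 18)
)

# Row and column operations chained with the pipe
result <- scores |>
  filter(score > 10) |>                         # keep rows with score above 10
  mutate(score_pct = score / 20 * 100) |>       # add a column (assumes a 20-point maximum)
  select(participant, condition, score_pct) |>  # keep only these columns
  arrange(desc(score_pct))                      # sort from highest to lowest

# Grouped summary: one row per condition
by_condition <- scores |>
  group_by(condition) |>
  summarise(mean_score = mean(score), n = n())

result
by_condition
```

Each verb takes a data frame as its first argument and returns a new data frame, which is what makes chaining them with the pipe possible.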
Additional supporting materials
For a more complete understanding of data formats and the concept of “tidy data”, read through the following chapters as well. The chapter on data import could also be useful to you in the future.
There are additional chapters in R for Data Science that may be useful as you get more familiar with R and with the functions in the tidyverse packages.
- 13 Numbers
- 14 Strings
- 16 Factors
- 17 Dates and times
- 18 Missing values
- 19 Joins (see also “let’s talk about joins” for some extra help)
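Since joins tend to need the most practice, here is a small sketch of three common dplyr joins. The two tables, `scores` and `ages`, are hypothetical and share a `participant` key column.

```r
library(dplyr)

# Two hypothetical tables that share a key column, participant
scores <- data.frame(
  participant = c("Ana", "Ben", "Cy"),
  score       = c(12, 15, 9)
)
ages <- data.frame(
  participant = c("Ana", "Ben", "Dee"),
  age         = c(21, 34, 28)
)

# left_join keeps every row of scores; Cy has no match, so age is NA
left <- left_join(scores, ages, by = "participant")

# inner_join keeps only participants found in both tables
inner <- inner_join(scores, ages, by = "participant")

# anti_join returns rows of scores with no match in ages
unmatched <- anti_join(scores, ages, by = "participant")

left
inner
unmatched
```

Checking the number of rows before and after a join is a simple way to catch unexpected missing or duplicated keys.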
Here are some readings on programming in the R language (for those of you with previous coding experience or who want to learn details about programming in the R language).