Data wrangling & visualization II — dplyr
Week 3
Overview
This week we will work through some concepts in data wrangling using the dplyr package, which is part of the tidyverse, a collection of add-on packages for R that we will use from within RStudio.
- Work through the mandatory readings below.
- Review the week 3 slides.
- Attend the lab session to go through the Lab Component of Homework 3.
- Complete the Home Component of Homework 3.
Required Readings
Data Frames
Open up RStudio on your computer and watch this video: Data Frames by Matt Crump. You should be able to answer the following questions:
- What is a data frame?
- How do you make a data frame?
- How do you change the column names of a data frame?
- How do you select the data in one column of a data frame?
- How do you convert strings to factors?
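As a quick self-check on the questions above, here is a minimal sketch in base R. The data frame `scores`, its column names, and its values are all made up for illustration and do not come from the video; treat this as one possible way to do each step, not the only one.

```r
# Make a data frame: each argument becomes a column, and all columns
# must have the same length. (Names and values are hypothetical.)
scores <- data.frame(
  name  = c("Ana", "Ben", "Cy"),
  group = c("control", "treatment", "control"),
  score = c(12, 15, 9)
)

# Change the column names by assigning a new character vector
names(scores) <- c("participant", "condition", "score")

# Select the data in one column; both forms return a vector
scores$score
scores[["condition"]]

# Convert a column of strings to a factor, then inspect its levels
scores$condition <- as.factor(scores$condition)
levels(scores$condition)
```

Running `str(scores)` afterwards is a convenient way to confirm the type of each column.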
Data Transformations using dplyr
Open RStudio and work through the following sections of R for Data Science by Hadley Wickham & Garrett Grolemund. Lecture will focus primarily on Chapter 5 (Data transformation), which is key for completing this week’s homework assignment, but you should also familiarize yourself with the material in the other readings.
- Read 5 Data transformation
- Read 6 Workflow: scripts
- Read 7 Exploratory Data Analysis
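To give a sense of what the main dplyr verbs look like when chained together, here is a small sketch. It assumes the dplyr package is installed and uses `mtcars`, a data set that ships with R, purely for convenience; the chapter works with a different data set, and the object name `cyl_summary` and the derived column names are hypothetical.

```r
library(dplyr)

cyl_summary <- mtcars %>%
  filter(mpg > 15) %>%                  # keep rows that meet a condition
  select(mpg, cyl, wt) %>%              # keep only these columns
  mutate(wt_kg = wt * 453.6) %>%        # new column; wt is in units of 1000 lbs
  group_by(cyl) %>%                     # one group per cylinder count
  summarise(
    n_cars     = n(),                   # number of rows in each group
    mean_mpg   = mean(mpg),
    mean_wt_kg = mean(wt_kg)
  ) %>%
  arrange(desc(mean_mpg))               # sort groups, highest mean mpg first

cyl_summary
```

Each verb takes a data frame and returns a new one, which is why the steps can be piped together with `%>%`; running the pipeline one line at a time is a good way to see what each verb does.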
Other readings for your reference (not required)
For a more complete understanding of the tibble data object (like a data frame but with some special properties) and the concept of “tidy data”, read through the following chapters as well. Chapter 11 on data import could be useful to you in the future as well.
There are additional chapters in R for Data Science that may be useful to you in the future, as you get more familiar with R and with the functions in the tidyverse packages.
You may simply want to familiarize yourself with what’s shown in these chapters, in case you need to look up this information later. These are not mandatory.
Here are some readings on programming in the R language (for those of you with previous coding experience or who want to learn details about programming in the R language). These are also optional, not mandatory.