Data wrangling & visualization I — ggplot

Week 2

Overview

  1. Work through the Readings (see below)
  2. Review the week 2 slides
  3. Attend the lab session to go through the lab component of Homework 2
  4. Complete Homework 2

This week I will introduce the course and our approach for the term. We will start in on data wrangling and data visualization in RStudio. I will introduce some basic concepts in the lecture.

You should go through the readings on your own to get a fuller appreciation of the breadth and depth of the tools available for data wrangling and visualization. We will use these tools throughout the course, and your skill level will increase the more you use them. You will likely have to refer back to the data wrangling and visualization resources frequently as we work through the course. As will I. Some of the commands are esoteric.

This week we will work through data visualization using ggplot2, one of the packages in the tidyverse, a collection of R packages that we will use from within RStudio.
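To give a sense of what ggplot2 code looks like, here is a minimal sketch. It assumes the ggplot2 package is already installed, and it uses mpg, an example dataset that ships with ggplot2. The dataset and the choice of variables are only for illustration and are not taken from the homework.

```r
# Load ggplot2 (assumes it is installed, e.g. via install.packages("tidyverse"))
library(ggplot2)

# mpg is an example data frame bundled with ggplot2 (illustrative choice).
# Map engine displacement to the x axis and highway mileage to the y axis,
# then add a layer of points to make a scatterplot.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point()

# Printing the plot object draws it (in RStudio, in the Plots pane)
print(p)
```

Most plots follow the same pattern: a data frame, an aesthetic mapping that says which variables go where, and one or more geom layers added with +.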

The homework assignment will ask you to complete a series of basic data wrangling and visualization tasks.

Attend the lab/tutorial to get help with the homework assignment and/or to ask questions about the lecture material.

Next week we will look at more advanced data wrangling.

Readings

I will go through some of the highlights in lecture and demonstrate some of the main principles. However, you are responsible for everything in the readings, not just what I demonstrate in the lectures. The homework assignments will give you a sense of what the expectations are. The weekly labs are your opportunity to consolidate the material in the lectures and your readings, so that you can make progress on the homework assignments.

The goal of this course is not to overwhelm you with material. Below I provide some primary readings and then some other sources of information that are useful as references.

I’m calling these “readings” but really you should be doing more than just reading this material. Have RStudio open on your laptop and type in the sample code as you work through the material below. Reading + typing-in-the-code + playing-around will help you learn much better than just reading on its own. Avoid copying and pasting code. Type it in. It’s much more effective for learning the material.

Required Readings:

Open up RStudio on your computer and carefully work through the following sections in R for Data Science by Hadley Wickham & Garrett Grolemund:

Optional Resources:

It’s also worth looking over the following sections in ggplot2: Elegant Graphics for Data Analysis by Wickham, Navarro, & Pedersen. You probably don’t need to go through these in as much detail as the readings above, but skim them at least so that you know where to find information when you need it: