Data Manipulation in R

College of Forestry Workshop

An introduction to data manipulation in R via dplyr and tidyr.

This two-hour workshop is aimed at graduate students who have been introduced to R in statistics classes but haven’t had any training in working with data in R.

The workshop covers how to:

  • Make data summaries by group
  • Filter rows based on conditions
  • Select specific columns
  • Add new variables
  • Change the format of datasets (i.e., reshape datasets)
  • Join datasets together
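
As a rough sketch of what these six tasks can look like in code, the example below uses dplyr and tidyr on a small made-up dataset. The `trees` and `plots` tables and all of their column names are hypothetical and are not taken from the workshop materials. The reshaping step uses `pivot_longer()`, which assumes tidyr 1.0.0 or later; the workshop itself may use different functions.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: tree diameters (cm) measured in two years
trees <- tibble(
  plot     = c("A", "A", "B", "B"),
  species  = c("PSME", "TSHE", "PSME", "TSHE"),
  dbh_2019 = c(30.1, 22.4, 41.0, 18.3),
  dbh_2020 = c(30.9, 23.0, 41.8, 18.9)
)

# Hypothetical plot-level data to join onto the tree data
plots <- tibble(
  plot        = c("A", "B"),
  elevation_m = c(450, 820)
)

# Make a data summary by group: mean 2020 diameter per plot
by_plot <- trees %>%
  group_by(plot) %>%
  summarise(mean_dbh = mean(dbh_2020))

# Filter rows: keep only trees larger than 25 cm in 2020
big <- filter(trees, dbh_2020 > 25)

# Select specific columns
slim <- select(trees, plot, species)

# Add a new variable: diameter growth between the two years
grown <- mutate(trees, growth = dbh_2020 - dbh_2019)

# Reshape from wide to long: one row per tree per year
long <- pivot_longer(
  trees,
  cols         = c(dbh_2019, dbh_2020),
  names_to     = "year",
  names_prefix = "dbh_",
  values_to    = "dbh"
)

# Join datasets: add plot elevation to each tree
joined <- left_join(trees, plots, by = "plot")
```

Each verb takes a data frame as its first argument and returns a new data frame, which is what makes the steps easy to chain together.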

Along the way, students learn how to use the pipe operator to chain several data manipulation steps together. Students have time to practice data manipulation and reshaping using the babynames dataset from the babynames package.
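
To give a sense of what chaining with the pipe looks like, here is a minimal sketch using the babynames dataset. It assumes the babynames package is installed and uses dplyr 1.0.0 or later for the `.groups` argument. The particular question asked (most common names since 2000) and the `top_names` object are illustrative choices, not an exercise from the workshop.

```r
library(dplyr)
library(babynames)  # assumption: the babynames package is installed

# Each step's result is passed by %>% as the first argument of the next step
top_names <- babynames %>%
  filter(year >= 2000) %>%                         # keep births from 2000 onward
  group_by(sex, name) %>%                          # one group per sex-name pair
  summarise(total = sum(n), .groups = "drop") %>%  # total babies per group
  arrange(desc(total))                             # most common names first

head(top_names)
```

Without the pipe, the same steps would need either nested function calls or a series of intermediate objects, both of which are harder to read.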

I provide an R script to run code from during the workshop, along with a PDF document. The PDF is a written version of the workshop, including code and output, to be used as a reference.