Large transportation-related data sets are becoming increasingly available to practitioners. The size and complexity of these data sets can outstrip the tools and skills of even savvy data analysts. R, an open-source statistical computing language, can greatly improve an analyst’s productivity and expand their skill set.
R allows users to:
- retrieve, merge, format and clean data,
- perform data augmentations,
- perform analyses,
- produce summary graphics, and
- make the process repeatable and shareable.
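As a rough illustration of the steps listed above, the following base R sketch cleans, merges, augments, summarizes, and plots a small table. The data frames, column names, and values are hypothetical and invented for this example; they are not drawn from PORTAL or any real data set.

```r
# Hypothetical hourly detector readings (made-up values for illustration).
volumes <- data.frame(
  station_id = c(1, 1, 2, 2, 3),
  hour       = c(7, 8, 7, 8, 7),
  volume     = c(420, 515, NA, 610, 380)
)

# Hypothetical station metadata to merge with the readings.
stations <- data.frame(
  station_id = c(1, 2, 3),
  corridor   = c("A", "A", "B")
)

# Clean: drop readings with a missing volume.
clean <- volumes[!is.na(volumes$volume), ]

# Merge: attach the corridor label to each reading.
merged <- merge(clean, stations, by = "station_id")

# Augment: derive vehicles per minute from the hourly count.
merged$veh_per_min <- merged$volume / 60

# Analyze: mean hourly volume for each corridor.
summary_tbl <- aggregate(volume ~ corridor, data = merged, FUN = mean)

# Summarize graphically: one bar per corridor.
barplot(
  summary_tbl$volume,
  names.arg = summary_tbl$corridor,
  xlab = "Corridor",
  ylab = "Mean hourly volume"
)
```

Because every step lives in one script, rerunning the file reproduces the same table and plot, and sharing the file shares the whole process.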
This workshop will showcase applications from other organizations that use R with transportation-related data, highlight best practices, and provide reproducible examples for users who wish to learn R and apply it to PORTAL data. After finishing this workshop, participants will have a better understanding of how R can be applied to large, unruly data sets. Participants should have a basic knowledge of data analysis and an interest in learning R.
Learning Objectives
- Participants unfamiliar with R will come away with a better understanding of its applications and its utility in analyzing large data sets, drawn from reproducible examples and a showcase of work products from other organizations.
- Statistics can be intimidating, but R can remove some of the barriers and make advanced statistics more approachable. Participants will receive reproducible examples that demonstrate how to examine common data sets used in transportation.
- Communicating the story is vital for good decision making. Wrangling and analyzing data can fall short without a compelling story and graphics that illustrate its lessons. Participants will learn what R offers for visualization so they can make the most of the lessons they glean from their data analysis efforts.