Introduction to the Tidyverse



1. Introduction

Exegetic Analytics is a Data Science consultancy specialising in data acquisition and augmentation, data preparation, predictive analytics and machine learning. Our consultants are based in Durban and Cape Town and we engage with clients all over the world. Our products and services are used across a range of industries, including Aerospace, Education, Finance, Food, Politics, Security and Transport.

Exegetic Analytics also offers training, with experienced and knowledgeable facilitators. Our courses focus on practical applications, working through examples and exercises based on real-world datasets.

All of our training packages include access to:

  • our online development environment and
  • detailed course material, to which participants retain access after the training has concluded.

For more information about what we do, you can refer to our website.

Take a look at our full list of courses to see what other training we have on offer.

Contact Us

If this proposal is of interest to you, or you would like to hear more about what we do, you can get in touch at training@exegetic.biz or on +27 (0)73 805 7439.

2. Course Description

The Tidyverse is a collection of integrated packages for performing Data Science by applying “tidy” data principles. The packages are all based on a common design philosophy and share a consistent grammar and set of data structures. They form a broad basis for a wide range of analyses in R.

This course is an excellent entry point to working with data in R.

Details

Duration: 2 days
Who should attend? This course is suitable for anybody who uses a spreadsheet to work with data. No prior programming knowledge is required.
Objectives: Learn to use the Tidyverse for basic data analysis and visualisation.
Outcomes: Participants will be comfortable using packages from the Tidyverse to work with data. Specifically, they will be able to:

  • import data from CSV files and spreadsheets;
  • perform an array of data cleaning and manipulation operations; and
  • create attractive visualisations.
Setup: A laptop with recent versions of R and RStudio.

3. Course Outline

Day 1

  • Introduction
  • {tibble} — An alternative to the data.frame
  • {magrittr} — Pipes!
  • {readr} and {readxl} — Reading Data
    • CSV (and other delimited) files
    • Spreadsheets
  • {dplyr} — Wrangling Data
    • Working with columns: select(), rename() and mutate()
    • Working with rows: filter() and arrange()
    • Aggregation: group_by() and summarise()
  • {ggplot2} — Plotting Data
    • Components of a visualisation (geoms)
    • Mapping data to components (aesthetics)
    • Faceting
    • Themes

Day 2

  • {tidyr} — Pivoting Data
    • What is “tidy data”?
    • Pivoting data with gather() and spread()
    • Splitting and combining data with separate() and unite()
  • {stringr} and {glue} — Working with Strings
    • Splitting and combining strings
    • Extracting substrings
    • Pattern matching (and a short introduction to Regular Expressions)
    • Dealing with whitespace
    • String interpolation
  • {lubridate} — Working with Dates and Times
    • {anytime} — Handling dates and times in a wide variety of formats!
  • {forcats} — Working with Categorical Data
    • Renaming levels
    • Changing order of levels
    • Dropping empty levels
    • Lumping levels
  • {purrr} — Functional Programming Tools
    • Introduction to Functional Programming
    • Applying a function to elements of a vector (or list) with map()
    • Using map2() and pmap() for multivariate functions
    • Generating side-effects with walk()
    • Dealing with errors using insistently() and adding delays

Training Philosophy

Our training emphasises practical skills. So, although you'll be learning concepts and theory, you'll see how everything is applied in the real world. We will work through examples and exercises based on real datasets.

Requirements

All you'll need is a computer with a browser and a decent internet connection. We'll be using an online development environment. This means that you can focus on learning and not on solving technical problems.

Of course, we are happy to help you get your local environment set up too! You can start by following these instructions.

Package

The training package includes access to:

  • our online development environment and
  • detailed course material (slides and scripts).
