Diagnosing the Long Distance Runner with Django, D3 and R


I presented this talk at the Bulgaria Web Summit.

TL;DR Building the ultimate runners' portal using Django, D3 and R.

Runners enjoy running. Obviously! They're also somewhat obsessed with statistics: weekly mileage, marathon times, resting heart rate, medal count... Back in the day these would be recorded by hand in a dog-eared logbook. But we're too sophisticated for that now.

Services like Strava allow a runner to track their training statistics online and easily make comparisons with their running mates. However, the ultimate test of a runner is the race. And race statistics are distributed across a variety of sites on the internet. There's no single resource.

Wouldn't it be handy if all of those results were aggregated in one place? A runner could then see a consolidated picture of their racing prowess. It'd be even cooler if that picture were augmented with responsive visualisations and predictive models.

I'll be talking about an ambitious project to carefully synthesise those distributed race statistics, using R for scraping and modelling, Django for data management and presentation, and D3 for visualisation.

I'll also discuss the models that I've built using this meticulously validated set of data and the intriguing things that they've revealed about that curious animal, the Long Distance Runner.

Categorically Variable