Categorically Variable

Analysis of Feedback from satRday [Cape Town] 2017

We recently announced the second satRday (Cape Town) conference scheduled to take place on 17 March 2018. Obviously we want this to be bigger and better than this year’s event, so we are paying careful attention to the feedback that we received from the first event.

This is a quick analysis of the feedback. We sold 192 tickets and gave out 11 complimentary tickets to the event. There were 107 responses to the feedback survey, so we heard back from more than half of the people who attended. Hopefully that is a representative sample.
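
As a quick sanity check on the "more than half" claim, the response rate can be worked out from the figures above. This is only a sketch: the variable names are mine, and it assumes that every ticket holder (paid or complimentary) actually attended.

```python
# Figures quoted in the post.
tickets_sold = 192
complimentary = 11
responses = 107

# Assumption: everyone holding a ticket attended the event.
attendees = tickets_sold + complimentary
response_rate = responses / attendees

print(f"{attendees} attendees, response rate {response_rate:.1%}")
# prints "203 attendees, response rate 52.7%"
```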

Durban Twitter Analysis

I was invited to give a talk at Digifest (Durban University of Technology) on 10 November 2017. Looking at the other speakers and talks on the programme I realised that my normal range of topics would not be suitable. I needed to do something more in line with their mission to “celebrate the creative spirit through multimedia projects from disciplines such as visual and performing arts” and to promote “collaboration across art, science and technology”. Definitely outside my current domain, but consistent with many of the things that I have been aspiring to.

To be honest, I was pleased to be invited, but when I sat down to consider what I would talk about, I found myself at a loss. I’m not currently engaged in anything that ticks many of those boxes.

But I am loath to turn down an opportunity to speak. So I made a plan. In retrospect it was not a terribly good plan. But it was workable. I decided to speak about gauging sentiment relating to the city of Durban using data from Twitter. You can find my talk outline here. This post touches on some of my results.

Hello Durban, tell me how you're doing!

Everybody speaks their mind on Social Media. So, what are people saying about Durban? I'll harvest the Twitter stream to answer this question, making use of some simple analytics, sentiment analysis and machine learning.

This talk was presented at Digifest (Durban University of Technology) on 10 November 2017.

Conference Bucket List

A list of conferences I’d like to speak at in the next few years.

Installing NVIDIA Graphics Driver on Ubuntu

Recipe for installing the NVIDIA binary drivers on Ubuntu.

Running OSRM with Docker

I’ve now been through the process of setting up OSRM a few times. While it’s not exactly taxing, it seemed like a prime candidate for automation.

Exporting HTML Presentations to PDF

Building a presentation with reveal.js is such a pleasure. And the results look so good. I seriously doubt that I will ever use anything like PowerPoint again. Although it’s possible to export a presentation directly to PDF using a style sheet, this doesn’t always work perfectly (IMHO).

Fortunately there’s another way: decktape. It works with reveal.js and a bunch of other HTML5 presentation frameworks.

Quick WordPress Install with Docker

I’ve just put together a WordPress site for my older daughter. It’s hosted on DigitalOcean and all of the infrastructure is handled with Docker. This post describes the steps in the (easy) install process.

Diagnosing Killed Jobs on EC2

I’ve got a long-running optimisation problem on an EC2 instance. Yesterday it was mysteriously killed. I shrugged it off as an anomaly and restarted the job. However, this morning it was killed again. Definitely not a coincidence! So I investigated. This is what I found and how I am resolving the problem.

Removing Redundant Hostnames with NGINX

While poring over my Google Analytics data I noticed the notification below.

Obviously this is not a train smash, but it is compromising the quality of my data. And it also offends my OCD. This is what I did to fix the problem.

Hosting a Plumber API on AWS

I’ve been putting together a small proof-of-concept API using R and plumber. It works flawlessly on my local machine and I was planning on deploying it on an EC2 instance to demo it for a client. However, I ran into a snag: despite opening the required port in my Security Group I was not able to access the API. This is what I needed to do to get it working.

Installing Docker on Ubuntu

This procedure works on both my laptop and a fresh EC2 instance.

Creating an S3 Bucket

There are many good reasons to use S3 (Simple Storage Service) storage. This is a quick overview of how to create an S3 bucket.

Web Scraping Workshop at PyCon 2017 (Cape Town)

In a little under a month PyCon 2017 will be happening in Cape Town. I’m really looking forward to the conference and rather excited about giving a workshop on Web Scraping in Python. This is the abstract for the workshop.

Creating an AWS Spot Instance

Spot Instances can provide very affordable computing on EC2 by allowing access to unused capacity at significant discounts.

Building a Local OSRM Instance

The Open Source Routing Machine (OSRM) is a library for calculating routes, distances and travel times between spatial locations. It can be accessed via either an HTTP or C++ API. Since it’s open source you can also install it locally, download appropriate map data and start making efficient travel calculations.

These are the instructions for getting OSRM installed on an Ubuntu machine and hooking up the osrm R package.

Global Variables in R Packages

I know that global variables are from the Devil, but sometimes you just can’t get around them.

I’m building a small package for a client that relies on a data file. For various reasons that file is not part of the package and can reside in different locations on users’ machines. Furthermore there are users on both Windows and Linux machines.

Driving AWS from the Command Line

Although it’s very handy (and easy) to set up some cloud resources using the AWS Management Console, once you know what you need it makes a lot of sense to automate the process. Fortunately there’s a handy little command line tool, aws, which makes this eminently possible. The AWS CLI Command Reference is the definitive resource for this tool. There’s a mind-boggling array of possibilities. We’ll take a look at a small selection of them.

Route Asymmetry in Google Maps

I have been retrieving some route information using Rodrigo Azuero’s gmapsdistance package and noted that there was some asymmetry in the results: the time and distance for the trip from A to B were not always the same as the time and distance for the trip from B to A. Although in retrospect this seems self-evident, it merited further investigation.
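
To make the idea of asymmetry concrete, here is a minimal sketch of how it could be quantified. It does not call Google Maps or gmapsdistance: the travel times are invented purely for illustration, and the `asymmetry` helper is a hypothetical name of my own.

```python
# Hypothetical travel times in seconds, keyed by (origin, destination).
# These numbers are made up for illustration; they are not real results.
durations = {
    ("A", "B"): 1260,
    ("B", "A"): 1380,
}


def asymmetry(times, origin, destination):
    """Relative difference between the outbound and return legs.

    Zero means the two directions take equally long; a negative value
    means the outbound leg is the quicker of the two.
    """
    forward = times[(origin, destination)]
    reverse = times[(destination, origin)]
    return (forward - reverse) / ((forward + reverse) / 2)


print(f"A to B versus B to A: {asymmetry(durations, 'A', 'B'):+.1%}")
```

Dividing by the mean of the two legs keeps the measure symmetric, so swapping origin and destination only flips the sign.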

Retrieving Kaggle Data from the Command Line

We’ve been building some models for Kaggle competitions using an EC2 instance for compute. I initially downloaded the data locally and then pushed it onto EC2 using SCP. But there had to be a more efficient way to do this, especially given the blazing fast bandwidth available on AWS.

Enter kaggle-cli.
