We’ve been building some models for Kaggle competitions using an EC2 instance for compute. I initially downloaded the data locally and then pushed it onto EC2 using SCP. But there had to be a more efficient way to do this, especially given the blazing fast bandwidth available on AWS.
This will expose the kg shell command. You can use it interactively or via an selection of command line arguments.
We’d use the download command to get the data for a particular competition.
The minimum requirements for this to work are a username, password and competition identifier. You get the latter by simply visiting the competition page on Kaggle and grabbing the last part of the URL.