Today’s post is a mashup of various things relating to networking with Julia. We’ll have a look at FTP transfers, HTTP requests and using the Twitter API.
Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it.
Linus Torvalds (1996)
Back in the mid-90s Linus Torvalds was a big fan of FTP. I suspect that his sentiments have not changed, although now he’d probably modify that statement with 's/upload/push/;s/ftp/github/'. He might have made it more gender neutral too, but it’s hard to be sure.
FTP seems a little “old school”, but if you grew up in the 1980s, before scp and sftp came along, then you’ll probably feel (like me) that FTP is an intrinsic part of the internet experience. There are still a lot of anonymous FTP sites in operation. You can find a list here, although it appears to have last been updated in 2003, so some of that information might no longer be valid. We’ll use ftp://speedtest.tele2.net/ for illustrative purposes since it also allows uploads.
First we initiate a connection to the FTP server.
Grab a list of files available for download.
This site (as its name would imply) has the sole purpose of conducting speed tests. So the content of those files is not too interesting. But that’s not going to stop me from downloading one.
Generally anonymous FTP sites do not allow uploads, but this site is an exception. We’ll test that out too.
Close the connection when you’re done.
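The five steps above (connect, list, download, upload, close) can be sketched as a single function. This is a rough sketch rather than the exact session from this post: it assumes the FTPClient package and its `FTP(hostname = ...)` constructor (keyword names have changed between releases, so check your installed version). The file name `1KB.zip` and the `upload` directory are also assumptions about what the server currently offers.

```julia
using FTPClient

# Run one anonymous FTP session: list, download, upload, then close.
# `remote_file` and `upload_dir` are assumed names, not taken from the post.
function ftp_demo(; hostname = "speedtest.tele2.net",
                    remote_file = "1KB.zip",
                    upload_dir = "upload")
    ftp = FTP(hostname = hostname)          # no credentials, so an anonymous login
    try
        listing = readdir(ftp)              # names in the server's current directory

        # Download into a fresh temporary directory.
        local_path = joinpath(mktempdir(), remote_file)
        download(ftp, remote_file, local_path)

        # Move to the writable directory and send the same file back.
        cd(ftp, upload_dir)
        upload(ftp, local_path, "julia-test-" * remote_file)

        return listing
    finally
        close(ftp)                          # always release the connection
    end
end
```

Calling `ftp_demo()` runs the whole session and returns the directory listing. Wrapping the work in `try ... finally` means the connection is closed even if one of the transfers fails.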
Okay, I’m over the historical reminiscences now. Onto something more current.
There are a few Julia packages implementing HTTP methods. We’ll focus on the Requests package. The package homepage makes use of http://httpbin.org/ to illustrate the various bits of functionality. This is a good choice since it allows essentially all of the functionality in Requests to be exercised. We’ll take a different approach and apply a subset of the functionality to a couple of more realistic scenarios. Specifically we’ll look at GET and POST requests.
First we’ll use a GET request to retrieve information from Google Books using ISBN to specify a particular book. The get() call below is equivalent to opening this URL in your browser.
We check that everything went well with the request: the status code of 200 indicates that it was successful. The response headers provide some additional metadata.
The actual content is found in the JSON payload which is stored as an array of unsigned bytes in the data field. We can have a look at the text content of the payload using Requests.text(), but accessing fields in these data is done via Requests.json(). Finding the data you’re actually looking for in the resulting data structure may take a bit of trial and error.
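Here is a minimal sketch of that GET workflow: send the query, check the status, then dig into the JSON. It is not the Requests code this post describes. It assumes the HTTP and JSON packages instead (Requests has since been deprecated in favour of HTTP). The function names are my own, and the `items`, `volumeInfo` and `title` keys are what I expect the Google Books volumes endpoint to return.

```julia
using HTTP, JSON

# Send the GET request; the `query` dictionary becomes ?q=isbn:<isbn> on the URL.
function fetch_book(isbn::AbstractString)
    return HTTP.get("https://www.googleapis.com/books/v1/volumes";
                    query = Dict("q" => "isbn:$isbn"))
end

# Pull the titles out of a JSON payload (given as text).
# A payload with no "items" key yields an empty list.
function book_titles(payload::AbstractString)
    data = JSON.parse(payload)
    items = get(data, "items", [])
    return [item["volumeInfo"]["title"] for item in items]
end

# Tie the two together: status check first, then the payload.
function lookup_titles(isbn::AbstractString)
    r = fetch_book(isbn)
    r.status == 200 || error("unexpected status $(r.status)")
    return book_titles(String(r.body))   # r.body is a vector of bytes
end
```

Something like `lookup_titles("<your ISBN>")` should return the matching titles. Keeping the parsing in its own function makes the trial and error easier, since you can poke at a saved payload without hitting the network each time.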
If the payload is not JSON then we process the data differently. For example, after using get() to download CSV content from Quandl you’d simply use readtable() from the DataFrames package to produce a data frame.
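As a rough sketch of the CSV case, the snippet below turns the raw bytes of a response into a data frame. It assumes the CSV package alongside DataFrames, since that is how current DataFrames releases read delimited text. The sample bytes are made-up placeholder values standing in for a real download, not actual Quandl data.

```julia
using CSV, DataFrames

# Turn the raw bytes of a CSV response body into a DataFrame.
csv_to_dataframe(body::Vector{UInt8}) = CSV.read(IOBuffer(body), DataFrame)

# Placeholder bytes standing in for a downloaded payload (invented values).
sample = Vector{UInt8}("Date,Value\n2015-01-01,1.5\n2015-01-02,2.5\n")

df = csv_to_dataframe(sample)
```

With a real response you would pass the body bytes of the response in place of `sample`.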
Of course, as we saw on Day 15, if you’re going to access data from Quandl it would make more sense to use the Quandl package.
Those two queries above were submitted using GET. What about POST? We’ll directly access the Twitter public API to see how many times the URL http://julialang.org/ has been included in a tweet.
The JSON payload has an element named count, which indicates that to date that URL has been included in 2639 distinct tweets.
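To show the general shape of a POST with a JSON body, here is a small sketch. Rather than guess at the Twitter endpoint, it is meant to be pointed at httpbin.org, which simply echoes back what it receives. It assumes the HTTP and JSON packages, and `post_json` is a name I made up for illustration.

```julia
using HTTP, JSON

# POST `payload` as a JSON document and return the parsed JSON reply.
function post_json(url::AbstractString, payload)
    headers = ["Content-Type" => "application/json"]
    r = HTTP.post(url, headers, JSON.json(payload))
    return JSON.parse(String(r.body))
end
```

For example, `post_json("https://httpbin.org/post", Dict("url" => "http://julialang.org/"))` should come back with a dictionary whose `"json"` entry mirrors the payload that was sent.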
We’ve just seen how to directly access the Twitter API using a POST request. We also know that there is a Quandl package which provides a wrapper around the Quandl API. Not too surprisingly there’s also a wrapper for the Twitter API in the Twitter package. This package greatly simplifies interacting with the Twitter API. No doubt wrappers for other services will follow.
First you need to load the package and authenticate yourself. I’ve got my keys and secrets stored in environment variables, which I retrieve from the global ENV dictionary.
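Reading the credentials needs nothing beyond base Julia. The sketch below collects four values and fails early with a readable message if any are missing. The variable names are hypothetical, so substitute whatever names you exported in your shell.

```julia
# Hypothetical variable names; replace with the ones you actually use.
const CREDENTIAL_VARS = ["TWITTER_CONSUMER_KEY", "TWITTER_CONSUMER_SECRET",
                         "TWITTER_ACCESS_TOKEN", "TWITTER_ACCESS_SECRET"]

# Look up each wanted name in `env` (ENV by default) and return the values
# in the same order. Raises an error naming every variable that is not set.
function load_credentials(env = ENV, wanted = CREDENTIAL_VARS)
    absent = [name for name in wanted if !haskey(env, name)]
    isempty(absent) || error("missing environment variables: " * join(absent, ", "))
    return [env[name] for name in wanted]
end
```

The returned strings can then be handed to the Twitter package’s authentication call. Taking `env` as an argument means the function can be tried out with an ordinary `Dict` before you touch your real keys.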
I’ll take this opportunity to pander to my own vanity, looking at which of my tweets have been retweeted. To make sense of the results, convert them to a DataFrame.
You can have a lot of fun playing around with the features in the Twitter API. Trust me.
The HttpServer package provides low-level functionality for implementing an HTTP server in Julia. The Mux package implements a higher level of abstraction. There are undoubtedly easier ways of serving your HTTP content, but being able to do it from the ground up in Julia is cool if nothing else! Case in point: Sudoku-as-a-Service is hosted using the HttpServer package. The code is available on the project page and serves as an excellent illustration of why you might want to use Julia to serve your content directly.
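To give a flavour of what serving content from Julia looks like, here is a tiny sketch. It uses the HTTP package rather than HttpServer or Mux, so treat it as an illustration of the idea and not of either package’s API. The handler, route and port are all my own choices.

```julia
using HTTP

# Answer GET / with a greeting and everything else with a 404.
function handle(req::HTTP.Request)
    if req.method == "GET" && req.target == "/"
        return HTTP.Response(200, "Hello from Julia\n")
    end
    return HTTP.Response(404, "not found\n")
end

# Start listening on localhost. This call blocks until interrupted.
run_server(port = 8080) = HTTP.serve(handle, "127.0.0.1", port)
```

After calling `run_server()`, opening http://127.0.0.1:8080/ in a browser should show the greeting. Because the handler is a plain function from request to response, it can be exercised directly without starting the server.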
That’s it for today. I realise that I have already broken through the “month” boundary. I still have a few more topics that I want to cover. It might end up being something more like “A Month and a Week of Julia”.