Driving AWS from the Command Line
Although it’s very handy (and easy) to set up cloud resources using the AWS Management Console, once you know what you need it makes a lot of sense to automate the process. Fortunately there’s a handy little command line tool, aws, which makes this eminently possible. The AWS CLI Command Reference is the definitive resource for this tool. There’s a mind-boggling array of possibilities. We’ll take a look at a small selection of them.
The aws tool is a Python script. Installation is simple: just follow the detailed documentation.
Specify your AWS Access Key ID and Secret Access Key.
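Running aws configure prompts for these values interactively. For a scripted setup, a non-interactive sketch might look like the one below. The function name store_aws_credentials is hypothetical, and passing secrets as arguments is done here only for brevity, since they would end up in your shell history.

```shell
# Hypothetical helper: write credentials to the default profile without prompts.
# Usage: store_aws_credentials <access-key-id> <secret-access-key>
store_aws_credentials() {
    aws configure set aws_access_key_id "$1"
    aws configure set aws_secret_access_key "$2"
}
```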
You’ll need an SSH key to connect to your remote resources.
The result from aws ec2 create-key-pair is a JSON document, from which we extract the value for KeyMaterial using a command-line JSON processor.
Apply restrictive access permissions to the resulting PEM file.
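The two steps above can be sketched as a single function. This assumes that jq is the JSON processor in question and that it is installed. The function name create_key_pair is made up for illustration. If you would rather not depend on jq, the --query KeyMaterial and --output text options of the aws tool should give the same result.

```shell
# Hypothetical helper: create a key pair and save the private key locally.
# Usage: create_key_pair <key-name>   (writes <key-name>.pem to the current folder)
create_key_pair() {
    local key_name="$1"
    # Pull the private key out of the JSON response; -r prints it without quotes.
    aws ec2 create-key-pair --key-name "$key_name" \
        | jq -r '.KeyMaterial' > "$key_name.pem"
    # Owner read-only: ssh refuses private keys that other users can read.
    chmod 400 "$key_name.pem"
}
```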
Create a Security Group. This determines which network traffic is allowed to reach your resources.
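A minimal sketch of this step, wrapped in a function so it can be reused. The function name and the description text in the usage note are assumptions. The group name general is the one referred to later in this post.

```shell
# Hypothetical helper: create a security group in the default VPC.
# Usage: create_security_group <name> <description>
create_security_group() {
    aws ec2 create-security-group \
        --group-name "$1" \
        --description "$2"
}
```

For example: create_security_group general "General access".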
Add some rules for inbound connections. Here we allow ports 22 (SSH), 80 (HTTP) and 443 (HTTPS).
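One way to add those rules is to loop over the ports. In this sketch the function name open_ports is hypothetical, and the source range 0.0.0.0/0 (anywhere) is an assumption. You may well want something narrower, especially for SSH.

```shell
# Hypothetical helper: allow inbound TCP on each listed port from any address.
# Usage: open_ports <group-name> <port> [<port> ...]
open_ports() {
    local group="$1"
    shift
    local port
    for port in "$@"; do
        aws ec2 authorize-security-group-ingress \
            --group-name "$group" \
            --protocol tcp \
            --port "$port" \
            --cidr 0.0.0.0/0
    done
}
```

With the group from the previous step this would be called as open_ports general 22 80 443.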
Then check if everything is configured as expected. In light of the volume and complexity of the output of this command you might find it expedient to simply use the AWS Management Console.
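If you do want to stay on the command line, the --query option can cut the output down to the essentials. The sketch below (function name and choice of columns are my own) prints one row per rule with protocol, port range and first source range.

```shell
# Hypothetical helper: summarise the inbound rules of a security group.
# Usage: show_security_group <group-name>
show_security_group() {
    aws ec2 describe-security-groups \
        --group-names "$1" \
        --query 'SecurityGroups[].IpPermissions[].[IpProtocol,FromPort,ToPort,IpRanges[0].CidrIp]' \
        --output table
}
```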
Since we’ll be running aws from the shell it’ll make our lives easier if we first set up a few environment variables.
Specify the region in which the resources are going to be deployed.
The name assigned to your SSH key.
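As a sketch, the two settings might look like this. AWS_DEFAULT_REGION is a variable that the aws tool reads itself. AWS_SSH_KEY is a name chosen here purely for illustration, and both values are placeholders.

```shell
# Region used by every subsequent aws command (placeholder value).
export AWS_DEFAULT_REGION=us-east-1
# Name of the SSH key pair; the variable name and value are illustrative only.
export AWS_SSH_KEY=my-aws-key
```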
Elastic Compute Cloud (EC2)
Launch an EC2 instance using aws ec2 run-instances. You can find an appropriate image ID in Step 1 of the EC2 Launch Instance wizard.
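A minimal sketch of the launch, assuming a single t2.micro instance (the instance type is my assumption, so pick whatever suits you). The function name launch_instance is hypothetical. The image ID, key name and security group name are passed in rather than hard-coded.

```shell
# Hypothetical helper: launch one instance and print its instance ID.
# Usage: launch_instance <image-id> <key-name> <security-group-name>
launch_instance() {
    aws ec2 run-instances \
        --image-id "$1" \
        --instance-type t2.micro \
        --key-name "$2" \
        --security-groups "$3" \
        --count 1 \
        --query 'Instances[0].InstanceId' \
        --output text
}
```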
Provided that the above command executed without error, you should have a running EC2 instance. Check out the Instances tab on the AWS Management Console. You can now connect to the remote instance using SSH.
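The connection step can also be scripted. The sketch below looks up the public DNS name of an instance and then opens an SSH session. The function name is hypothetical, and the default login name ec2-user is an assumption which holds for Amazon Linux images (Ubuntu images use ubuntu, for example).

```shell
# Hypothetical helper: SSH into an instance given its ID and private key file.
# Usage: connect_to_instance <instance-id> <key-file> [<user>]
connect_to_instance() {
    local instance_id="$1" key_file="$2" user="${3:-ec2-user}"
    local host
    # Ask EC2 for the public DNS name of the instance.
    host=$(aws ec2 describe-instances \
        --instance-ids "$instance_id" \
        --query 'Reservations[0].Instances[0].PublicDnsName' \
        --output text)
    ssh -i "$key_file" "$user@$host"
}
```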
Elastic MapReduce (EMR)
There’s a wide variety of clusters that can be deployed using EMR. We’ll put together a small Spark cluster.
First we’ll need to create two new Security Groups:
spark-master has the same permissions as general but also allows inbound TCP connections on port 8001;
spark-slave has no inbound permissions.
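A sketch of how these two groups could be created. It assumes that the permissions of general are the three ports opened earlier (22, 80 and 443) and that connections are accepted from anywhere. The function name and the description strings are my own.

```shell
# Hypothetical helper: create the spark-master and spark-slave security groups.
create_spark_groups() {
    local port
    aws ec2 create-security-group \
        --group-name spark-master \
        --description "Spark master node"
    # Same ports as general (22, 80, 443) plus 8001 for JupyterHub.
    for port in 22 80 443 8001; do
        aws ec2 authorize-security-group-ingress \
            --group-name spark-master \
            --protocol tcp \
            --port "$port" \
            --cidr 0.0.0.0/0
    done
    # No ingress rules are added, so nothing inbound is allowed.
    aws ec2 create-security-group \
        --group-name spark-slave \
        --description "Spark worker nodes"
}
```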
Then run the script below, which will create a cluster with four nodes (one master and three workers). The nodes are provisioned with Spark and a few other pertinent applications. A bootstrap script also sets up JupyterHub.
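What follows is a sketch of such a script rather than a definitive version. The release label, instance type and application list are assumptions, and the function name is hypothetical. The location of the bootstrap script is passed in as an argument because it depends on which script you use. The two security groups are given by ID (sg-...), since that is what the EMR options expect as far as I can tell. The --use-default-roles option relies on the default EMR roles already existing in your account.

```shell
# Hypothetical helper: start a four-node Spark cluster (one master, three workers).
# Usage: create_spark_cluster <key-name> <master-sg-id> <worker-sg-id> \
#                             <bootstrap-script-s3-url> <jupyterhub-password>
create_spark_cluster() {
    local key_name="$1" master_sg="$2" worker_sg="$3" bootstrap="$4" password="$5"
    aws emr create-cluster \
        --name "Spark Cluster" \
        --release-label emr-5.30.0 \
        --applications Name=Hadoop Name=Spark \
        --instance-type m4.large \
        --instance-count 4 \
        --ec2-attributes "KeyName=$key_name,AdditionalMasterSecurityGroups=$master_sg,AdditionalSlaveSecurityGroups=$worker_sg" \
        --use-default-roles \
        --bootstrap-actions "Path=$bootstrap,Args=[--password,$password]"
}
```

Further bootstrap parameters can be appended to the Args list, separated by commas.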
The --password parameter sets the JupyterHub password for the
There are a host of other parameters which can be passed to the bootstrap script. Of particular interest are:
--r: install a kernel for R;
--julia: install a kernel for Julia;
--ruby: install a kernel for Ruby;
--ml-packages: install Python Machine Learning and Deep Learning packages;
--python-packages: install arbitrary named Python packages;
--port: port for Jupyter notebook (defaults to 8888);
--password: password for Jupyter Notebook.
It might take a while to bring up the cluster. The bootstrap process appears to be somewhat time consuming. However, if you’re patient then in good time (an hour or so!) you’ll have a fully provisioned Spark cluster with JupyterHub running on it.
The JupyterHub interface will be available on port 8001 on the master node. Find out more about this setup here.
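Rather than refreshing the console while you wait, you can have the shell block until the cluster is up and then print the address to visit. A minimal sketch, with a made-up function name, which takes the cluster ID returned when the cluster was created:

```shell
# Hypothetical helper: wait for the cluster, then print host:port for JupyterHub.
# Usage: jupyterhub_address <cluster-id>
jupyterhub_address() {
    local cluster_id="$1"
    local host
    # Blocks until the cluster has finished starting up (or fails).
    aws emr wait cluster-running --cluster-id "$cluster_id"
    host=$(aws emr describe-cluster \
        --cluster-id "$cluster_id" \
        --query 'Cluster.MasterPublicDnsName' \
        --output text)
    echo "$host:8001"
}
```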
Simple Storage Service (S3)
S3 provides storage space which can be readily accessed from other resources on AWS.
Creating an S3 Bucket
Storage on S3 is divided into containers called “buckets”. Creating a bucket is simple with aws s3 mb.
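As a sketch (the function name is hypothetical), note that bucket names are shared across all AWS users, so the name you pick has to be globally unique.

```shell
# Hypothetical helper: create a bucket in the current region.
# Usage: make_bucket <bucket-name>
make_bucket() {
    aws s3 mb "s3://$1"
}
```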
Copying Files to a Bucket
Local files can be copied across to an S3 bucket using aws s3 cp. You can restrict access to a file using the
aws s3 mv and aws s3 rm are analogous to their UNIX equivalents, moving and deleting files on S3.
aws s3 sync is used to synchronise the contents of folders, either local or on S3.
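To show the four commands side by side, here is a small walk-through against a single bucket. The function name, the archive/ prefix and the arguments are all invented for the example.

```shell
# Hypothetical walk-through of cp, mv, rm and sync.
# Usage: s3_file_ops <bucket> <local-file> <local-dir>
s3_file_ops() {
    local bucket="$1" file="$2" dir="$3"
    local name
    name=$(basename "$file")

    # Upload a local file to the top level of the bucket.
    aws s3 cp "$file" "s3://$bucket/$name"
    # Move it to another location within the bucket.
    aws s3 mv "s3://$bucket/$name" "s3://$bucket/archive/$name"
    # Delete the moved copy.
    aws s3 rm "s3://$bucket/archive/$name"
    # Mirror a local folder into the bucket; only new or changed files are sent.
    aws s3 sync "$dir" "s3://$bucket/$(basename "$dir")"
}
```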
Listing Buckets and Their Contents
You can get a list of available buckets using aws s3 ls.
If you provide the URL for a particular bucket then you can also see its contents.
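Both forms can be folded into one small function (the name list_s3 is hypothetical): with no argument it lists your buckets, with a bucket name it lists what is inside.

```shell
# Hypothetical helper: list all buckets, or the contents of one bucket.
# Usage: list_s3 [<bucket-name>]
list_s3() {
    if [ "$#" -eq 0 ]; then
        aws s3 ls
    else
        aws s3 ls "s3://$1"
    fi
}
```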
Destroying an S3 Bucket
When you’re done with your bucket you can destroy it with aws s3 rb. The --force argument is required to delete a bucket which still contains files.
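A sketch with a hypothetical function name. Be careful with this one, since the files in the bucket are deleted along with it.

```shell
# Hypothetical helper: delete a bucket together with any files it contains.
# Usage: remove_bucket <bucket-name>
remove_bucket() {
    aws s3 rb "s3://$1" --force
}
```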
S3 and Static Web Sites
You can use the aws s3 website command to turn the contents of an S3 bucket into a static web site.
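A minimal sketch, assuming that the bucket already contains files called index.html and error.html (both file names and the function name are my own choices). The files also need to be publicly readable before visitors can see them.

```shell
# Hypothetical helper: enable static web site hosting on a bucket.
# Usage: enable_website <bucket-name>
enable_website() {
    aws s3 website "s3://$1/" \
        --index-document index.html \
        --error-document error.html
}
```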