Amazon Web Services

Welcome

Thus far, the software we have created has run locally. Such an approach is fine for testing, but at some point, it's good to have a reliable and dedicated host for your applications. Amazon Web Services (AWS) is probably the most popular option. AWS has many services, but for our purposes we only need one: the "Elastic Compute Cloud (EC2)". In effect, an EC2 instance is just a virtual machine in the cloud that you can configure however you want.

What's nice about EC2 (and all of AWS) is that you can scale the resources behind your application up or down at any time. It's just a matter of requesting (and paying for) more resources.

In this week's tutorial, we'll get familiar with AWS/EC2 by deploying a node.js web application to the cloud. We'll use MongoDB again, and we'll add an intermediate caching layer to reduce load on the MongoDB instance.

Step 1 : Getting an Instance

To sign up for AWS, visit this page and click on "try AWS for free". You'll need to provide a phone number and a credit card, and Amazon will call you to verify your account creation.

It takes a little while to activate your account, so you might want to skip forward to other parts of this tutorial while you are waiting.

Once your account is activated, you can launch the management console to create an instance. You should go to the EC2 dashboard and launch an instance. I picked Ubuntu 14.04, and then selected the t2.micro option, which is the only free option. Note that you shouldn't just click "Review and Launch", since you'll want to have a chance to edit the details of the configuration. Instead, select "Configure Instance Details".

It turns out that we don't need to worry about instance, storage, or tag configuration yet. But we should add ssh support. To do this, go to the Security Group page, which already has an SSH rule (TCP port 22) waiting for you to add. Add it and you're ready to "Review and Launch".

You should get a message about the security of your instance being low (because each of the open ports is accessible from anywhere in the world). We won't worry about that yet, but it is something you should revisit at some point. Go ahead and click "Launch".

You'll be prompted to create a new key pair. This is going to be an incredibly important file, and you can only download it once. I named mine "398awstutorialkeypair". Once you've downloaded your key pair, click "Launch Instances". Then go to "View Instances" and you should see your instance listed, with a status of "Initializing".

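To ensure that everything is good, you'll need to ssh into your new virtual machine. Make note of your "Public DNS". It's the address to use when you connect, with a command of the form ssh -i 398awstutorialkeypair.pem ubuntu@your-public-dns (ubuntu is the default user on Ubuntu images, and you should run chmod 400 on the key file first, or ssh will refuse to use it).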

You should be able to log in and reach a Linux prompt. If so, you're all set with a virtual machine that is running in the cloud, accessible from everywhere in the world, and ready for us to use to build a new application.

Step 2 : A Quick Node App

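Our new instance doesn't have much code running on it. Let's go ahead and install node.js and npm (on Ubuntu 14.04, sudo apt-get install nodejs npm is one way to do so), along with the express command-line tool.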

Now run express -t ejs myapp to create a new node application that uses express and ejs. Then go into the myapp directory and run npm install followed by nodejs app.js.

When you try to visit your node application, it won't be visible from your development machine. If you log onto your AWS instance, you can see the page from there... but there is no firewall entry for your application, so you can't get to it from the outside world. You'll need to fix that from the "Security Groups" section of your EC2 Management Console.

Note that this takes two steps. First, you go to "Network & Security", choose "Security Groups", and create a group that opens access to the port your web server listens on (3000 by default for an app generated by express). Then, go back to "Instances", pick your instance, go to "Actions"/"Networking"/"Change Security Groups", and "Assign" that group to your instance.

With that done, you should be all set. Go ahead and try it. You should be able to start your node app, go to the appropriate address:port, and see the default "Welcome to Express" page.

Step 3 : Adding a Database

Part of what's so great about AWS is that communication between services inside the Amazon ecosystem is extremely fast. This is important: if we want our application to be able to scale out to lots of machines, we can't centralize everything in one place, so the pieces must be able to talk to each other quickly. So let's go ahead and start up a new machine to run our MongoDB instance.

"Wait", you may be thinking, "I only have one free instance". Well, sure, that's true, but there are free MongoDB providers out there who will host a Mongo instance for you in AWS. Let's try it out.

First, go to mongolab.com and create an account. Then it's time to create a database. Create a new "MongoDB Deployment", and pick AWS as your provider. For your plan, choose "Single-node" and "Sandbox", so that you don't have to pay for anything. Name the database "aws-test" and hit create.

At this point, you should have a deployment, but it isn't exactly usable yet... in particular, you need to have a user and a collection. Make a user "aws", and give it a good password. Then create a collection called awscollection.

To make sure that everything works, ssh into your AWS instance and add a mongo client (on Ubuntu 14.04, sudo apt-get install mongodb-clients provides the mongo shell).

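The mongolab console will have instructions on how to connect to your instance from the shell. It's going to be something like mongo ds00000.mongolab.com:61000/aws-test -u aws -p your_password. Connect, and then let's create some data, for example with db.awscollection.insert({name: "first"}) followed by db.awscollection.find() to confirm that the document was stored.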

In addition to the command line, you should be able to explore your database from the mongolab web interface.

Step 4 : Connecting Node and Mongo

It's time to connect our web server to MongoDB. First of all, run npm install mongodb to get a simple Mongo driver. Then it's time to build a server. Let's not stand up a whole express app to do this, though. Instead, we can do a quick hack to show that everything works. You'll need a sane development environment (sudo apt-get install emacs24-nox is probably sufficient for now; you can add java and gcc later), then enter this code, making note of the places where your details will differ from mine.
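One possible shape for such a server is sketched below. It is a minimal sketch rather than the exact listing: the connection URL is assembled from the Step 3 example values, port 8080 is an assumption, and fetchDocsFromMongo assumes the callback-style API of the 2.x mongodb driver (newer driver versions differ). The names makeHandler, fetchDocsFromMongo, and startServer are hypothetical.

```javascript
const http = require('http');

// Assumption: connection details mirror the mongo shell command from Step 3.
// Replace the host, port, and password with your own.
const MONGO_URL = 'mongodb://aws:your_password@ds00000.mongolab.com:61000/aws-test';

// Fetch every document in awscollection. Assumes the callback-style API of
// the 2.x mongodb driver; newer driver versions differ.
function fetchDocsFromMongo(callback) {
  const MongoClient = require('mongodb').MongoClient; // loaded only when called
  MongoClient.connect(MONGO_URL, function (err, db) {
    if (err) return callback(err);
    db.collection('awscollection').find({}).toArray(function (err, docs) {
      db.close();
      callback(err, docs);
    });
  });
}

// Build a request handler around any fetchDocs(callback) function, so the
// HTTP logic can be exercised without a live database.
function makeHandler(fetchDocs) {
  return function (req, res) {
    fetchDocs(function (err, docs) {
      if (err) {
        res.writeHead(500, { 'Content-Type': 'text/plain' });
        return res.end('Database error: ' + err.message);
      }
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(docs, null, 2));
    });
  };
}

// Start serving real data. The port is an assumption; use the one that your
// security group opens.
function startServer(port) {
  return http.createServer(makeHandler(fetchDocsFromMongo)).listen(port);
}

// startServer(8080);
```

Keeping the database call behind a plain fetchDocs(callback) function means the handler can be tried out with canned data before the Mongo connection details are right.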

When you visit your site, you should see a list of objects. Sure, that's not particularly exciting, but it's a start!

Step 5 : Adding Memcached

In real web applications, performance matters. Databases are typically a bottleneck, and a common solution is to introduce a lookaside cache. The cache is completely managed by the application, and serves as a way to avoid the expense of going to the database.
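The lookaside idea can be shown in a few lines. This is only an illustrative sketch: a Map stands in for the cache, and loadFromDb and fakeDb are hypothetical stand-ins for a real database query.

```javascript
// Lookaside read: consult the cache first, and only on a miss go to the
// database, then store the result so that the next read is a hit.
// `cache` is a Map here; `loadFromDb` is a hypothetical stand-in for the
// real database query.
function lookasideGet(cache, key, loadFromDb) {
  if (cache.has(key)) {
    return { value: cache.get(key), source: 'cache' };
  }
  const value = loadFromDb(key);
  if (value !== undefined) {
    cache.set(key, value); // the application, not the database, fills the cache
  }
  return { value: value, source: 'database' };
}

// Usage with in-memory stand-ins for both tiers.
const fakeDb = { 42: { name: 'example' } };
let dbReads = 0;
const loadFromDb = function (key) { dbReads += 1; return fakeDb[key]; };
const cache = new Map();

const first = lookasideGet(cache, '42', loadFromDb);
const second = lookasideGet(cache, '42', loadFromDb);
console.log(first.source, second.source, dbReads); // prints "database cache 1"
```

Only the first read reaches the database; the second is served from the cache, which is the saving the pattern is meant to deliver.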

In this step, we are going to use memcached. It's one of the most popular in-memory caches. We aren't going to actually do any performance measuring (that's a next step), but we'll trust that this speeds things up.

To make the task more realistic, we're going to create a second machine instance. Be careful: Amazon only gives 750 free hours per month for the first year. That's one always-on machine. If we're going to have a second instance running, then to stay within budget, you'll want to shut all instances off for some of the time.
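As a quick check of that budget, the arithmetic can be written out. This sketch assumes that the 750 hours are shared across all running instances and that every instance runs on the same schedule; hoursPerDay is a hypothetical helper.

```javascript
// Free-tier budget from the text: 750 instance-hours per month, shared by
// all running instances.
const FREE_HOURS_PER_MONTH = 750;

// Hours per day that each instance may run if all follow the same schedule.
function hoursPerDay(instanceCount, daysInMonth) {
  return FREE_HOURS_PER_MONTH / instanceCount / daysInMonth;
}

console.log(24 * 31);                       // 744: one always-on instance fits
console.log(hoursPerDay(2, 31).toFixed(1)); // 12.1 hours per day for two instances
```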

With that said, go ahead and create a new Ubuntu instance. You can use the same keyfile as for your previous instance. Once this instance is up and running, create a new security group for memcached, that has custom TCP and custom UDP rules that open up port 11211. When you make these rules, do not make them open to the world. Instead, for the source, give the security group to which your previous instance belongs.

Next, you should install memcached onto your new instance (sudo apt-get install memcached). Then you will need to edit /etc/memcached.conf manually. To do this correctly, first use ifconfig to figure out the instance's private IP address. Then edit the memcached.conf file, and in the line that starts with "-l" (that's a lowercase letter L, not the number one), replace 127.0.0.1 with that IP. Finally, save and close the file and type sudo service memcached restart to restart your server.

To ensure that your memcached is working correctly, and the security is all good, make note of the private IP address of your memcached server. Then, log into your web server and type telnet 172.31.21.60 11211, substituting your memcached server's address for 172.31.21.60. If you can connect, things are looking good. Then type stats. If you get a few screens of information, then you're really good. When you're done, type quit to exit telnet.
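The same check can be scripted from node. The sketch below sends the stats command over a plain TCP socket and parses the reply, which consists of "STAT name value" lines followed by END. The helper names parseStats and fetchStats are hypothetical, and the address in the example is the one from the text.

```javascript
const net = require('net');

// Parse the reply to memcached's "stats" command: lines of the form
// "STAT <name> <value>", terminated by a line containing "END".
function parseStats(text) {
  const stats = {};
  text.split('\r\n').forEach(function (line) {
    const parts = line.split(' ');
    if (parts[0] === 'STAT' && parts.length >= 3) {
      stats[parts[1]] = parts.slice(2).join(' ');
    }
  });
  return stats;
}

// Connect, send "stats", and hand the parsed reply to the callback.
function fetchStats(host, port, callback) {
  let buffer = '';
  let done = false;
  const socket = net.connect({ host: host, port: port }, function () {
    socket.write('stats\r\n');
  });
  socket.setEncoding('utf8');
  socket.on('data', function (chunk) {
    buffer += chunk;
    if (!done && buffer.endsWith('END\r\n')) {
      done = true;
      socket.end('quit\r\n');
      callback(null, parseStats(buffer));
    }
  });
  socket.on('error', function (err) {
    if (!done) {
      done = true;
      callback(err);
    }
  });
}

// Example (substitute your memcached server's IP):
// fetchStats('172.31.21.60', 11211, function (err, stats) {
//   console.log(err || stats.cmd_get + ' gets, ' + stats.cmd_set + ' sets');
// });
```

A connection error here usually points at the security group or at the -l line in memcached.conf rather than at memcached itself.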

Now it's time to enhance our web application, so that it uses memcached before going to mongo. How it does this will be completely application-dependent, and this is just a demo, so we're not going to get too crazy. Start by installing the 'mc' package through npm. Then let's write some code:
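One way to structure that code is sketched below. To keep the sketch runnable without any servers, it does not call the mc package or the mongodb driver: makeFakeCache is a hypothetical in-process stand-in for the memcached client, findById stands in for the Mongo query, and the one-minute expiry is an assumption.

```javascript
const TTL_MS = 60 * 1000; // assumption: cached entries live for one minute

// Hypothetical in-process stand-in for a memcached client. It offers
// callback-style get/set with expiry; `now` is injectable for testing.
function makeFakeCache(now) {
  const entries = new Map();
  return {
    get: function (key, cb) {
      const entry = entries.get(key);
      if (entry && entry.expires > now()) return cb(null, entry.value);
      entries.delete(key); // drop expired entries
      cb(null, undefined);
    },
    set: function (key, value, ttlMs, cb) {
      entries.set(key, { value: value, expires: now() + ttlMs });
      cb(null);
    }
  };
}

// Look up one object by id: try the cache, fall back to the database on a
// miss, then repopulate the cache. `findById(id, cb)` stands in for Mongo.
function lookup(id, cache, findById, cb) {
  cache.get(id, function (err, cached) {
    if (!err && cached !== undefined) {
      console.log('cache hit for ' + id);
      return cb(null, cached);
    }
    console.log('cache miss for ' + id + ', querying database');
    findById(id, function (err, doc) {
      if (err || !doc) return cb(err, doc);
      cache.set(id, doc, TTL_MS, function () { cb(null, doc); });
    });
  });
}

// Extract the id from a request URL such as "/?id=abc123".
function idFromUrl(requestUrl) {
  return new URL(requestUrl, 'http://localhost').searchParams.get('id');
}

// Simulated visits, using a fake clock instead of real waiting.
let clock = 0;
const cache = makeFakeCache(function () { return clock; });
const findById = function (id, cb) { cb(null, { _id: id }); };
const id = idFromUrl('/?id=abc123');
lookup(id, cache, findById, function () {}); // miss: goes to the database
lookup(id, cache, findById, function () {}); // hit: served from the cache
clock += TTL_MS + 1;                         // more than a minute passes
lookup(id, cache, findById, function () {}); // miss again: the entry expired
```

In a real deployment the fake cache would be replaced by a client connected to the memcached instance, with the same get-then-set flow around it.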

To test the code out, visit your website. You should get a list of objects. Then append '?id=' and one of those objects' ids to the end of the address, and re-visit the page. Then re-visit the page again. Then wait a minute and re-visit the page one more time. In your instance's console, you should see messages that correlate with objects being in the cache. Note that you can also telnet into your memcached server and use the stats command to verify that you did have some gets and sets.

Step 6 : Reminder

Don't forget to STOP your instances when you're not working on them, so that you don't accidentally exceed your 750-hour/month limit!

Step 7 : Next Steps

Here are a few additional tasks. You need to do all three to gain full points for this week.

Measure the time overhead of a memcached hit versus a mongodb hit. Does it really make a difference for your application?
Add some Amazon storage to your project. Amazon's S3 storage facility is free within the limits of the AWS free tier. You can find example code for using S3 here. Your end-state should be that you can store uploaded files to S3, access them through your node.js app, and also access them through the S3 web interface.
Read some tutorials on how to ensure your node.js app restarts automatically when your instance reboots. Implement a technique so that you can survive occasional outages by restarting immediately upon boot.
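For the first task above (timing a memcached hit against a MongoDB hit), a small helper along the following lines may be a useful starting point. It is a sketch: timeOperation is a hypothetical name, and the 5 ms timer merely stands in for one real cache or database read.

```javascript
// Average the wall-clock time of a callback-style operation over several
// sequential runs. `operation(done)` stands in for one cache or database read.
function timeOperation(operation, runs, callback) {
  const start = process.hrtime.bigint();
  let remaining = runs;
  function next() {
    if (remaining === 0) {
      const elapsedNs = process.hrtime.bigint() - start;
      return callback(Number(elapsedNs) / runs / 1e6); // mean milliseconds
    }
    remaining -= 1;
    operation(next);
  }
  next();
}

// Example with a hypothetical 5 ms operation standing in for a database read.
timeOperation(function (done) { setTimeout(done, 5); }, 20, function (meanMs) {
  console.log('mean latency: ' + meanMs.toFixed(2) + ' ms');
});
```

Running the same helper once around the cache read and once around the database read, with the same number of runs, gives two means that can be compared directly.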

CSE 398