Deploying an AWS VPC with CloudFormation

by Asher
August 6, 2020
Figure: That's a full stack, and it's what we're going to deploy and walk through in full detail.

We use ECS with Fargate for a few different applications here at Tree Schema. We love the ability to deploy Fargate containers as spot instances and the security that comes with deploying to a managed service that cannot be accessed remotely (unless you explicitly install an SSH server). There are a lot of tutorials out there on how to deploy services to ECS and, all in all, ECS is fairly easy to use. However, we find it difficult to point our new engineers to a single tutorial that teaches the entire stack, from the VPC networking to configuring autoscaling, so we'll attempt to provide a comprehensive overview as a five-part series.

To make things interesting, we will build everything from the ground up. Here is my initial thought on how this set of tutorials will pan out, but the specific scope of any particular article may change as it takes shape. This page will be updated as each article is released.

  • Part 1 (this article):
    • A complete VPC with security groups, subnets, NAT gateways and more
  • Part 2:
    • Deploying an ECS cluster and IAM roles for Fargate services
    • Setting up a CloudFront distribution to serve static files
  • Part 3:
    • Creating a simple Django app with a celery backend to process asynchronous requests
  • Part 4:
    • Creating an RDS database & Redis instance
    • Registering the Django app in ECR and deploying it to ECS
  • Part 5:
    • Configuring the deployment to use Nginx to route events to the Django app and to block malicious activity

Overview

This entire set of tutorials assumes that you have some basic AWS knowledge and a little Python and JavaScript experience. The intent is to cover each of the main components for deploying an autoscaling ECS service in AWS, starting from scratch. Anyone with an AWS account should be able to follow along, deploy this application on their own and, hopefully, extend the code to meet their own needs. I'm a strong believer in repeatability, so to ensure that we can execute this in different environments with confidence we'll deploy (almost) everything through CloudFormation templates.

You will be able to find all of this code on the Tree Schema GitHub page. The repo will be updated as articles are added. With that quick intro, let's jump straight into it!

Creating a VPC

A VPC, or Virtual Private Cloud, can simply be thought of as a secured set of networking components that only you and your resources have access to. You have the ability to allow as much, or as little, external traffic into your VPC as you'd like. AWS creates a default VPC for you in every region, but we will be creating a new VPC from the ground up to give ourselves complete control.

When creating a VPC, always, always, always use a CloudFormation template (CFT). If you create your VPC manually you will forget a routing table rule, forgo a security group ingress, allow access on the wrong port or do something else trivial that will be a pain when you try to figure out why something works in one environment but not another. The CFT located here is similar to what we use at Tree Schema. I'll use it as the basis for all deployments in the tutorials; the resources for this VPC look like:

Figure: AWS VPC Deployment

Let's step through the template linked above. I'll give context about each of the high-level concepts as we get to the relevant resource. I'm not going to cover every section of the template since a lot of them are self-explanatory. Also, for full disclosure, we found this template from another source online and have only made a few minor changes to it. Unfortunately, the specific reference has been lost (if you know where this is from, please let us know and we'll update the reference!).

Template Parameters

The parameters section of a CloudFormation template gives you the ability to dynamically change your deployment. These are the parameters listed in our template and the impact of changing each:

  • EnvStageName: This is just a label. I find that using "dev", "qa" or "prod" is a great way to tell which environment you're working in when you look at the resources in the AWS console
  • CIDRBlockIP: The first two numbers of the CIDR range for the VPC. The input needs to be of the form XX.XX, where each XX is a number between 0 and 255; the template automatically appends .0.0/16 to give you a /16, the largest block AWS allows for a VPC. Just make sure the result is a valid CIDR
  • AvailabilityZone1: The first of two availability zones to deploy your subnets into. AWS recommends that you use at least two availability zones for high availability. This value must be a valid AWS availability zone
  • AvailabilityZone2: The second of two availability zones
  • ELBIngressPort: The port on which the Elastic Load Balancer (ELB) accepts HTTPS ingress. You will see later on that the ELB is used to ensure HTTPS access before connecting to our app, so keep this at 443, the HTTPS port
  • ELBHttpIngressPort: The port on which the ELB accepts HTTP ingress. When we deploy a load balancer we will set up a rule to route all HTTP traffic to HTTPS. We will use port 80, the default HTTP port, so that users can reach our app even if they first attempt to connect via HTTP
  • AppIngressPort: The port that the Django app will be listening on. Traffic from the ELB security group will be allowed to this port
  • DbInVpc: Whether or not a database is being placed in the VPC. Since databases contain our data, and data should always be secured, we will create an additional security group specifically for the databases that has even more restrictive access

The VPC Container

The VPC itself is straightforward and doesn't really do anything on its own. We build the full CIDR range from the "beginning IP" and the "second value", as defined in the parameters and the mappings sections, as seen here:

Figure: VPC Resource

We generally use the same values for the last two numbers in the CIDR, so our internal VPC CIDRs typically look something like 22.1.0.0/16, 22.2.0.0/16, etc. If you leave the template as-is, the CIDR for your VPC will be 10.20.0.0/16. The actual value of the first two numbers doesn't matter as long as it doesn't conflict with other VPCs in the same region. There are reasons why you would pick a specific CIDR range, but they are outside the scope of this article.
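
To make the CIDRBlockIP convention concrete, here is a minimal Python sketch (standard library only) of how a two-number prefix expands into the VPC range, and how you might check a candidate range against one you already use. The function names are hypothetical and are not part of the template.

```python
import ipaddress


def vpc_cidr(prefix: str) -> ipaddress.IPv4Network:
    """Expand a two-number prefix such as '10.20' into a /16 network.

    Mirrors the template's convention of appending '.0.0/16'.
    Raises ValueError if the result is not a valid network.
    """
    return ipaddress.ip_network(f"{prefix}.0.0/16")


def conflicts(a: ipaddress.IPv4Network, b: ipaddress.IPv4Network) -> bool:
    """Return True when the two ranges share at least one address."""
    return a.overlaps(b)


default_vpc = vpc_cidr("10.20")
other_vpc = vpc_cidr("10.21")

print(default_vpc)                        # 10.20.0.0/16
print(default_vpc.num_addresses)          # 65536
print(conflicts(default_vpc, other_vpc))  # False
```

An invalid prefix such as "10.300" raises a ValueError, which is a cheap way to catch a typo before CloudFormation does.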

Subnets

Subnets are the location where your services or servers are deployed. Each subnet resides within an availability zone, which is tied to one or more AWS data centers. There are two flavors of subnets:

  • Public: These subnets can connect directly to the internet gateway and therefore external traffic can enter the subnet (if your security groups allow it)
  • Private: These subnets do not connect directly to the internet gateway. They can, however, access the internet through a NAT Gateway. In general it's good practice to put as many of your services as possible in private subnets and to only allow access through pre-defined channels, such as an AWS Elastic Load Balancer or a bastion server that resides in a public subnet

As seen above, this deployment creates four subnets in total, two private and two public. Each availability zone gets one private and one public subnet, which allows for greater fault tolerance and availability. We also tag the subnets with a Name to make them easier to find in the console. Two of the four subnets can be seen here:

Figure: AWS Subnet Resource
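
As a rough illustration of the two-by-two layout, the sketch below carves four subnets out of the default 10.20.0.0/16 range and spreads them across two availability zones. The /24 size, the ordering and the zone names are assumptions made for the example; the template may well use different values.

```python
import ipaddress
from itertools import islice

vpc = ipaddress.ip_network("10.20.0.0/16")
zones = ["us-east-1a", "us-east-1b"]  # example AZ names, substitute your own

# Take the first four /24 blocks of the VPC range (an assumed subnet size).
blocks = list(islice(vpc.subnets(new_prefix=24), 4))

# One public and one private subnet per availability zone.
plan = {}
for i, zone in enumerate(zones):
    plan[f"Public-{zone}"] = blocks[2 * i]
    plan[f"Private-{zone}"] = blocks[2 * i + 1]

for name, cidr in plan.items():
    print(name, cidr)
```

Losing one availability zone in this layout still leaves a public and a private subnet running in the other.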

Routing Tables

Routing Tables (RT) have one or more subnets assigned to them and they determine how to route (duh) traffic for resources that are making requests from within the subnet.

Let's consider an example where a Lambda is deployed into a VPC. When the Lambda makes a request to a service deployed on a host that resides within the same VPC, the routing table keeps the traffic within the VPC. But let's say that the Lambda is making the request to a database that exists outside of the VPC, either in another VPC or somewhere across the internet. In this scenario the routing table determines whether the traffic should be routed to a peered VPC (if one is set up), to the public internet, across a VPN (if one is set up) or to another available option.

An example of a full routing table (not the one in this walkthrough) can be seen here:

Figure: AWS Routing Resources (actual values masked)

This RT can route traffic internally, to the S3 VPC Endpoint, to the internet gateway, to a VPC peering connection and to two different VPNs. When a request reaches a routing table, the table tries to resolve it from the most specific rule to the least specific. While the actual values are masked, if this table received an outgoing request to IP address 1.1.1.1 then the traffic would be routed to the second-to-last route, which is a VPN. If a request is made to IP address 3.3.3.3, it doesn't match any of the rules except the catch-all (0.0.0.0/0), so the request would be routed to the internet gateway and out into the internet.
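
The "most specific rule wins" behaviour is a longest-prefix match. Here is a small Python sketch of that lookup; the routes and target names are made up for illustration and are not the masked values from the table above.

```python
import ipaddress

# Hypothetical route table: (destination CIDR, target).
ROUTES = [
    ("10.20.0.0/16", "local"),
    ("10.30.0.0/16", "vpc-peering"),
    ("1.1.0.0/16", "vpn"),
    ("0.0.0.0/0", "internet-gateway"),
]


def resolve(ip: str, routes=ROUTES) -> str:
    """Return the target of the most specific route that contains ip."""
    addr = ipaddress.ip_address(ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route for {ip}")
    # Longest prefix wins: a /16 match beats the /0 catch-all.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(resolve("10.20.4.7"))  # local
print(resolve("1.1.1.1"))    # vpn
print(resolve("3.3.3.3"))    # internet-gateway
```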

In the CFT we are using, the RT resources follow this basic logic:

  • Create the Routing Table
  • Create the Internet Gateway
  • Create the VPC Gateway Attachment
  • Create the RT route - which tells us where traffic should go
  • Associate the routing table with the subnet

Figure: AWS Routing Table Resources
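
CloudFormation accepts JSON as well as YAML, so the five steps can be sketched as a Python dictionary and printed as a template fragment. The logical names here (Vpc, PublicSubnet1 and so on) are placeholders, and Vpc and PublicSubnet1 are assumed to be defined elsewhere in the template; this is not a copy of the template in the repo.

```python
import json

resources = {
    # 1. The routing table itself.
    "PublicRouteTable": {
        "Type": "AWS::EC2::RouteTable",
        "Properties": {"VpcId": {"Ref": "Vpc"}},
    },
    # 2. The internet gateway.
    "InternetGateway": {"Type": "AWS::EC2::InternetGateway"},
    # 3. Attach the gateway to the VPC.
    "GatewayAttachment": {
        "Type": "AWS::EC2::VPCGatewayAttachment",
        "Properties": {
            "VpcId": {"Ref": "Vpc"},
            "InternetGatewayId": {"Ref": "InternetGateway"},
        },
    },
    # 4. The route: everything not matched elsewhere goes to the gateway.
    "PublicRoute": {
        "Type": "AWS::EC2::Route",
        # The gateway must be attached before a route can point at it.
        "DependsOn": "GatewayAttachment",
        "Properties": {
            "RouteTableId": {"Ref": "PublicRouteTable"},
            "DestinationCidrBlock": "0.0.0.0/0",
            "GatewayId": {"Ref": "InternetGateway"},
        },
    },
    # 5. Associate the routing table with a subnet.
    "PublicSubnet1Association": {
        "Type": "AWS::EC2::SubnetRouteTableAssociation",
        "Properties": {
            "SubnetId": {"Ref": "PublicSubnet1"},
            "RouteTableId": {"Ref": "PublicRouteTable"},
        },
    },
}

print(json.dumps({"Resources": resources}, indent=2))
```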

Security Groups

When we deploy a service or server, whether that is an ECS service, an AWS Lambda, an EC2 server, a Redshift cluster or anything else that sits within our VPC, we must assign at least one security group to it. Security groups define the rules for allowing traffic into (i.e. ingress) and out of (i.e. egress) the service that they are attached to. Since security groups are attached to services and servers, when two services communicate with each other within a VPC, the security group of the service receiving the request must allow ingress from either:

  • The security group attached to the service that is sending the request
  • A CIDR block that includes the IP address where the request is coming from

Similarly, the security group of the service sending the request must allow egress to one of these as well.

Let's look at an example. Consider a VPC with two security groups: one that we give to our Elastic Load Balancer (which we will cover later on) and a second that we give to our application:

  • ELB Security Group
    • Allows ingress access from everywhere on port 443
    • Allows egress output to everywhere on all ports
  • App Security Group
    • Only allows ingress access from the ELB security group on port 9876
    • Allows egress output to everywhere on all ports

Figure: Security Group Example

In this scenario the user is able to access the application by going through the ELB security group to get to the app. However, the app sits within a security group that does not allow access from anywhere on the internet, so if the user attempts to connect directly to the App Security Group they will not have access.

The reason that we create these two security groups is that we want to apply different levels of control to different parts of our deployment. In this specific example we want to enforce HTTPS access through the Elastic Load Balancer (ELB) so that the ELB can serve a certificate for our application. We also only allow access to the app security group through a single port - the one that must be exposed to allow data into the app.

Security groups can have ingress and egress rules added inline or through separate resources. Both approaches are shown here, with the port ranges taken from the values defined in the parameters:
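
The ELB and app example can be modelled in a few lines of Python. This is a deliberately simplified toy: it only checks ingress, treats a source as either "anywhere" or a named group, and ignores CIDR matching and egress entirely. The class and names are invented for this illustration.

```python
from dataclasses import dataclass, field

ANYWHERE = "0.0.0.0/0"


@dataclass
class SecurityGroup:
    name: str
    # Each rule is (source, port); source is ANYWHERE or a group name.
    ingress: set = field(default_factory=set)

    def allows(self, source: str, port: int) -> bool:
        """Return True if traffic from source may reach this group on port."""
        return (ANYWHERE, port) in self.ingress or (source, port) in self.ingress


elb_sg = SecurityGroup("elb", {(ANYWHERE, 443)})
app_sg = SecurityGroup("app", {("elb", 9876)})

print(elb_sg.allows("internet", 443))   # True: anyone may reach the ELB on 443
print(app_sg.allows("internet", 9876))  # False: the app is not open to the internet
print(app_sg.allows("elb", 9876))       # True: only the ELB group may reach the app
```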

Figure: VPC Security Group Setup
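
For readers who prefer text to a screenshot, here is a hedged sketch of the two styles as a Python dictionary that can be dumped to CloudFormation JSON. The logical names and descriptions are placeholders; only the parameter names (ELBIngressPort, AppIngressPort) come from the list earlier in this article, and Vpc is assumed to be defined elsewhere in the template.

```python
import json

resources = {
    "ElbSecurityGroup": {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": "Public HTTPS access to the load balancer",
            "VpcId": {"Ref": "Vpc"},
            # Inline style: the rule lives inside the security group resource.
            "SecurityGroupIngress": [
                {
                    "IpProtocol": "tcp",
                    "FromPort": {"Ref": "ELBIngressPort"},
                    "ToPort": {"Ref": "ELBIngressPort"},
                    "CidrIp": "0.0.0.0/0",
                }
            ],
        },
    },
    "AppSecurityGroup": {
        "Type": "AWS::EC2::SecurityGroup",
        "Properties": {
            "GroupDescription": "Application containers",
            "VpcId": {"Ref": "Vpc"},
        },
    },
    # Separate style: the rule is its own resource pointing at the group.
    "AppIngressFromElb": {
        "Type": "AWS::EC2::SecurityGroupIngress",
        "Properties": {
            "GroupId": {"Fn::GetAtt": ["AppSecurityGroup", "GroupId"]},
            "IpProtocol": "tcp",
            "FromPort": {"Ref": "AppIngressPort"},
            "ToPort": {"Ref": "AppIngressPort"},
            "SourceSecurityGroupId": {"Fn::GetAtt": ["ElbSecurityGroup", "GroupId"]},
        },
    },
}

print(json.dumps({"Resources": resources}, indent=2))
```

Separate rule resources are handy when two groups need to reference each other, since inline rules on both sides would create a circular dependency.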

NAT Gateway

AWS Network Address Translation Gateways (NAT GW) are how you allow servers in your private subnets to access the internet without allowing the internet to access your servers. To deploy a NAT GW you must also have a public subnet: the NAT GW sits inside your public subnet, and the routing table for your private subnets routes traffic intended for the internet to the NAT GW.

Creating a NAT GW follows these basic steps:

  • Create an elastic IP address - any traffic you send to the internet will appear as though it's coming from this IP
  • Create a NAT GW resource and associate the elastic IP address with it; the NAT GW should be placed in a public subnet
  • Create a route in the routing table that sends traffic to the NAT GW (note - in the image below we also create the private subnet routing table)

Figure: AWS NAT Gateway Resource
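
The same three steps, sketched as a Python dictionary that can be dumped to CloudFormation JSON. The logical names are placeholders chosen for this example, and Vpc and PublicSubnet1 are assumed to be defined elsewhere in the template.

```python
import json

resources = {
    # Step 1: the elastic IP that outbound traffic will appear to come from.
    "NatEip": {
        "Type": "AWS::EC2::EIP",
        "Properties": {"Domain": "vpc"},
    },
    # Step 2: the NAT gateway, placed in a PUBLIC subnet and tied to the EIP.
    "NatGateway": {
        "Type": "AWS::EC2::NatGateway",
        "Properties": {
            "AllocationId": {"Fn::GetAtt": ["NatEip", "AllocationId"]},
            "SubnetId": {"Ref": "PublicSubnet1"},
        },
    },
    # The routing table used by the private subnets.
    "PrivateRouteTable": {
        "Type": "AWS::EC2::RouteTable",
        "Properties": {"VpcId": {"Ref": "Vpc"}},
    },
    # Step 3: send internet-bound traffic from private subnets to the NAT.
    "PrivateRoute": {
        "Type": "AWS::EC2::Route",
        "Properties": {
            "RouteTableId": {"Ref": "PrivateRouteTable"},
            "DestinationCidrBlock": "0.0.0.0/0",
            "NatGatewayId": {"Ref": "NatGateway"},
        },
    },
}

print(json.dumps({"Resources": resources}, indent=2))
```

Note the contrast with the public routing table: the catch-all route points at a NAT gateway rather than at the internet gateway.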

Template Outputs

CFT outputs are how you expose the values within one CFT to other CFTs. In later examples you'll see that we import values from this VPC CFT. I also strongly suggest that you adopt a standardized naming convention for each type of resource when you deploy your services. You'll see later on that this naming convention does wonders for automation in your CFTs and also lets you dynamically build the names of values that exist in other CFTs.

One downside to importing values from one CFT into another is that you create a hard dependency: you cannot delete a resource from one stack if its exported value is being imported in another stack. This may be beneficial for production, where you don't want to accidentally delete a resource, but sometimes in dev you just need to start fresh and this hard dependency can be annoying.

The naming convention that we like, and that is generally shown in the template, is:

{Stack Name}-{LogicalName}-{EnvironmentName}

We didn't come up with this; some other great blog had the idea and we took it. We don't follow it to a tee but it still works quite well. Adding further standards to the naming convention, such as stack names always being lowercase with hyphens and logical names always being camel case, allows users to quickly see and understand what stack and what resource they are viewing. Here are examples from the template we're creating:

Figure: AWS NAT Gateway Resource
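
One payoff of a strict convention is that export names can be built, and checked, mechanically. Below is a minimal Python sketch; the helper names, the example logical name PrivateSubnet1 and the exact regular expressions (including the assumption that camel case starts with a capital letter) are illustrative choices, not something taken from the template.

```python
import re

STACK_NAME = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")  # lowercase with hyphens
LOGICAL_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*$")     # camel case, capital first (assumed)


def export_name(stack: str, logical: str, env: str) -> str:
    """Build '{Stack Name}-{LogicalName}-{EnvironmentName}', validating each part."""
    if not STACK_NAME.match(stack):
        raise ValueError(f"stack name should be lowercase with hyphens: {stack!r}")
    if not LOGICAL_NAME.match(logical):
        raise ValueError(f"logical name should be camel case: {logical!r}")
    return f"{stack}-{logical}-{env}"


def import_value(stack: str, logical: str, env: str) -> dict:
    """The intrinsic another template would use to read the exported value."""
    return {"Fn::ImportValue": export_name(stack, logical, env)}


print(export_name("my-vpc-stack", "PrivateSubnet1", "dev"))
# my-vpc-stack-PrivateSubnet1-dev
print(import_value("my-vpc-stack", "PrivateSubnet1", "dev"))
```

Swapping the stack name is then the only change needed to point a service at a different VPC.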

Database in the VPC

One item that was briefly alluded to in the template parameters section is a parameter that you will need to enter when creating this VPC. By default the parameter DbInVpc is set to true, which creates an additional security group intended for RDS databases.

You may notice that we've pre-configured this to allow both the bastion security group and the app security group to access the default Postgres and Redis ports within the database security group (if it is created). We will be using these databases in subsequent articles, but if you are deploying a service that does not need a database, simply set this to false or remove all references to this condition from your template.
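
To show how a parameter like DbInVpc can switch resources on and off, here is a hedged sketch built as a Python dictionary and dumped to CloudFormation JSON. The condition name and logical names are invented for the example, the ports are the Postgres and Redis defaults (5432 and 6379), and Vpc, AppSecurityGroup and BastionSecurityGroup are assumed to be defined elsewhere in the template.

```python
import json

POSTGRES_PORT, REDIS_PORT = 5432, 6379

template = {
    "Parameters": {
        "DbInVpc": {"Type": "String", "Default": "true", "AllowedValues": ["true", "false"]},
    },
    "Conditions": {
        # True only when the parameter is the string "true".
        "CreateDbSecurityGroup": {"Fn::Equals": [{"Ref": "DbInVpc"}, "true"]},
    },
    "Resources": {
        "DbSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Condition": "CreateDbSecurityGroup",
            "Properties": {
                "GroupDescription": "Databases",
                "VpcId": {"Ref": "Vpc"},
            },
        },
    },
}

# One ingress rule per (source group, database port) pair, all behind the condition.
for source in ("AppSecurityGroup", "BastionSecurityGroup"):
    for label, port in (("Postgres", POSTGRES_PORT), ("Redis", REDIS_PORT)):
        template["Resources"][f"Db{label}From{source}"] = {
            "Type": "AWS::EC2::SecurityGroupIngress",
            "Condition": "CreateDbSecurityGroup",
            "Properties": {
                "GroupId": {"Fn::GetAtt": ["DbSecurityGroup", "GroupId"]},
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "SourceSecurityGroupId": {"Fn::GetAtt": [source, "GroupId"]},
            },
        }

print(json.dumps(template, indent=2))
```

With DbInVpc set to false, none of these five resources is created, so nothing else in the stack may depend on them unconditionally.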

Deploying this VPC

To deploy this, clone the GitHub repo above and, in the root directory, run:

sam deploy \
  --template templates/vpc-template.yaml \
  --stack-name {your-vpc-stack} \
  --capabilities CAPABILITY_AUTO_EXPAND

We'll be using my-vpc-stack as the stack name for this and we'll reference this in the subsequent articles.
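
If you want to change any of the template parameters at deploy time, sam deploy accepts them through --parameter-overrides as Key=Value pairs. The sketch below only builds and prints the command from a dictionary so the overrides stay in one place; the helper name and the example values are assumptions, and nothing is executed.

```python
import shlex


def sam_deploy_command(stack_name: str, template: str, parameters: dict) -> list:
    """Build the argument list for `sam deploy` with optional parameter overrides."""
    cmd = [
        "sam", "deploy",
        "--template", template,
        "--stack-name", stack_name,
        "--capabilities", "CAPABILITY_AUTO_EXPAND",
    ]
    if parameters:
        cmd.append("--parameter-overrides")
        cmd.extend(f"{key}={value}" for key, value in parameters.items())
    return cmd


cmd = sam_deploy_command(
    "my-vpc-stack",
    "templates/vpc-template.yaml",
    {"EnvStageName": "dev", "CIDRBlockIP": "10.20"},
)
print(shlex.join(cmd))  # copy-pasteable shell command
```

Passing the resulting list to subprocess.run(cmd, check=True) would run the deployment, assuming the SAM CLI is installed and AWS credentials are configured.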

Closing thoughts

Having a repeatable and consistent layout for VPCs is, in my opinion, one of the easiest ways to make your life easier if you are deploying VPCs to different environments, regions or accounts. Furthermore, standardizing outputs and naming conventions can allow you to quickly redeploy resources from one VPC to another by simply replacing the name of one VPC stack with the name of a new stack, without having to touch anything else. This is especially valuable once you need to deploy the same service in multiple places or if you're migrating code from one VPC to another.

