Set Up a Highly Available Website in AWS Using Terraform
Use Terraform to deploy a highly available NGINX website with an Application Load Balancer, two EC2 instances across different AZs, and an S3 bucket for website files and log storage. The infrastructure-as-code approach ensures repeatability and consistency across deployments.
Introduction
Have you ever awoken from a deep sleep in the mood to set up a highly available, load-balanced website using NGINX, with the webpages pulled securely from an S3 bucket that also happens to hold your NGINX logs? Me neither; I was asking for a friend. Even though the question is oddly specific, I am sure we all know someone who has needed something similar.
This post will cover exactly that ask, with steps on how to:
- Create an S3 bucket
- Upload website-related files to that S3 bucket
- Allow only the EC2 instances to access the S3 bucket
- Create those EC2 instances
- Create a load balancer as the front end to the EC2 instances
Think of the S3 bucket as the craft services table with all the good food, sitting in a very exclusive party room with only the EC2 instances on the guest list. To further shield these guests, a security entourage (the load balancer) has been hired to act as the first point of contact for anyone who is not invited to the party but wants the guests' attention.
Although there are other ways to achieve the same result (an S3 static website, for example), this approach lets us touch multiple AWS services.
Prerequisites
Since we will be running this example in AWS and we want rinse-and-repeat consistency, we will need:
- An AWS account with programmatic access, since we will be using Terraform
- Terraform installed on your machine
- An AWS IAM user with permissions to create S3 buckets, VPCs and related objects, EC2 instances, IAM roles/policies, and load balancers
- Files for our website
Never commit your AWS access keys to version control. Use environment variables or the AWS credentials file instead of hardcoding keys in terraform.tfvars.
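If you'd rather keep credentials out of your Terraform files entirely, here is a minimal provider sketch, assuming credentials are supplied via AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY or ~/.aws/credentials (the full code later in this post passes the keys as variables for simplicity):
provider "aws" {
  region = var.region
  # No access_key/secret_key arguments: the AWS provider falls back to
  # environment variables or the shared credentials file (~/.aws/credentials)
}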
AWS Objects
At the end of this run we will have Terraform code that will create in AWS:
- One VPC
- Two public subnets in two different Availability Zones
- An application load balancer
- A target group for our load balancer
- Two EC2 instances
- An S3 bucket that will hold website files for the EC2 instances to pull, and also serve as a drop point for NGINX logs from those instances for long-term retention and analysis
- An EC2 IAM role with the necessary policies to read from and write to the S3 bucket
- All objects with consistent tags
Terraform
Tree View
Here is the tree view of my Terraform project:
- terraform.tfvars - contains the actual values for the variables used throughout the main Terraform file
- main.tf - the main Terraform file containing all the configuration needed to make the API calls and create the objects in AWS
- arun.png - a picture file used by the website
- index.html - basic HTML that references the picture file so it renders on our website
While this example uses a single main.tf file, I recommend splitting your Terraform code into separate files (subnets.tf, security_groups.tf, etc.) for larger projects. This makes troubleshooting and maintenance much easier.
Configuration Files
index.html
Basic HTML that references the arun.png file.
terraform.tfvars
- The access key and secret key are obtained when you create an IAM user with programmatic access in AWS
- The key pair is used to SSH into any EC2 instances created with it
- The pem file is downloaded when you create your key pair
- The bucket name prefix is the first part of the S3 bucket name; the second part will be a random number (generated by the random_integer resource, as you will see later). S3 bucket names have to be globally unique across all of AWS
- The environment tag will be used to create a tag for all the AWS objects
- The billing code can be used the same way to create a tag for all AWS objects - great for billing and reporting (the full code below applies only the environment tag; a sample terraform.tfvars follows this list)
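To make the above concrete, here is a sample terraform.tfvars. Every value is a placeholder, and per the earlier warning, consider supplying the keys via environment variables instead:
# Sample terraform.tfvars - all values below are placeholders
aws_access_key     = "YOUR_ACCESS_KEY"    # prefer environment variables
aws_secret_key     = "YOUR_SECRET_KEY"
private_key_path   = "./mykeypair.pem"    # downloaded when the key pair was created
key_name           = "mykeypair"
bucket_name_prefix = "my-web-app"         # a random suffix is appended in locals
environment_tag    = "dev"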
main.tf Breakdown
This file is where the magic happens. Let's dissect it section by section:
Variables
This portion ties directly to the terraform.tfvars file. The variables block gives you the plate, and terraform.tfvars fills it with the respective data. Default values are used when a value isn't present in terraform.tfvars (e.g., network_address_space defaults to 10.1.0.0/16).
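An excerpt from the variables block in the full code:
variable "aws_access_key" {}
variable "aws_secret_key" {}
variable "private_key_path" {}
variable "key_name" {}
variable "region" {
  default = "us-east-1"   # used when terraform.tfvars doesn't override it
}
variable "network_address_space" {
  default = "10.1.0.0/16"
}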
Providers
This section defines the provider or plugin for the vendor where objects will be created. For our sample, we're using AWS and telling Terraform to use the programmatic keys and region from our variables.
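The provider block from the full code:
provider "aws" {
  access_key = var.aws_access_key
  secret_key = var.aws_secret_key
  region     = var.region
}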
Locals
This section creates local values that can be referenced throughout the code. Here we build the common tags block and the S3 bucket name, which is interpolated from variable data and the result of the random_integer resource.
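From the full code:
locals {
  common_tags = {
    Environment = var.environment_tag
  }
  # The random suffix keeps the bucket name globally unique
  s3_bucket_name = "${var.bucket_name_prefix}-${var.environment_tag}-${random_integer.rand.result}"
}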
Data Sources
Data sources allow us to use predefined items in AWS that are outside our control. Here we fetch available AZs and the most recent Amazon Linux AMI with EBS storage.
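The data blocks from the full code (the virtualization-type filter is trimmed here). Note that the name filter targets the original Amazon Linux AMI, so adjust it if you want Amazon Linux 2 or later:
data "aws_availability_zones" "available" {}

data "aws_ami" "aws-linux" {
  most_recent = true
  owners      = ["amazon"]
  filter {
    name   = "name"
    values = ["amzn-ami-hvm*"]   # original Amazon Linux AMIs
  }
  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }
}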
Resources
The resource block is where the party starts. Each block represents an AWS object created when Terraform runs.
Security groups in this example allow SSH from anywhere (0.0.0.0/0). In production, restrict this to your specific IP addresses or use a bastion host.
Networking
Creates the VPC, internet gateway, and subnets. Each object is tagged using the merged common tags, and the subnets are configured to auto-assign public IP addresses.
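The VPC and the first subnet from the full code (subnet2 is identical apart from its CIDR and AZ index):
resource "aws_vpc" "vpc" {
  cidr_block = var.network_address_space
  tags       = merge(local.common_tags, { Name = "${var.environment_tag}-vpc" })
}

resource "aws_subnet" "subnet1" {
  cidr_block              = var.subnet1_address_space
  vpc_id                  = aws_vpc.vpc.id
  map_public_ip_on_launch = true
  availability_zone       = data.aws_availability_zones.available.names[0]
  tags                    = merge(local.common_tags, { Name = "${var.environment_tag}-subnet1" })
}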
Security Groups
Two security groups are created (one of them is shown after this list):
- EC2 Security Group - Allows SSH on port 22 (from anywhere in this sample; see the warning above) and HTTP from within the VPC only
- ALB Security Group - Allows HTTP from the internet and all outbound traffic
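Here is the ALB security group from the full code; the NGINX group follows the same shape, with the SSH and VPC-only HTTP ingress rules described above:
resource "aws_security_group" "alb-sg" {
  name   = "nginx_elb_sg"
  vpc_id = aws_vpc.vpc.id
  # Allow HTTP from anywhere
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  # Allow all outbound
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
  tags = merge(local.common_tags, { Name = "${var.environment_tag}-alb" })
}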
Load Balancer
Creates a public-facing load balancer across both subnets, with a listener for port 80 and a target group for the NGINX servers.
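The load balancer and its listener from the full code (the target group and its attachments appear in the full listing):
resource "aws_lb" "alb" {
  name               = "alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb-sg.id]
  subnets            = [aws_subnet.subnet1.id, aws_subnet.subnet2.id]
}

resource "aws_lb_listener" "front_end" {
  load_balancer_arn = aws_lb.alb.arn
  port              = "80"
  protocol          = "HTTP"
  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.albtg.arn
  }
}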
NGINX Servers
Two EC2 instances are created with provisioners that (see the excerpt after this list):
- Install nginx and s3cmd
- Configure s3cfg for S3 access using the IAM role
- Pull website files from S3
- Configure log rotation to push logs to S3
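The remote-exec provisioner from the full code (trimmed), which runs over SSH once the instance is up:
provisioner "remote-exec" {
  inline = [
    "sudo yum install nginx -y",
    "sudo service nginx start",
    "sudo cp /home/ec2-user/.s3cfg /root/.s3cfg",
    "sudo cp /home/ec2-user/nginx /etc/logrotate.d/nginx",
    "sudo pip install s3cmd",
    "s3cmd get s3://${aws_s3_bucket.web_bucket.id}/website/index.html .",
    "s3cmd get s3://${aws_s3_bucket.web_bucket.id}/website/arun.png .",
    "sudo cp /home/ec2-user/index.html /usr/share/nginx/html/index.html",
    "sudo cp /home/ec2-user/arun.png /usr/share/nginx/html/arun.png"
  ]
}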
S3 Configuration
Creates an IAM role, instance profile, and private S3 bucket. Website files are uploaded to S3 for EC2 instances to pull.
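The instance profile and bucket from the full code; force_destroy is what lets terraform destroy remove a non-empty bucket:
resource "aws_iam_instance_profile" "nginx_profile" {
  name = "nginx_profile"
  role = aws_iam_role.allow_nginx_s3.name
}

resource "aws_s3_bucket" "web_bucket" {
  bucket        = local.s3_bucket_name
  acl           = "private"
  force_destroy = true
  tags          = merge(local.common_tags, { Name = "${var.environment_tag}-web-bucket" })
}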
Terraform Commands
terraform init
Initializes your configuration by downloading the required provider plugins (here, the aws and random providers). Run this first in any new Terraform project.
terraform plan -out myplan.plan
Compares your configuration against existing infrastructure and shows what will be added (green +), modified (yellow ~), or deleted (red -). Always review the plan before applying.
terraform apply myplan.plan
Executes the saved plan and creates your resources in AWS. Because a saved plan file is supplied, Terraform applies it immediately without asking for confirmation (a bare terraform apply would prompt first).
terraform destroy
Removes all resources managed by Terraform. Use with caution in production environments.
Troubleshooting
- "Error: Error creating S3 Bucket: BucketAlreadyExists" - S3 bucket names must be globally unique. The random_integer resource should prevent this, but if it occurs, change your bucket_name_prefix.
- EC2 instances can't pull from S3 - Verify the IAM role is properly attached to the instances. Check the instance profile association in the EC2 console.
- Can't SSH to EC2 instances - Ensure your IP address is correctly specified in the security group, and that your key pair PEM file has the correct permissions (chmod 400).
- Load balancer returns 502 Bad Gateway - NGINX may not have started properly. SSH into the instances and check with sudo service nginx status (the Amazon Linux AMI used here predates systemd).
- Terraform provisioner fails - Remote-exec provisioners require SSH connectivity. Ensure security groups allow SSH from where Terraform runs.
- "Error: Error launching source instance" - The AMI ID may be outdated or unavailable in your region. The data source should find the latest, but check AWS console for available AMIs.
Conclusion
In this post, we used Terraform to connect to AWS, build our network and its supporting cast, create a couple of EC2 instances to act as web servers behind a load balancer, and pull website files from an S3 bucket that we created and populated, all from a single Terraform configuration. There are other ways, and other tools that can work alongside Terraform, to achieve the same result; however, I hope this post helps those who are new to Terraform, as well as those looking for rinse-and-repeat consistency rather than executing tons and tons of keyboard and mouse clicks in the UI or CLI.
Full Code
######################################################################
# VARIABLES
######################################################################
variable "aws_access_key" {}
variable "aws_secret_key" {}
variable "private_key_path" {}
variable "key_name" {}
variable "region" {
default = "us-east-1"
}
variable "network_address_space" {
default = "10.1.0.0/16"
}
variable "subnet1_address_space" {
default = "10.1.0.0/24"
}
variable "subnet2_address_space" {
default = "10.1.1.0/24"
}
variable "bucket_name_prefix" {}
variable "environment_tag" {}
######################################################################
# PROVIDERS
######################################################################
provider "aws" {
access_key = var.aws_access_key
secret_key = var.aws_secret_key
region = var.region
}
######################################################################
# LOCALS
######################################################################
locals {
common_tags = {
Environment = var.environment_tag
}
s3_bucket_name = "${var.bucket_name_prefix}-${var.environment_tag}-${random_integer.rand.result}"
}
######################################################################
# DATA
######################################################################
data "aws_availability_zones" "available" {}
data "aws_ami" "aws-linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn-ami-hvm*"]
}
filter {
name = "root-device-type"
values = ["ebs"]
}
filter {
name = "virtualization-type"
values = ["hvm"]
}
}
######################################################################
# RESOURCES
######################################################################
# Random suffix used to keep the S3 bucket name globally unique
resource "random_integer" "rand" {
min = 100
max = 999
}
# NETWORKING #
resource "aws_vpc" "vpc" {
cidr_block = var.network_address_space
tags = merge(local.common_tags, { Name = "${var.environment_tag}-vpc" })
}
resource "aws_internet_gateway" "igw" {
vpc_id = aws_vpc.vpc.id
tags = merge(local.common_tags, { Name = "${var.environment_tag}-igw" })
}
resource "aws_subnet" "subnet1" {
cidr_block = var.subnet1_address_space
vpc_id = aws_vpc.vpc.id
map_public_ip_on_launch = true
availability_zone = data.aws_availability_zones.available.names[0]
tags = merge(local.common_tags, { Name = "${var.environment_tag}-subnet1" })
}
resource "aws_subnet" "subnet2" {
cidr_block = var.subnet2_address_space
vpc_id = aws_vpc.vpc.id
map_public_ip_on_launch = true
availability_zone = data.aws_availability_zones.available.names[1]
tags = merge(local.common_tags, { Name = "${var.environment_tag}-subnet2" })
}
# ROUTING #
resource "aws_route_table" "rtb" {
vpc_id = aws_vpc.vpc.id
route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.igw.id
}
tags = merge(local.common_tags, { Name = "${var.environment_tag}-rtb" })
}
resource "aws_route_table_association" "rta-subnet1" {
subnet_id = aws_subnet.subnet1.id
route_table_id = aws_route_table.rtb.id
}
resource "aws_route_table_association" "rta-subnet2" {
subnet_id = aws_subnet.subnet2.id
route_table_id = aws_route_table.rtb.id
}
# SECURITY GROUPS #
resource "aws_security_group" "alb-sg" {
name = "nginx_elb_sg"
vpc_id = aws_vpc.vpc.id
#Allow HTTP from anywhere
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
#allow all outbound
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.common_tags, { Name = "${var.environment_tag}-alb" })
}
# Nginx security group
resource "aws_security_group" "nginx-sg" {
name = "nginx_sg"
vpc_id = aws_vpc.vpc.id
# SSH access from anywhere
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
# HTTP access from the VPC
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = [var.network_address_space]
}
# outbound internet access
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(local.common_tags, { Name = "${var.environment_tag}-nginx" })
}
# LOAD BALANCER #
resource "aws_lb_target_group" "albtg" {
name = "alb-lb-tg"
port = 80
protocol = "HTTP"
vpc_id = aws_vpc.vpc.id
}
resource "aws_lb" "alb" {
name = "alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb-sg.id]
subnets = [aws_subnet.subnet1.id, aws_subnet.subnet2.id]
}
resource "aws_lb_target_group_attachment" "targetattachment1" {
target_group_arn = aws_lb_target_group.albtg.arn
target_id = aws_instance.nginx1.id
port = 80
}
resource "aws_lb_target_group_attachment" "targetattachment2" {
target_group_arn = aws_lb_target_group.albtg.arn
target_id = aws_instance.nginx2.id
port = 80
}
resource "aws_lb_listener" "front_end" {
load_balancer_arn = aws_lb.alb.arn
port = "80"
protocol = "HTTP"
default_action {
type = "forward"
target_group_arn = aws_lb_target_group.albtg.arn
}
}
# INSTANCES #
resource "aws_instance" "nginx1" {
ami = data.aws_ami.aws-linux.id
instance_type = "t2.micro"
subnet_id = aws_subnet.subnet1.id
vpc_security_group_ids = [aws_security_group.nginx-sg.id]
key_name = var.key_name
iam_instance_profile = aws_iam_instance_profile.nginx_profile.name
connection {
type = "ssh"
host = self.public_ip
user = "ec2-user"
private_key = file(var.private_key_path)
}
provisioner "file" {
content = <<EOF
access_key =
secret_key =
security_token =
use_https = True
bucket_location = US
EOF
destination = "/home/ec2-user/.s3cfg"
}
provisioner "file" {
content = <<EOF
/var/log/nginx/*log {
daily
rotate 10
missingok
compress
sharedscripts
postrotate
endscript
lastaction
INSTANCE_ID=`curl --silent http://169.254.169.254/latest/meta-data/instance-id`
sudo /usr/local/bin/s3cmd sync --config=/home/ec2-user/.s3cfg /var/log/nginx/ s3://${aws_s3_bucket.web_bucket.id}/nginx/$INSTANCE_ID/
endscript
}
EOF
destination = "/home/ec2-user/nginx"
}
provisioner "remote-exec" {
inline = [
"sudo yum install nginx -y",
"sudo service nginx start",
"sudo cp /home/ec2-user/.s3cfg /root/.s3cfg",
"sudo cp /home/ec2-user/nginx /etc/logrotate.d/nginx",
"sudo pip install s3cmd",
"s3cmd get s3://${aws_s3_bucket.web_bucket.id}/website/index.html .",
"s3cmd get s3://${aws_s3_bucket.web_bucket.id}/website/arun.png .",
"sudo rm /usr/share/nginx/html/index.html",
"sudo cp /home/ec2-user/index.html /usr/share/nginx/html/index.html",
"sudo cp /home/ec2-user/arun.png /usr/share/nginx/html/arun.png",
"sudo logrotate -f /etc/logrotate.conf"
]
}
tags = merge(local.common_tags, { Name = "${var.environment_tag}-nginx1" })
}
resource "aws_instance" "nginx2" {
ami = data.aws_ami.aws-linux.id
instance_type = "t2.micro"
subnet_id = aws_subnet.subnet2.id
vpc_security_group_ids = [aws_security_group.nginx-sg.id]
key_name = var.key_name
iam_instance_profile = aws_iam_instance_profile.nginx_profile.name
connection {
type = "ssh"
host = self.public_ip
user = "ec2-user"
private_key = file(var.private_key_path)
}
provisioner "file" {
content = <<EOF
access_key =
secret_key =
security_token =
use_https = True
bucket_location = US
EOF
destination = "/home/ec2-user/.s3cfg"
}
provisioner "file" {
content = <<EOF
/var/log/nginx/*log {
daily
rotate 10
missingok
compress
sharedscripts
postrotate
endscript
lastaction
INSTANCE_ID=`curl --silent http://169.254.169.254/latest/meta-data/instance-id`
sudo /usr/local/bin/s3cmd sync --config=/home/ec2-user/.s3cfg /var/log/nginx/ s3://${aws_s3_bucket.web_bucket.id}/nginx/$INSTANCE_ID/
endscript
}
EOF
destination = "/home/ec2-user/nginx"
}
provisioner "remote-exec" {
inline = [
"sudo yum install nginx -y",
"sudo service nginx start",
"sudo cp /home/ec2-user/.s3cfg /root/.s3cfg",
"sudo cp /home/ec2-user/nginx /etc/logrotate.d/nginx",
"sudo pip install s3cmd",
"s3cmd get s3://${aws_s3_bucket.web_bucket.id}/website/index.html .",
"s3cmd get s3://${aws_s3_bucket.web_bucket.id}/website/arun.png .",
"sudo rm /usr/share/nginx/html/index.html",
"sudo cp /home/ec2-user/index.html /usr/share/nginx/html/index.html",
"sudo cp /home/ec2-user/arun.png /usr/share/nginx/html/arun.png",
"sudo logrotate -f /etc/logrotate.conf"
]
}
tags = merge(local.common_tags, { Name = "${var.environment_tag}-nginx2" })
}
# S3 BUCKET CONFIG #
resource "aws_iam_role" "allow_nginx_s3" {
name = "allow_nginx_s3"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
}
EOF
}
resource "aws_iam_instance_profile" "nginx_profile" {
name = "nginx_profile"
role = aws_iam_role.allow_nginx_s3.name
}
resource "aws_iam_role_policy" "allow_s3_all" {
name = "allow_s3_all"
role = aws_iam_role.allow_nginx_s3.name
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:*"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::${local.s3_bucket_name}",
"arn:aws:s3:::${local.s3_bucket_name}/*"
]
}
]
}
EOF
}
resource "aws_s3_bucket" "web_bucket" {
bucket = local.s3_bucket_name
acl = "private"
force_destroy = true
tags = merge(local.common_tags, { Name = "${var.environment_tag}-web-bucket" })
}
resource "aws_s3_bucket_object" "website" {
bucket = aws_s3_bucket.web_bucket.bucket
key = "/website/index.html"
source = "./index.html"
}
resource "aws_s3_bucket_object" "graphic" {
bucket = aws_s3_bucket.web_bucket.bucket
key = "/website/arun.png"
source = "./arun.png"
}
######################################################################
# OUTPUT
######################################################################
output "aws_alb_public_dns" {
value = aws_lb.alb.dns_name
}