Jarred Kenny
Pretty URLs with AWS Cloudfront

I recently migrated a handful of the websites I host from a VPS I've had running on DigitalOcean for years over to Amazon Web Services. The sites I needed to keep hosting consist entirely of static files, so I opted to store them in S3 and serve them out to the world using CloudFront.

Amazon CloudFront is a content delivery network offered by Amazon Web Services. Content delivery networks provide a globally-distributed network of proxy servers which cache content, such as web videos or other bulky media, more locally to consumers, thus improving access speed for downloading the content.

A content delivery network such as CloudFront is great for serving static files quickly to viewers around the world, and the pay-per-request cost model is ideal for the sites I host, which do not receive massive amounts of traffic.

The Problem

I have used CloudFront extensively to serve content, but always as a component of a larger application. I had never tried to use CloudFront to host a proper website as I would with nginx or Apache. I quickly noticed that some of the features I've come to take for granted in modern web servers were missing, most importantly what is known as "Pretty URLs". If you have any level of experience with WordPress you've no doubt come across the term.

Put simply, "Pretty URLs" allow a default document to be served from a path when no file is specified. For example, if a user navigated to https://website.com/my-awesome-article, the web server would actually serve https://website.com/my-awesome-article/index.html behind the scenes. If you are running Apache or nginx, this behavior is usually achieved by adding a rewrite rule of some kind to the configuration responsible for serving your site. I was shocked to discover that, despite CloudFront's in-depth set of options and customization, it was not possible to achieve the same behavior by simply tweaking some settings on the CloudFront distribution.
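To make the mapping concrete, here is a minimal sketch of the rule a web server with a default document applies. The helper name resolveDefaultDocument and the choice of index.html as the default are assumptions for illustration, not part of any server's API.

```javascript
// Hypothetical helper: maps a requested path to the file that a web server
// with a default document (assumed here to be index.html) would serve.
function resolveDefaultDocument(path, defaultDocument = 'index.html') {
  // A last segment containing a dot is treated as a file and left alone.
  const lastSegment = path.slice(path.lastIndexOf('/') + 1)
  if (lastSegment.includes('.')) {
    return path
  }
  // Otherwise treat the path as a directory and append the default document.
  return path.endsWith('/') ? path + defaultDocument : path + '/' + defaultDocument
}

console.log(resolveDefaultDocument('/my-awesome-article')) // "/my-awesome-article/index.html"
console.log(resolveDefaultDocument('/css/site.css')) // "/css/site.css"
```

The dot check is a simple heuristic; a real server would consult the filesystem instead.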

How do we do it with CloudFront?

After some research, I discovered this can be achieved using Lambda@Edge functions. If you are new to AWS or have not yet discovered the power of Lambda, it is Amazon's Functions as a Service (FaaS) offering. It allows you to deploy functions rather than servers, and those functions are invoked when needed to perform tasks or serve an application or backend.

One feature of Lambda is Lambda@Edge, which allows you to run Lambda functions directly on CloudFront edge servers when requests are made. These functions can modify or reject a request as it is made, or kick off some additional task or automation when a request occurs. This is an incredibly powerful tool, as it allows you to programmatically control how requests are handled (Lambda@Edge supports the Node.js and Python runtimes) while also providing the benefits of a globally distributed CDN.
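As a rough sketch of what "modify or reject" looks like in practice, the handler below either returns a generated response or passes an altered request back to CloudFront. The /private/ prefix and the x-example header are hypothetical, chosen only to illustrate the two paths.

```javascript
// Sketch of a Lambda@Edge request handler. The '/private/' prefix and the
// 'x-example' header are hypothetical and exist only for illustration.
const handler = (event, context, callback) => {
  const { request } = event.Records[0].cf

  // Reject: returning a response object instead of the request
  // answers the viewer directly from the edge.
  if (request.uri.startsWith('/private/')) {
    callback(null, {
      status: '403',
      statusDescription: 'Forbidden',
      headers: {
        'content-type': [{ key: 'Content-Type', value: 'text/plain' }],
      },
      body: 'Forbidden',
    })
    return
  }

  // Modify: add a header, then hand the request back to CloudFront.
  request.headers['x-example'] = [{ key: 'X-Example', value: 'edge' }]
  callback(null, request)
}

// In a deployed function this would be exported as the Lambda entry point,
// for example: exports.handler = handler
```

Passing the request object to the callback tells CloudFront to continue processing it; passing a response object ends the request there.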

I set out to write a Lambda function that would modify incoming requests so I could serve my pages without that ugly /index.html or my-post.html at the end of my URLs. I am obviously not the first person who has had to solve this problem, so rather than reinventing the wheel I used lambda-edge-nice-urls, which can be deployed in Lambda and assigned to a CloudFront distribution.

This small function, implemented in JavaScript, does everything I needed!

/* Public domain project by Cloud Under (https://cloudunder.io).
 * Repository: https://github.com/CloudUnder/lambda-edge-nice-urls
 */

const config = {
  suffix: '.html',
  appendToDirs: 'index.html',
  removeTrailingSlash: false,
}

const regexSuffixless = /\/[^/.]+$/ // e.g. "/some/page" but not "/", "/some/" or "/some.jpg"
const regexTrailingSlash = /.+\/$/ // e.g. "/some/" or "/some/page/" but not root "/"

exports.handler = function handler(event, context, callback) {
  const { request } = event.Records[0].cf
  const { uri } = request
  const { suffix, appendToDirs, removeTrailingSlash } = config

  // Append ".html" to origin request
  if (suffix && uri.match(regexSuffixless)) {
    request.uri = uri + suffix
    callback(null, request)
    return
  }

  // Append "index.html" to origin request
  if (appendToDirs && uri.match(regexTrailingSlash)) {
    request.uri = uri + appendToDirs
    callback(null, request)
    return
  }

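  // Redirect (301) non-root requests ending in "/" to URI without trailing slash.
  // Note: only reached when appendToDirs is falsy, because the branch above returns first.
  if (removeTrailingSlash && uri.match(regexTrailingSlash)) {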
    const response = {
      // body: '',
      // bodyEncoding: 'text',
      headers: {
        location: [
          {
            key: 'Location',
            value: uri.slice(0, -1),
          },
        ],
      },
      status: '301',
      statusDescription: 'Moved Permanently',
    }
    callback(null, response)
    return
  }

  // If nothing matches, return request unchanged
  callback(null, request)
}
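
Before attaching a function like this to a distribution, it can help to exercise it locally. The sketch below is one way to do that, assuming a handler that only reads event.Records[0].cf.request. The names fakeEvent, invoke and standIn are hypothetical, and the fake event is far smaller than a real CloudFront event.

```javascript
// Builds the minimal slice of a CloudFront event that a handler reading
// event.Records[0].cf.request needs. Real events carry many more fields.
function fakeEvent(uri) {
  return { Records: [{ cf: { request: { uri, headers: {} } } }] }
}

// Wraps a callback-style handler in a Promise so results are easy to inspect.
function invoke(handler, uri) {
  return new Promise((resolve, reject) => {
    handler(fakeEvent(uri), {}, (err, result) => (err ? reject(err) : resolve(result)))
  })
}

// Stand-in handler so this sketch runs on its own. Replace it with the
// exported handler from the function above, e.g. require('./index').handler
// (the file name is an assumption).
const standIn = (event, context, callback) => {
  const { request } = event.Records[0].cf
  if (request.uri.endsWith('/')) request.uri += 'index.html'
  callback(null, request)
}

invoke(standIn, '/blog/').then((result) => console.log(result.uri)) // "/blog/index.html"
```

Swapping standIn for the real handler lets you check paths such as /some/page, /some/ and /some.jpg without deploying anything.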

By using this Lambda function to modify incoming requests on my CloudFront distribution, I was able to achieve the Pretty URLs I desired while still leveraging a content delivery network, and without having to keep a web server running 24/7. In fact, if you are reading this post, your request was likely processed by this function.