The problem

This blog is a static website hosted in an S3 bucket that acts as the origin for a CloudFront distribution. The article you are reading is a post located under the URI path ‘/posts’. When I first deployed the website and tried to open the posts, which displayed perfectly fine on localhost, I came across the error below:

[Screenshot of the error]

Upon researching this, it became clear that it is a known issue with static websites that have multiple subdirectories, each with its own index file. CloudFront displayed the landing page correctly because I specified a default root object when I set up the distribution. However, the default root object only applies to the root of the website. For any subdirectory, CloudFront makes an S3 GetObject API call against a key that mirrors the URI as requested, not the full path to the index file (e.g. posts/first-post/ instead of posts/first-post/index.html).
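To make the mismatch concrete, here is a minimal sketch of the key lookup described above. It only models the behaviour and is not CloudFront's actual code; the function name `s3_key_for` and the `default_root_object` parameter are made up for illustration.

```python
def s3_key_for(uri: str, default_root_object: str = "index.html") -> str:
    """Model which S3 key ends up being requested for a URI (illustrative only)."""
    # The default root object is applied only to the root of the site.
    if uri == "/":
        return default_root_object
    # Everywhere else the key is simply the URI without its leading slash.
    return uri.lstrip("/")


print(s3_key_for("/"))                   # index.html
print(s3_key_for("/posts/first-post/"))  # posts/first-post/
```

The second key does not exist in the bucket, because the object is actually stored as posts/first-post/index.html.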

The solution

The solution to this issue is Lambda@Edge: a Lambda function that can be associated with a CloudFront cache behavior and can modify the request before it reaches the origin.

A simple Lambda function like the one below solves the problem by appending the index file name to the request URI, which lets CloudFront fetch the right index file from the S3 bucket.

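def lambda_handler(event, context):
    # CloudFront passes in the request it is about to send to the origin.
    request = event['Records'][0]['cf']['request']
    uri = request['uri']

    if uri.endswith('/'):
        # Directory-style URI: /posts/first-post/ -> /posts/first-post/index.html
        request['uri'] += 'index.html'
    elif '.' not in uri.rsplit('/', 1)[-1]:
        # No file extension in the last path segment, so treat it as a directory:
        # /posts/first-post -> /posts/first-post/index.html
        request['uri'] += '/index.html'

    # Returning the request lets CloudFront forward it to the origin.
    return request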
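Before deploying, the handler can be tried locally with a hand-built event. The sketch below repeats the handler logic so that it runs on its own. Two things in it are assumptions rather than part of the real setup: the `make_event` helper is hypothetical and builds only the one field the handler reads (a real origin-request event carries many more), and the dot check looks only at the last path segment so that a dotted directory name is not mistaken for a file.

```python
def lambda_handler(event, context):
    request = event['Records'][0]['cf']['request']
    uri = request['uri']

    if uri.endswith('/'):
        request['uri'] += 'index.html'
    # Assumption: only the last path segment is checked for a file extension.
    elif '.' not in uri.rsplit('/', 1)[-1]:
        request['uri'] += '/index.html'

    return request


def make_event(uri):
    """Build a stripped-down origin-request event containing only the URI."""
    return {'Records': [{'cf': {'request': {'uri': uri}}}]}


for uri in ('/posts/first-post/', '/posts/first-post', '/css/style.css'):
    print(uri, '->', lambda_handler(make_event(uri), None)['uri'])
# /posts/first-post/ -> /posts/first-post/index.html
# /posts/first-post -> /posts/first-post/index.html
# /css/style.css -> /css/style.css
```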

After deploying and publishing the function, edit the behavior of the CloudFront distribution and associate the function with the ‘Origin request’ event:

[Screenshot of the function association in the CloudFront behavior settings]
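If you would rather script this step than click through the console, something along these lines should work with boto3. Treat it as a sketch under assumptions, not the way this site was actually set up: the helper names are hypothetical, the function is attached to the default cache behavior, any existing Lambda associations on that behavior are overwritten, and the distribution ID and published function version ARN are values you have to supply yourself.

```python
import copy


def with_origin_request_lambda(distribution_config, function_version_arn):
    """Return a copy of a DistributionConfig with the function attached to the
    origin-request event of the default cache behavior."""
    config = copy.deepcopy(distribution_config)
    # Note: this replaces any associations already present on the behavior.
    config['DefaultCacheBehavior']['LambdaFunctionAssociations'] = {
        'Quantity': 1,
        'Items': [{
            'LambdaFunctionARN': function_version_arn,
            'EventType': 'origin-request',
            'IncludeBody': False,
        }],
    }
    return config


def associate(distribution_id, function_version_arn):
    """Fetch, patch and push the distribution config (needs AWS credentials)."""
    import boto3  # imported lazily so the helper above works without boto3

    cloudfront = boto3.client('cloudfront')
    current = cloudfront.get_distribution_config(Id=distribution_id)
    cloudfront.update_distribution(
        Id=distribution_id,
        IfMatch=current['ETag'],  # updates must send back the current ETag
        DistributionConfig=with_origin_request_lambda(
            current['DistributionConfig'], function_version_arn),
    )
```

Keeping the dictionary patch separate from the API calls means the change can be inspected or tested without touching a real distribution.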

One thing to keep in mind is the trust policy of the Lambda execution role. It must include the edgelambda.amazonaws.com service as a principal; otherwise you will not be able to associate the function with the CloudFront distribution:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "lambda.amazonaws.com",
                    "edgelambda.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
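A quick way to catch a missing principal before trying the association is to check the policy document itself. The helper below is a hypothetical sketch, not an AWS API: `missing_principals` is a made-up name, and it only inspects Allow statements that grant sts:AssumeRole to service principals.

```python
import json


def missing_principals(trust_policy_json,
                       required=('lambda.amazonaws.com',
                                 'edgelambda.amazonaws.com')):
    """Return the required service principals that cannot assume the role."""
    statements = json.loads(trust_policy_json)['Statement']
    if isinstance(statements, dict):  # a single statement may not be in a list
        statements = [statements]

    allowed = set()
    for statement in statements:
        actions = statement.get('Action', [])
        if isinstance(actions, str):
            actions = [actions]
        if statement.get('Effect') != 'Allow' or 'sts:AssumeRole' not in actions:
            continue
        principal = statement.get('Principal', {})
        if not isinstance(principal, dict):
            continue
        services = principal.get('Service', [])
        # "Service" may be a single string or a list of strings.
        if isinstance(services, str):
            services = [services]
        allowed.update(services)
    return [service for service in required if service not in allowed]


lambda_only = json.dumps({
    'Version': '2012-10-17',
    'Statement': [{
        'Effect': 'Allow',
        'Principal': {'Service': 'lambda.amazonaws.com'},
        'Action': 'sts:AssumeRole',
    }],
})
print(missing_principals(lambda_only))  # ['edgelambda.amazonaws.com']
```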

A potential improvement to CloudFront would be to do what web servers like Apache or Nginx already do: append index.html to the URI path automatically, even if the client does not explicitly request it.