Running Testcafe with Lambda using container images

December 17th, 2020 Written by Ivan Seed

One of the constraints of Lambda used to be that the deployment package had to be provided as a zip archive of no more than 50MB, and the total unzipped size of the function and all of its Lambda Layers could not exceed 250MB. While this is not a major constraint for most workloads, it did restrict others, and it led to workarounds that can be hard to maintain.

At AWS re:Invent 2020, AWS announced support for container images running in Lambda. This lets us package and deploy our functions as Docker container images that can be up to 10GB in size.

What problem do container images solve?

We have run into size limitations with Lambdas when working with some of our clients at Infinity Works, especially with data analysis workloads. Using larger Python libraries like the Snowflake connector and pandas means that you can reach the function size limit very quickly.

It is possible to work around this by bundling and compressing dependencies within the zip package. On initialisation, the Lambda function can then run code to decompress and output dependencies to the 512MB /tmp directory storage which each Lambda function has write access to. While a means to an end, it is not ideal as it creates an overhead for you to maintain. You can also use EFS by running your Lambda inside a VPC, but it still adds to your list of things to manage and maintain.
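To make the workaround concrete, here is a minimal sketch of the unpack-on-initialisation pattern using only Node.js built-ins. The function name, the target directory and the choice of gzip are my own assumptions for illustration; a real bundle would more likely be a multi-file archive.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');

// Lambda only allows writes under /tmp; os.tmpdir() normally resolves there.
// The "deps" sub-directory is an illustrative choice, not a Lambda convention.
const UNPACK_DIR = path.join(os.tmpdir(), 'deps');

// Decompress a gzip-compressed file shipped inside the zip package into /tmp.
// The work is skipped when a warm execution environment already has the file.
function ensureUnpacked(archivePath, fileName) {
  const target = path.join(UNPACK_DIR, fileName);
  if (!fs.existsSync(target)) {
    fs.mkdirSync(UNPACK_DIR, { recursive: true });
    const compressed = fs.readFileSync(archivePath);
    fs.writeFileSync(target, zlib.gunzipSync(compressed));
  }
  return target;
}
```

Calling something like this at module load time means the cost is paid once per cold start, which is exactly the overhead that container images let us avoid.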

However, with the new container support, we’re able to package and deploy our code, binaries, libraries, runtime and all other dependencies as an image up to 10GB which greatly simplifies workflows and opens up new use cases.

AWS has provided several base images based on Amazon Linux. The base images come with the executables and essentials to get started, but we can also provide custom container images to Lambda that implement the AWS Lambda Runtime API.

Being able to provide and define custom images gives us granular control of the operating system and runtime environment. It allows us to reuse hardened base images, make use of other existing central governance requirements such as vulnerability scanning, and use familiar container tooling for development. This does mean the onus is on you to patch and secure the container.

Using Containers to run Testcafe

Let’s see how we can use container images to package and deploy our Lambda function. Testcafe is a hugely popular tool for automating end-to-end web testing. However, running it on Lambda can be challenging due to the size limitations and lack of granular control of the environment.

Running Chrome automation on Lambda is something that has been solved already – see for example https://github.com/alixaxel/chrome-aws-lambda. This makes use of the bundling and compression of dependencies as we touched on earlier, but how can using container images help simplify this problem?

Well, it is pretty easy:

Since we won’t be using one of AWS’s base images, we need to install the runtime interface client for Node.js, which we can get through npm as the aws-lambda-ric package.

Here we have a simple function called handler which will perform the tests defined in /tests/homepage.js which is a demo test I’ve taken from Testcafe’s getting started page. The Dockerfile uses a Node 14 Alpine base image.

It is also worth mentioning the --disable-dev-shm-usage flag passed to Chromium. This ensures that Chrome writes files to /tmp instead of /dev/shm. When running applications in Lambda you have to ensure that any writing is configured to the /tmp filesystem.
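For reference, a handler along these lines can be sketched as follows. This is not the exact code from the image: the ports, the --no-sandbox flag and the injectable factory (there so the function can be exercised without a browser) are my own assumptions, while the calls themselves are TestCafe's programmatic runner API.

```javascript
// Browser alias plus Chromium flags. --disable-dev-shm-usage is the flag
// discussed above; --no-sandbox is an assumption often needed in containers.
const BROWSER = 'chromium:headless --no-sandbox --disable-dev-shm-usage';

// Build a Lambda handler around a TestCafe factory function.
// Injecting the factory is an illustrative choice that keeps this testable.
function buildHandler(createTestCafe) {
  return async function handler() {
    // Hostname and ports are arbitrary values chosen for this sketch.
    const testcafe = await createTestCafe('localhost', 1337, 1338);
    try {
      // run() resolves with the number of failed tests.
      const failedCount = await testcafe
        .createRunner()
        .src(['/tests/homepage.js'])
        .browsers([BROWSER])
        .run();
      return { failedCount };
    } finally {
      // Always shut down the TestCafe server and browser processes.
      await testcafe.close();
    }
  };
}

// In the image, the real factory is loaded lazily on invocation.
exports.handler = (event) => buildHandler(require('testcafe'))(event);
```

Closing TestCafe in a finally block matters on Lambda, because a warm execution environment is reused and leftover browser processes would eat into memory on the next invocation.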

We can now build this image and push it to ECR:

$ docker build -t lambda/testcafe .
$ docker tag lambda/testcafe {accountID}.dkr.ecr.{region}.amazonaws.com/lambda/testcafe:v0.0.1
$ docker push {accountID}.dkr.ecr.{region}.amazonaws.com/lambda/testcafe:v0.0.1

 

Using the AWS Console, go to the Lambda service (in the same region you have pushed your image to) and create a new function. Select the Container image option and give your function a name.

For the container image URI you can browse your images on ECR, select the lambda/testcafe image you have just pushed, and then create your function.

Selecting an image from an ECR repository

It is also worth pointing out that when you create a function from a container image, the Lambda platform takes anywhere from a few seconds to a few minutes to optimise the image, depending on its size. This happens every time the function image is updated, and the function will be in a PENDING state while it does. Your $LATEST version will keep pointing to the older image and is only switched over once the optimisation has finished.

After the image has been optimised, the function will go into an ACTIVE state where it can be invoked just like a normal Lambda function. If the function has not been invoked for 14 days, it will move into an INACTIVE state and remain that way until it is invoked again. That next invocation will fail, and the function will go back into the PENDING state while the image is optimised again.
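If you deploy from a script, you may want to wait for the function to become active before invoking it. The sketch below polls until that happens. getState is a hypothetical callback that I am assuming wraps Lambda's GetFunctionConfiguration API and returns the State field of the response ("Pending", "Active", "Inactive" or "Failed"); the interval and attempt limits are arbitrary.

```javascript
// Poll a function's state until it is ready to invoke.
// `getState` is a hypothetical async callback returning the State string.
async function waitUntilActive(getState, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const state = await getState();
    if (state === 'Active') return attempt; // number of polls it took
    if (state === 'Failed') throw new Error('Function entered the Failed state');
    if (attempt < maxAttempts) {
      // Sleep before the next poll.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Function not Active after ${maxAttempts} attempts`);
}
```

Treating the Failed state as an immediate error avoids polling until the timeout when the image could not be optimised at all.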

You should bear this in mind if you are looking to use this approach with an event based architecture that may fire events few and far between. To mitigate this, ensure your Lambda function is called at least once every 14 days using something like CloudWatch scheduled events.
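One way to keep such a scheduled invocation cheap is to have the handler return early when it sees a keep-alive event. The sketch below assumes the scheduled rule is configured to send a constant JSON payload of { "keepAlive": true }; that payload shape and the helper names are my own invention, not an AWS convention.

```javascript
// True when the event is the hypothetical keep-alive payload
// sent by the scheduled rule: { "keepAlive": true }.
function isKeepAlive(event) {
  return Boolean(event) && event.keepAlive === true;
}

// Wrap a handler so keep-alive pings return immediately,
// while every other event is passed through untouched.
function withKeepAlive(handler) {
  return async (event, context) => {
    if (isKeepAlive(event)) {
      return { keptAlive: true };
    }
    return handler(event, context);
  };
}
```

Using an explicit payload rather than inspecting the event source keeps the door open for triggering real test runs from a schedule as well.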

Allocate at least 500MB of memory to the Lambda and bump the timeout to at least 20 seconds to ensure the cold start has enough time to fully run. Create a dummy test event and invoke your Lambda function.

Lambda execution result

And like that we have Testcafe running on Lambda using an image that we supplied. We can see the test results listed in the log output showing that “My first test” passed, taking three seconds to execute.

We could take this example further by building on the tests to write synthetic user journeys that pull credentials from Parameter Store and log in to our applications. These could be triggered by a CloudWatch Event Rule, or after each release. We could then use CloudWatch Alarms to publish to SNS and notify us if a core user journey has failed.

Conclusion

Having the ability to deploy functions as container images is not a replacement for ECS or Fargate, especially for long-running workloads. But for those looking to benefit from event-driven architecture, automatic scaling, cost optimisation with millisecond billing, and consistent performance, it can be an alternative when used appropriately.

Overall, I think this is a great addition. It can greatly reduce the complexity of adopting Lambda for enterprise customers who already run container-based workloads and are willing to take ownership of managing and maintaining container images.

For those wanting to learn more it is worth checking out AWS’s official announcement.
