Serverless Thrift APIs in Python on AWS Lambda

I recently wrote a blog post called Serverless Thrift APIs in Python with AWS Lambda on the Zedge corporate blog, underneath is a reposting of it:

This blog post shows a basic example of a Serverless Thrift API with Python for AWS Lambda and AWS API Gateway.

1. Serverless Computing for Thrift APIs?

Serverless computing – also called Cloud Functions or Functions as a Service (FaaS) – is an interesting type of cloud service due to its simplicity. An interpretation of serverless computing is that you (with relatively low effort):

  1. Deploy only the function needing to do the work
  2. Only pay per request to the function
    1. With the notable exception of other cloud resources used, e.g. storage
  3. Get some security setup automation/support (e.g. SSL and API keys)
  4. Get support for request throttling (e.g. QPS) and quotas (e.g. per month)
  5. Get (reasonably) low latency – when the serverless function is kept warm
  6. Get support for easily setting up caching
  7. Get support for setting up custom domain name
  8. Lower direct (cloud costs) and indirect (management) costs?

These are characteristics that in my mind make Serverless computing an interesting infrastructure to develop and deploy Thrift APIs (or other types of APIs) for.
Perhaps over time even Serverless will be preferred over (more complex) container (Kubernetes/Docker) or virtual machine based (IaaS) or PaaS solutions for APIs?

2. Example Cloud Vendors providing Serverless Services

  1. AWS Lambda in combination with AWS API Gateway
  2. Google Cloud Functions
  3. IBM Bluemix Openwhisk (Apache Openwhisk)
  4. Microsoft Azure Functions

Since Python is a key language in my team, for this initial test I choose the AWS option also since I am most familiar with AWS and the open source tooling for AWS was best wrt Python (runner up was Microsoft Azure Functions).

3. Thrift (over HTTPS) on AWS Lambda and API Gateway with Python

This shows an example of the (classic) Apache Thrift tutorial Calculator API running on AWS Lambda and API Gateway, the service requires 2 thrift files:

  1. tutorial.thrift
  2. shared.thrift
3.1 Development Environment and Tools

The tool used for deployment in this blog post is Zappa, I recommend using Zappa together with Docker for Python 3.6 as described in this blog post, with a slight change of the Dockerfile if you want to build and compile Apache thrift Python library yourself, here is the altered Dockerfile. There hasn’t been official releases of Apache Thrift since 0.10.0 January 6th 2017, and there has been important improvement related to its Python support since last release – in particular the fix for supporting recursive thrift structs in Python

a. Dockerfile – for creating a Zappashell (same as Lambda runtime ) and builds Thrift

After building this Dockerfile (see command on top of file) and adding zappashell to your .bash_profile like this (source: the above mentioned blog post)

You can start your serverless deployment environment with the command zappashell (inside an new empty directory on your host platform e.g. a mac), this gives something like this – with an empty directory.

Install virtualenv and create/activate an environment(and assuming you installed thrift as shown in Dockerfile above)

Use thrift to generate python code for tutorial.thrift and shared.thrift

Convert the gen-py package into a python library (for convenient packaging) with a setup.py file as below (change version according to your wants)

setup.py

Copy the generated thrift library – note: thrift itself not the tutorial code – (ref thriftSomeVersion.tar.gz generated by python setup.by sdist in Dockerfile) to the same directory and add it to requirements.txt

requirements.txt should look something like this:

Run pip install -r requirements.txt

Create app.py that has code for calculator thrift

Create .aws directory with files:

credentials

config

Run zappa init and answers questions, it should look something like the image below:

you should now be able to deploy the API with

You can test the deployed API with the following client, remember to change the https address to the address that the deploy gave you

But wait, something is missing, this API is reachable by anyone. Let us add an API key (and update the client with the x-api-key). This can be done through AWS Console (and perhaps with Zappa itself through automation soon?) with the following steps:

Go to Amazon API Gateway Console and click on the generated API (perhaps named task-beta due to the Docker file path and the selected stage during zappa init)

Create a Usage Plan and associate it with the API (e.g. task-beta), then create an API Key (on the left side menu) and attach the API Key to the Usage Plan

Do a zappa update dev and and uncomment/update the transport.setCustomHeaders with x-api-key in the python client above to get authentication and throttling in place.

4. Conclusion

Have shown an example of getting thrift API running on Serverless that can relatively easily be automated, and when the API is initially created it is very little effort to update it (e.g. through continuous deployment).

A final note on roundtrip time performance, based on a few rough tests it looks like the roundtrip time for calls to API is around 300-400 milliseconds (with the test client based in Trondheim, Norway and accessing API Gateway in AWS and AWS Lambda in Germany), which is quite good. Believe that with an AWS Route53 Routing Policy one could have automatic selection of the closest AWS API Gateway/Lambda to get the lowest latency (note that one of the selections in zappa init was to deploy globally, but default was one availability zone).

Believe personally that Serverless computing has a strong future ahead wrt API development, and look forward to what cloud vendors software engineers/product managers add of new features, my wish list is:

  1. Strong Python support
  2. Built-in Thrift support and service discovery, as well as support for other RPC systems, e.g. gRPC, messagepack,++
  3. Improved software tooling for automation (e.g. simplified SSL/domain name setup/maintenance handling – get deep partnerships with letsencrypt.com for SSL certificates?)
  4. Increase caching support at various levels

Best regards,

Amund Tveit

Continue Reading