Use AWS Lambda to build a Serverless REST API, storing data in S3 and querying it with Athena.
We’re going to build a Serverless REST API and deploy it on AWS without setting up any server!
Why use serverless?
The great thing about going serverless is being able to deploy your code instantly without having to set up a server.
But after a while, you quickly realize that having stateless bits of code in the cloud has its limits. For example, how do you persist your data?
In this article, we’ll build a REST API using AWS Lambda (Python 3.6) that stores data on an S3 bucket and then queries it using AWS Athena.
We’ll create the following API:
- POST /user: Create a user
- GET /user/{user_id}: Fetch the data matching the user_id
- PUT /user/{user_id}: Update the data matching the user_id
- DELETE /user/{user_id}: Delete the data matching the user_id
- GET /user/list: Return a list of all the users
TL;DR: To jump to the full working example, you can go there.
We will take the following steps:
- Install the necessary toolkits and create a serverless project
- Write the serverless deployment configuration
- Define a helper class S3Model to write and read data from the S3 Bucket created in the config file
- Define a helper class S3ApiRaw to generate the API handlers
- Create the user.py module with the User schema and generate the API handlers
- Configure AWS Athena to be able to query our data using SQL statements
Let’s get started!
Step 0: The requirements
- Install the Serverless toolkit
- Configure your AWS credentials (you can find a tutorial here)
- Make sure you have Python 3.6 installed (installation instructions can be found here)
- You’re now ready to create your project:
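For example, with the Serverless CLI (the service name below is just a placeholder):

```bash
serverless create --template aws-python3 --path serverless-user-api
cd serverless-user-api
```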
The serverless-python-requirements plugin is used to ship the Python dependencies with the Lambda functions; to do this, all you need is to create a requirements.txt file.
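The plugin can be installed from the project folder with the standard Serverless command:

```bash
serverless plugin install -n serverless-python-requirements
```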
Step 1: Writing the deployment config
By creating a serverless project, a serverless.yml file was created. This config file will contain multiple key aspects of your project:
- Which Lambda function executes which piece of code (a function called a handler)
- Which route / parameters / HTTP method will trigger which Lambda function
- What other AWS resources you want to create (one S3 bucket in our case)
- The permissions the Lambda functions need to interact with other AWS resources
In our case we will need to specify the following:
- One S3 Bucket to store our data
- Five Lambda functions: user_get, user_post, user_put, user_delete, user_list
- The Lambda functions need to be able to read from and write to the S3 bucket
Let’s take a look at what our serverless.yml will look like:
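Here is a minimal sketch of such a configuration; the service and bucket names are placeholders, and the IAM statements are kept to the strict minimum:

```yaml
service: serverless-user-api

plugins:
  - serverless-python-requirements

custom:
  bucketName: serverless-user-api-bucket  # placeholder, must be globally unique

provider:
  name: aws
  runtime: python3.6
  environment:
    S3_BUCKET: ${self:custom.bucketName}
  iamRoleStatements:
    # Allow the Lambdas to read and write on the data bucket
    - Effect: Allow
      Action:
        - s3:GetObject
        - s3:PutObject
        - s3:DeleteObject
        - s3:ListBucket
      Resource:
        - arn:aws:s3:::${self:custom.bucketName}
        - arn:aws:s3:::${self:custom.bucketName}/*

functions:
  user_post:
    handler: user.post
    events:
      - http:
          path: user
          method: post
  user_get:
    handler: user.get
    events:
      - http:
          path: user/{user_id}
          method: get
  user_put:
    handler: user.put
    events:
      - http:
          path: user/{user_id}
          method: put
  user_delete:
    handler: user.delete
    events:
      - http:
          path: user/{user_id}
          method: delete
  user_list:
    handler: user.list
    events:
      - http:
          path: user/list
          method: get

resources:
  Resources:
    UserBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${self:custom.bucketName}
```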
Given this configuration file, we now have to provide a Python module user.py with 5 functions: get, post, put, delete and list.
Step 2: The S3Model class
An S3Model class is characterized by two class attributes:
- a SCHEMA that will define the fields and field types of our object
- a name that will specify an S3 folder in which the data will be stored
The class will have 5 methods (sketched after this list):
- load(id): to get data associated with an id from the bucket
- save(object): to save data on the bucket
- delete(id): to delete data associated with an id from the bucket
- list_ids(): list all the ids on the bucket
- validate(obj_data): ensure that the data we save to the bucket complies with the JSON schema
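Here is a minimal sketch of the class, assuming the jsonschema package for validation, an S3_BUCKET environment variable set in serverless.yml, and a save(obj_id, obj_data) signature (the exact signatures may differ in the full project):

```python
import json
import os

import boto3
from jsonschema import validate  # assumption: jsonschema performs the validation

s3 = boto3.resource("s3")
BUCKET = os.environ["S3_BUCKET"]  # set in serverless.yml


class S3Model:
    """Stores JSON objects as <name>/<id>.json files on the bucket."""

    SCHEMA = {}   # JSON schema, overridden by subclasses
    name = None   # S3 "folder" prefix, overridden by subclasses

    @classmethod
    def _key(cls, obj_id):
        return f"{cls.name}/{obj_id}.json"

    @classmethod
    def validate(cls, obj_data):
        # Raises a jsonschema.ValidationError if the data does not match SCHEMA
        validate(obj_data, cls.SCHEMA)

    @classmethod
    def load(cls, obj_id):
        obj = s3.Object(BUCKET, cls._key(obj_id)).get()
        return json.loads(obj["Body"].read())

    @classmethod
    def save(cls, obj_id, obj_data):
        cls.validate(obj_data)
        s3.Object(BUCKET, cls._key(obj_id)).put(Body=json.dumps(obj_data))

    @classmethod
    def delete(cls, obj_id):
        s3.Object(BUCKET, cls._key(obj_id)).delete()

    @classmethod
    def list_ids(cls):
        prefix = f"{cls.name}/"
        return [
            obj.key[len(prefix):-len(".json")]
            for obj in s3.Bucket(BUCKET).objects.filter(Prefix=prefix)
        ]
```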
Now that we can easily interact with some storage, the API handlers are pretty straightforward for a given S3Model.
Step 3: The S3ApiRaw class
An S3ApiRaw class has one class attribute:
- An S3Model class, that will manage the interaction with the bucket
This class will have 6 methods:
- get, put, post, delete, all: one for each route
- get_api_methods: returns the 5 methods above
We also define a decorator, handle_api_error, that formats the return values for API Gateway (see the sketch below).
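A sketch of the decorator and the class, assuming the handlers follow the usual Lambda (event, context) signature and that ids are generated server-side with uuid (both assumptions):

```python
import functools
import json
import uuid


def handle_api_error(func):
    """Format uncaught errors as API Gateway responses."""
    @functools.wraps(func)
    def wrapper(event, context):
        try:
            return func(event, context)
        except Exception as error:  # a real API would catch narrower exceptions
            return {"statusCode": 400, "body": json.dumps({"error": str(error)})}
    return wrapper


class S3ApiRaw:
    s3_model_class = None  # an S3Model subclass, set by subclasses

    @classmethod
    def get(cls, event, context):
        obj_id = event["pathParameters"]["user_id"]
        return {"statusCode": 200, "body": json.dumps(cls.s3_model_class.load(obj_id))}

    @classmethod
    def post(cls, event, context):
        obj_id = str(uuid.uuid4())  # assumption: ids are generated server-side
        cls.s3_model_class.save(obj_id, json.loads(event["body"]))
        return {"statusCode": 201, "body": json.dumps({"id": obj_id})}

    @classmethod
    def put(cls, event, context):
        obj_id = event["pathParameters"]["user_id"]
        cls.s3_model_class.save(obj_id, json.loads(event["body"]))
        return {"statusCode": 200, "body": json.dumps({"id": obj_id})}

    @classmethod
    def delete(cls, event, context):
        cls.s3_model_class.delete(event["pathParameters"]["user_id"])
        return {"statusCode": 204, "body": ""}

    @classmethod
    def all(cls, event, context):
        return {"statusCode": 200, "body": json.dumps(cls.s3_model_class.list_ids())}

    @classmethod
    def get_api_methods(cls):
        # The five wrapped handlers, in the order get, post, put, delete, all
        return tuple(
            handle_api_error(method)
            for method in (cls.get, cls.post, cls.put, cls.delete, cls.all)
        )
```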
Step 4: Define what a user looks like and get the API handlers
We define two classes (sketched below):
- User, which inherits from S3Model. By setting name to “user” and providing a SCHEMA, we ensure that all the files in the “user” folder on the bucket will be formatted as specified by the schema.
- UserResource, which inherits from S3ApiRaw. By setting the s3_model_class attribute to User, all the handlers defined above become specific to the User model.
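Putting it together, user.py could look like this (the module paths and schema fields are illustrative):

```python
# user.py
from s3_model import S3Model      # hypothetical module names
from s3_api_raw import S3ApiRaw


class User(S3Model):
    name = "user"
    SCHEMA = {
        "type": "object",
        "properties": {
            "first_name": {"type": "string"},
            "last_name": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["first_name", "last_name", "email"],
    }


class UserResource(S3ApiRaw):
    s3_model_class = User


# The five handlers referenced in serverless.yml
get, post, put, delete, list = UserResource.get_api_methods()
```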
Final Step: Deployment
One final thing we need before deploying is to specify the requirements (boto3 is natively available in the Lambda Python runtime):
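Assuming jsonschema is our only external dependency, requirements.txt is a single line:

```
jsonschema
```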
We are all set, let’s deploy:
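```bash
serverless deploy
```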
And that’s it, you now have a fully functional REST API!
Here are some command lines to play with your new API:
(I am using the httpie package; here is the GitHub repository)
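For example (the endpoint URL is printed at the end of serverless deploy; the one below is a placeholder):

```bash
# Create a user (httpie sends key=value pairs as a JSON body)
http POST https://XXXX.execute-api.us-east-1.amazonaws.com/dev/user first_name=John last_name=Doe email=john@doe.com

# Fetch it back, then list all users
http GET https://XXXX.execute-api.us-east-1.amazonaws.com/dev/user/<user_id>
http GET https://XXXX.execute-api.us-east-1.amazonaws.com/dev/user/list
```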
If you want to test some intensive requests, here is a snippet of code that creates 1000 users (using the Python asyncio module):
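Here is one way to write it, assuming aiohttp as the async HTTP client (the endpoint URL is again a placeholder):

```python
import asyncio

import aiohttp  # assumption: aiohttp provides the async HTTP client

API_URL = "https://XXXX.execute-api.us-east-1.amazonaws.com/dev/user"  # placeholder


async def create_user(session, i):
    payload = {
        "first_name": f"user{i}",
        "last_name": "test",
        "email": f"user{i}@example.com",
    }
    async with session.post(API_URL, json=payload) as response:
        return await response.json()


async def main():
    async with aiohttp.ClientSession() as session:
        # Fire all 1000 requests concurrently
        await asyncio.gather(*(create_user(session, i) for i in range(1000)))


# Python 3.6: asyncio.run() does not exist yet
asyncio.get_event_loop().run_until_complete(main())
```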
As you can see our API already works pretty nicely :)
Bonus Step: Improving performance using AWS Athena to query S3 data
As soon as you have a few hundred objects, the /user/list route will time out, because it iteratively downloads every file on the bucket.
This is where AWS Athena comes into play (see AWS Athena’s documentation): it lets you query file data stored on S3 with common SQL SELECT statements.
In order to use Athena, we need to run queries. This is how it is done:
- Launch a query execution
- Wait for the execution to finish
- Once the execution is done, fetch the results
Let’s write some helpers to manage Athena queries:
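A sketch with boto3; the output location for query results is an assumption (Athena requires one on S3):

```python
import time

import boto3

athena = boto3.client("athena")

# Assumption: query results go to a dedicated prefix on our bucket
OUTPUT_LOCATION = "s3://serverless-user-api-bucket/athena-results/"


def execute_query(query, database=None):
    """Launch a query, wait for it to finish, then return the result rows."""
    kwargs = {
        "QueryString": query,
        "ResultConfiguration": {"OutputLocation": OUTPUT_LOCATION},
    }
    if database:
        kwargs["QueryExecutionContext"] = {"Database": database}
    execution_id = athena.start_query_execution(**kwargs)["QueryExecutionId"]

    # Poll until the execution reaches a terminal state
    while True:
        execution = athena.get_query_execution(QueryExecutionId=execution_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(0.2)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query {state}: {query}")

    results = athena.get_query_results(QueryExecutionId=execution_id)
    return results["ResultSet"]["Rows"]
```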
Now that we can execute SQL queries, we can create our Athena table. We’ll write a Lambda function that will execute CREATE statements.
For this, we need to make a few additions to the serverless.yml file:
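Sketched additions (the broad athena and glue permissions are for simplicity; scope them down in production):

```yaml
provider:
  iamRoleStatements:
    # ...statements from step 1, plus:
    - Effect: Allow
      Action:
        - athena:*
        - glue:*   # Athena stores table metadata in the Glue Data Catalog
      Resource: "*"

functions:
  # ...functions from step 1, plus:
  init_athena_schema:
    handler: athena_helpers.init_schema  # hypothetical module name
```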
We now need to write the init_schema function that will:
- Create a database: execute a CREATE DATABASE statement
- Create a table: execute a CREATE EXTERNAL TABLE statement
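A sketch of init_schema reusing execute_query; the database name and the SerDe choice (the OpenX JSON SerDe, which Athena supports for JSON files) are assumptions:

```python
DATABASE = "users_db"  # hypothetical database name


def init_schema(event, context):
    # Step 1: create the database
    execute_query(f"CREATE DATABASE IF NOT EXISTS {DATABASE}")

    # Step 2: create an external table over the JSON files in the user/ folder
    execute_query(f"""
        CREATE EXTERNAL TABLE IF NOT EXISTS {DATABASE}.user (
            first_name string,
            last_name string,
            email string
        )
        ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
        LOCATION 's3://serverless-user-api-bucket/user/'
    """)
    return "Athena schema initialized"
```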
We can now initialize the Athena database with the command:
serverless invoke local -f init_athena_schema
Now that the database is set up, we can modify our user.list function to make a query to the database:
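A sketch of the new handler; the first row returned by Athena holds the column headers, the following rows hold the data:

```python
import json

# assumption: the Athena helpers above live in an importable module
from athena_helpers import DATABASE, execute_query


@handle_api_error
def list(event, context):
    rows = execute_query(f"SELECT * FROM {DATABASE}.user")
    headers = [col["VarCharValue"] for col in rows[0]["Data"]]
    users = [
        dict(zip(headers, (col.get("VarCharValue") for col in row["Data"])))
        for row in rows[1:]
    ]
    return {"statusCode": 200, "body": json.dumps(users)}
```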
And that’s it: you can now query your resources as if they were in a database. You can find the full code here.
If you are looking for data engineering experts, don’t hesitate to contact us!
Thanks to Antoine Toubhans and Alexandre Chaintreuil.