Previously, AWS Lambda deployment packages were limited to an unzipped size of 250 MB, including requirements. This proved to be an obstacle when trying to host machine learning models on the service, as common ML libraries and complex models led to deployment packages far exceeding the 250 MB limit.
However, in December 2020, AWS announced support for packaging and deploying Lambda functions as Docker images. Critically for machine learning, these images can be up to 10 GB in size. This means that large dependencies (e.g., TensorFlow) and medium-sized models can be included in the image, and model predictions can therefore be served from Lambda.
This article walks through the development and deployment of an example model hosted on Lambda. All the necessary code can be found here.
The solution can be divided into three parts:
- Docker image: The Docker image contains our dependencies, the trained model pipeline, and the function code. AWS provides a set of base images on which service-compatible images can be built.
- Lambda function: A serverless resource that runs the function code in the Docker image in response to incoming events/requests.
- API Gateway endpoint: Used as the trigger for the Lambda function and as the entry point for client requests. When a prediction request hits the endpoint, the Lambda function is triggered, with the request body included in the event passed to the function. The value returned by the function is sent back to the client in the response.
In this example, our model is a simple KNN implementation trained on the Iris classification dataset. The training itself is not covered in this post, but the result is a scikit-learn Pipeline object made up of the following steps:
1. StandardScaler: Standardizes the inputs based on the mean and standard deviation of the training samples.
2. KNeighborsClassifier: The actual fitted model, trained with K = 5.
The pipeline is saved using scikit-learn's joblib implementation as "model_pipe.joblib".
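For reference, here is a minimal sketch of how such a pipeline might be trained and saved (the file name model_pipe.joblib matches the one above; the rest is an assumption, not the exact training code from the repo):

```python
from joblib import dump
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load the Iris classification dataset
X, y = load_iris(return_X_y=True)

# Standardize the features, then classify with KNN (K = 5)
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])
pipe.fit(X, y)

# Persist the fitted pipeline; this file is copied into the Docker image later
dump(pipe, "model_pipe.joblib")
```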
Let’s start by looking at the function code that Lambda uses to process prediction request events (app.py).
lambda_handler takes the arguments required of any function invoked by Lambda: event and context. The pipeline object is loaded outside the handler to avoid reloading it on every invocation. Lambda keeps containers alive for a while between events, so loading the model once at startup means it can be reused for as long as Lambda keeps the container alive.
The handler function itself is quite simple: the required inputs are extracted from the request body and used to generate the prediction, which is returned as part of a JSON response. The function's response is passed back to the client through the API Gateway. A few checks ensure that the inputs match expectations, and any prediction errors are caught and logged.
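A minimal sketch of what app.py might look like, given the description above (the "features" input key and the response shape are assumptions, not the exact code from the repo):

```python
# app.py -- sketch of the Lambda handler
import json
import logging

from joblib import load

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Load the pipeline once, outside the handler, so warm containers can reuse it
pipe = load("model_pipe.joblib")


def lambda_handler(event, context):
    try:
        body = json.loads(event["body"])
        features = body["features"]  # hypothetical input key, e.g. [5.1, 3.5, 1.4, 0.2]
        # Basic input validation before predicting
        if not isinstance(features, list) or len(features) != 4:
            return {"statusCode": 400,
                    "body": json.dumps({"error": "expected a list of 4 features"})}
        prediction = int(pipe.predict([features])[0])
        return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
    except Exception:
        # Log the failure so it appears in CloudWatch, then return a 500
        logger.exception("Prediction failed")
        return {"statusCode": 500, "body": json.dumps({"error": "prediction failed"})}
```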
The Dockerfile is structured as follows:
1. Pull the AWS base image for Python 3.6
2. Copy the required files from the local directory to the root of the image
3. Install the requirements
4. Set the handler function as the command to run
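A sketch of such a Dockerfile, assuming the file names used above:

```dockerfile
# Pull the AWS-provided base image for Python 3.6
FROM public.ecr.aws/lambda/python:3.6

# Copy function code, model artifact, and requirements into the image root
COPY app.py model_pipe.joblib requirements.txt ./

# Install the dependencies
RUN pip install -r requirements.txt

# Tell the Lambda runtime which handler to execute
CMD ["app.lambda_handler"]
```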
We use the AWS Serverless Application Model (SAM) CLI to manage the deployment and the AWS resources. Instructions for installing SAM and its dependencies can be found here.
To build and deploy with SAM, a template file must be defined. This is used to configure the required AWS resources and their settings.
The SAM template for this project includes the following:
- General information such as the stack name and the global Lambda timeout configuration.
- The MLPredictionFunction resource: this is the Lambda function we want to deploy, and this section contains most of the required configuration:
- Properties: Here we specify that the function is defined by a Docker image (PackageType: Image) and that the function is triggered via the API Gateway (Type: Api). The path and request method of the API route are also specified here.
- Metadata: Contains the tag used for the built image, as well as the location and name of the Dockerfile used to build it.
- Outputs: Lists all the resources that SAM should create. In this case we need the API Gateway endpoint, the Lambda function, and the associated IAM role.
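Putting that together, a template.yaml along these lines would cover the points above (the route, tag, and timeout values are assumptions, not the exact template from the repo):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31

Globals:
  Function:
    Timeout: 60  # general Lambda timeout configuration

Resources:
  MLPredictionFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image        # function is defined by a Docker image
      Events:
        Predict:
          Type: Api             # triggered via API Gateway
          Properties:
            Path: /predict
            Method: post
    Metadata:
      Dockerfile: Dockerfile    # location/name of the Dockerfile
      DockerContext: .
      DockerTag: ml-deploy      # tag used for the built image

Outputs:
  PredictionApi:
    Description: API Gateway endpoint URL for the prediction function
    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/predict/"
  MLPredictionFunctionArn:
    Description: Lambda function ARN
    Value: !GetAtt MLPredictionFunction.Arn
  MLPredictionFunctionIamRole:
    Description: Implicit IAM role created for the function
    Value: !GetAtt MLPredictionFunctionRole.Arn
```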
Executing the command below builds the application image using the locally defined SAM template:
!sam build
If the build succeeds, the function can be invoked locally with an example event using the following command (see the example event in the repo):
!sam local invoke -e events/event.json
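For reference, an API Gateway proxy event wraps the request body as a JSON string, so a minimal events/event.json might look like the following (the "features" key matches the handler sketch above and is an assumption):

```json
{"body": "{\"features\": [5.1, 3.5, 1.4, 0.2]}"}
```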
Once the function has been tested locally, the image must be pushed to AWS ECR. First, create a new repository:
!aws ecr create-repository --repository-name ml-deploy-sam
Before the image can be pushed, you must log in to the ECR-managed Docker registry:
!aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account id>.dkr.ecr.<region>.amazonaws.com
The application can now be deployed with:
!sam deploy -g
This runs the deployment in guided mode, where you must confirm the application name, the AWS region, and the image repository created earlier. Accepting the defaults for the remaining settings should be fine in most cases.
The deployment process then begins and the AWS resources are created. Once complete, the created resources are listed in the console.
- Image updates: To deploy updated model or function code, simply rebuild the image locally and run the deploy command again. SAM detects which aspects of the application have changed and updates the relevant resources accordingly.
- Cold starts: Each time Lambda spins up a container with our function code, the model is loaded before any requests are processed. This results in a cold-start scenario, where the first request is significantly slower than subsequent ones. One way to combat this is to invoke the function periodically via a scheduled CloudWatch event so that a container is always warm with the model loaded (see the first sketch after this list).
- Multiple functions: It is possible to deploy multiple Lambda functions that serve a single API. This can be useful if you have several models you want to serve, or if you want a separate preprocessing/validation endpoint. Configuring this is simply a matter of including additional function resources in the SAM template, as sketched below.
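The cold-start mitigation mentioned above could be sketched by adding a scheduled event to the function in the SAM template (the event name and interval are assumptions):

```yaml
Resources:
  MLPredictionFunction:
    Type: AWS::Serverless::Function
    Properties:
      # ... existing properties as before ...
      Events:
        WarmUp:
          Type: Schedule              # periodic trigger keeps a container warm
          Properties:
            Schedule: rate(5 minutes)
```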
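And a second function serving the same API could be sketched as an additional resource (the resource name, route, and Dockerfile are assumptions):

```yaml
Resources:
  PreprocessingFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      Events:
        Preprocess:
          Type: Api
          Properties:
            Path: /preprocess   # separate route on the same implicit API
            Method: post
    Metadata:
      Dockerfile: Dockerfile.preprocess
      DockerContext: .
      DockerTag: preprocess
```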