MLOps Basics [Week 5]: Model Packaging – Docker
November 5, 2021
Why do we need packaging? When we share an application with others, it often fails to run on their machine because of dependency or OS-related issues. The usual response (a famous quote among engineers) is:
It works on my laptop/system.
For others to run the application, they have to recreate the environment it ran in on the host, which means a lot of manual configuration and installation of components.
The solution to these limitations is a technology called containers.
By containerizing/packaging the application, we can run it on any cloud platform and take advantage of managed services, autoscaling, reliability, and more.
The most prominent tool for packaging applications is Docker.
In this post, I will be going through the following topics:
Basics of Docker
Building Docker Container
⚡️ FastAPI wrapper
We have built a nice application, but it is of no use if other people cannot access it. Let’s build an API around it.
I will be using FastAPI for this purpose (since it is fast and very easy to get started).
pip install fastapi
pip install uvicorn
Create a file called app.py and import FastAPI as follows:
from fastapi import FastAPI
app = FastAPI(title="MLOps Basics App")
Now we can use this app to create the required routes:
@app.get("/")
async def home():
    return "<h2>This is a sample NLP Project</h2>"
Now let’s run the application using the command:
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
Now open the browser and go to http://localhost:8000/
An advantage of using FastAPI ⚡️ is that it has built-in Swagger documentation. Open http://localhost:8000/docs to see the interactive documentation page.
Now let’s create an API that takes text as input and returns the prediction.
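from inference_onnx import ColaONNXPredictor

# load the model once at startup so every request reuses it
predictor = ColaONNXPredictor("./models/model.onnx")

# the "/predict" route path is assumed here
@app.get("/predict")
async def get_prediction(text: str):
    result = predictor.predict(text)
    return result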
Run the application using the command:
uvicorn app:app --host 0.0.0.0 --port 8000 --reload
The page at http://localhost:8000/docs now lists the new endpoint as well.
You can now get predictions directly by calling this API.
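The API can also be called from a script instead of the browser. The following is a minimal sketch using only the Python standard library. It assumes the server is running locally on port 8000 and that the prediction route is mounted at /predict; the route path and the helper names build_predict_url and get_prediction are illustrative assumptions, not part of the original code.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # where uvicorn is serving the app


def build_predict_url(text, base_url=BASE_URL, route="/predict"):
    """Build the request URL; `route` is an assumed path, adjust it to your app."""
    # urlencode escapes spaces and special characters in the query string
    query = urllib.parse.urlencode({"text": text})
    return f"{base_url}{route}?{query}"


def get_prediction(text, base_url=BASE_URL, timeout=10.0):
    """Send a GET request to the API and return the decoded JSON response."""
    url = build_predict_url(text, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, calling get_prediction("The boy is sitting on a bench") returns whatever the predictor produced, decoded from JSON.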
🛳 Basics of Docker
Docker is a container management tool, which packages the application code, configuration, and dependencies into a portable image that can be shared and run on any platform or system.
There are three main things in Docker:
Docker File contains the list of instructions needed to build and run the application (dependencies to install, code to copy, the command to run, etc.)
Docker Image is a lightweight, standalone, executable package of software (built using dockerfile) that includes everything needed to run an application: code, runtime, system tools, system libraries, and settings.
Docker Container is an instance of Docker Image which contains the running application.
🛳 Building Docker Container
Let’s package the application using docker.
Note: I am doing this only for inference (with a trained model); the same approach can also be used for training.
📝 Docker File
A simple file that consists of instructions to build a Docker Image. Each instruction in a Dockerfile is a command/operation (for example, which operating system to use, which dependencies to install, or how to compile the code), and each instruction creates a layer.
The best part is that all layers are cached: if we modify an instruction in the Dockerfile, the build process rebuilds only the changed layer and the layers that come after it, and reuses the cached layers before it.
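To make the caching behaviour concrete, here is a toy model in Python. It is a simplified sketch, not how Docker is actually implemented: each layer gets a key derived from its instruction and the key of the layer before it, so a change in one instruction invalidates that layer and everything after it. Real Docker also takes the contents of copied files into account for COPY, which this sketch ignores. The function names are hypothetical.

```python
import hashlib


def layer_keys(instructions):
    """Return one cache key per instruction; each key depends on all earlier ones."""
    keys = []
    parent = ""
    for instruction in instructions:
        # chaining the parent key means a change propagates to later layers
        data = (parent + "\n" + instruction).encode("utf-8")
        parent = hashlib.sha256(data).hexdigest()
        keys.append(parent)
    return keys


def layers_to_rebuild(old_instructions, new_instructions):
    """Return the instructions of the new file that cannot be served from cache."""
    cached = set(layer_keys(old_instructions))
    new_keys = layer_keys(new_instructions)
    return [ins for ins, key in zip(new_instructions, new_keys) if key not in cached]


old = [
    "FROM huggingface/transformers-pytorch-cpu:latest",
    "COPY ./ /app",
    "RUN pip install -r requirements_inference.txt",
    'CMD ["uvicorn", "app:app"]',
]
new = list(old)
new[2] = "RUN pip install --no-cache-dir -r requirements_inference.txt"

# only the changed RUN layer and the CMD layer after it need rebuilding
print(layers_to_rebuild(old, new))
```

This is also why slow, rarely changing steps are usually placed early in a Dockerfile and frequently changing ones late.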
A sample Dockerfile will look as below
FROM huggingface/transformers-pytorch-cpu:latest
COPY ./ /app
WORKDIR /app
RUN pip install -r requirements_inference.txt
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
FROM: creates a layer from the base image; here we use the huggingface/transformers-pytorch-cpu:latest Docker image, which is pulled from Docker Hub
WORKDIR: specifies the working directory for the instructions that follow, here /app
COPY: adds files from the host’s current directory (the build context); here we copy the current directory’s files into the container’s /app directory
RUN: specifies what commands to run while building the image; here we run pip to install the dependencies from requirements_inference.txt
EXPOSE: declares the port the application listens on, here 8000 (the port is actually published with the -p flag when the container is started)
CMD: specifies what command to run at the start of the container
🌉 Docker Image
Now let’s build the Docker Image 🌉 using the command:
docker build -t inference:latest .
Once the image is built, verify it using the command:
docker images
You will see output something like this:
REPOSITORY   TAG      IMAGE ID       CREATED       SIZE
inference    latest   d4aa9dad9eaa   1 hours ago   3.12GB
📦 Docker Container
Now let’s create the container from the image created using the command:
docker run -p 8000:8000 --name inference_container inference:latest
This creates a Docker container named inference_container from the Docker image inference:latest and maps port 8000 of the host to port 8000 of the container. You will see output something like:
(week5) ravirajas-MacBook-Pro MLOps-Basics/week_5_docker ‹week5*› » docker run -p 8000:8000 --name inference_container inference:latest
Downloading: 100%|██████████| 382/382 [00:00<00:00, 201kB/s]
Downloading: 100%|██████████| 232k/232k [00:00<00:00, 583kB/s]
INFO: Started server process 
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
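As the log shows, the container downloads some files before the server reports that it is running, so the API is not reachable the instant the container starts. A script that starts the container and then sends requests may want to wait for the port first. The following is a minimal sketch using only the standard library; wait_for_port is a hypothetical helper name, and the defaults assume the 8000:8000 port mapping used above.

```python
import socket
import time


def wait_for_port(host="localhost", port=8000, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # a successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # connection refused or timed out: retry until the deadline passes
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A successful connection only shows that the port is open, not that the model has finished loading, so a real readiness check might additionally request the home route.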
🚢 Docker Compose
Docker Compose is a tool for defining and running multi-container Docker applications. With Compose, you use a YAML file to configure your application’s services. Then, with a single command, you create and start all the services from your configuration.
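Let’s see how to create a docker-compose.yml file. A minimal version, using the service and container names that appear in the output below, looks like:
services:
  prediction_api:
    build: .
    container_name: inference_container
    ports:
      - "8000:8000"
services: the Docker containers that need to be created (in our case only one, prediction_api)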
Now simply run the command:
docker-compose up
You will see output something like:
(base) ravirajas-MacBook-Pro MLOps-Basics/week_5_docker ‹week5*› » docker-compose up
Starting prediction_api ... done
Attaching to inference_container
inference_container | INFO: Started server process 
inference_container | INFO: Waiting for application startup.
inference_container | INFO: Application startup complete.
inference_container | INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Now sharing can be done in two ways:
- Commit the Dockerfile, which can be used to build the image and container
- Push the docker image to a central repository like Docker Hub and pull it from there.
You need to create an account on Docker Hub in order to push the image. Refer to the Docker Hub documentation for more information.
This concludes the post. It covers only the tip of the iceberg of what FastAPI and Docker can do. For more information and other use cases, refer to their documentation.
Complete code for this post can also be found here: Github