Let's translate our understanding of Docker and containerization principles into practice by packaging the machine learning API we've developed. This hands-on exercise assumes you have a functional FastAPI application structured similarly to what we discussed in Chapter 4, including your trained model artifact (e.g., a .joblib or .pkl file), and that you have Docker installed and running on your system.
Before creating the Dockerfile, let's review a typical project layout. Your structure might resemble this:
my_ml_api/
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI app instance, imports routers
│ ├── api/ # Routers define endpoints
│ │ ├── __init__.py
│ │ └── predict.py # Contains the /predict endpoint
│ ├── models/ # Pydantic models for request/response
│ │ └── __init__.py
│ └── core/ # Model loading and inference logic
│ ├── __init__.py
│ └── inference.py
├── models/ # Directory for ML model artifacts
│ └── sentiment_model.joblib # Example model file
├── tests/ # Application tests
│ └── ...
├── Dockerfile # We will create this file
└── requirements.txt # Python dependencies
Here's a diagram illustrating this structure:
Project structure showing application code (app/), model artifacts (models/), tests (tests/), and configuration files (Dockerfile, requirements.txt).
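For context, the model-loading code in app/core/inference.py might look roughly like the sketch below (the function and variable names here are illustrative, not prescriptive). The detail that matters for containerization is that the model path is relative to the process's working directory, so models/sentiment_model.joblib resolves correctly both when you run the app from the project root and inside the container, where we will set the working directory to /app and copy the artifacts to /app/models.

# app/core/inference.py -- illustrative sketch; adapt to your own model
from pathlib import Path

import joblib

# Relative path: resolves to <working directory>/models/sentiment_model.joblib,
# which matches the layout we copy into the Docker image.
MODEL_PATH = Path("models") / "sentiment_model.joblib"

# Load the artifact once at import time so every request reuses it.
model = joblib.load(MODEL_PATH)


def predict_sentiment(text: str) -> str:
    """Run the loaded model (assumed to be a scikit-learn text pipeline) on one input."""
    return str(model.predict([text])[0])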
Ensure your ML model file (e.g., sentiment_model.joblib) is present in the models/ directory at the root level, and your requirements.txt accurately lists all necessary packages (like fastapi, uvicorn, scikit-learn, joblib, pydantic).
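For reference, a minimal requirements.txt for a project like this might look as follows. The version pins are purely illustrative; pin whatever versions you actually trained and tested with:

fastapi==0.110.0
uvicorn[standard]==0.29.0
scikit-learn==1.4.2
joblib==1.4.0
pydantic==2.7.0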
Navigate to the root directory of your project (my_ml_api/) in your terminal. Create a new file named Dockerfile (with no extension) and open it in your text editor.
Add the following content to your Dockerfile, step by step:
# 1. Base Image: Use an official Python runtime as a parent image
# We use a specific version for reproducibility and the 'slim' variant for smaller size.
FROM python:3.9-slim
# 2. Set Working Directory: Define the working directory inside the container
WORKDIR /app
# 3. Copy Dependencies File: Copy the requirements file first to leverage Docker cache
COPY requirements.txt requirements.txt
# 4. Install Dependencies: Install Python dependencies specified in requirements.txt
# --no-cache-dir reduces image size, --upgrade pip ensures we have the latest pip
RUN pip install --no-cache-dir --upgrade pip && \
pip install --no-cache-dir -r requirements.txt
# 5. Copy Application Code and Model: Copy the rest of the application code and the ML model
COPY ./app /app/app
COPY ./models /app/models
# 6. Expose Port: Inform Docker that the container listens on port 8000 at runtime
EXPOSE 8000
# 7. Command to Run: Specify the command to run the application using Uvicorn
# We use 0.0.0.0 to make the server accessible from outside the container.
# The port 8000 matches the EXPOSE instruction.
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Let's break down these instructions:
- FROM python:3.9-slim: Specifies the base image. Using a specific version tag (3.9-slim) ensures consistency, and the slim variant helps keep the image size down.
- WORKDIR /app: Sets the default directory for subsequent commands within the container.
- COPY requirements.txt requirements.txt: Copies only the requirements file first. Docker caches layers; if requirements.txt doesn't change, the RUN pip install layer can be reused from the cache during subsequent builds, speeding things up.
- RUN pip install ...: Executes the installation of dependencies. Using && \ chains commands, and --no-cache-dir prevents pip from storing its cache, reducing the final image size.
- COPY ./app /app/app and COPY ./models /app/models: Copies your application source code (the app directory) and the machine learning models (the models directory) into the specified locations within the container's working directory (/app).
- EXPOSE 8000: Documents that the application inside the container will listen on port 8000. This doesn't actually publish the port; it's more for information and can be used by automated systems.
- CMD ["uvicorn", ...]: Defines the default command to execute when a container starts from this image. It starts the Uvicorn server, telling it to run the FastAPI application instance (app) found in app/main.py, listen on all available network interfaces (0.0.0.0), and use port 8000.

With the Dockerfile created in your project's root directory, you can now build the Docker image. Open your terminal, navigate to the project root (my_ml_api/), and run the following command:
docker build -t my-ml-api:latest .
Let's dissect this command:
- docker build: The command to build an image from a Dockerfile.
- -t my-ml-api:latest: The -t flag tags the image. We're naming it my-ml-api and giving it the tag latest. You can use other tags like version numbers (e.g., my-ml-api:0.1.0). Tagging makes it easier to manage and reference images.
- . (the final dot): This indicates the build context, which is the current directory. Docker sends the files and folders in this directory (respecting .dockerignore if present) to the Docker daemon to use during the build process; an example .dockerignore follows below.

You will see Docker executing each step defined in your Dockerfile. This might take a few minutes the first time, especially during the dependency installation step. Subsequent builds will be faster if dependencies haven't changed, thanks to Docker's layer caching.
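Because the entire build context is sent to the Docker daemon, it is worth adding a .dockerignore file next to the Dockerfile so that bulky or irrelevant files (virtual environments, caches, Git history, tests) never get shipped with the context. The entries below are a reasonable starting point rather than an exhaustive list:

# .dockerignore -- illustrative starting point
.git
.venv
**/__pycache__
**/*.pyc
.pytest_cache
tests/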
Once completed, you can verify the image was created by listing your local Docker images:
docker images
You should see my-ml-api with the tag latest in the list.
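If you have many images locally, you can narrow the listing to just this repository by passing the repository name as an argument:

docker images my-ml-api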
Now that you have the image, you can run it as a container:
docker run -d -p 8000:8000 --name ml-api-container my-ml-api:latest
Explanation of the flags:
- docker run: The command to create and start a container from an image.
- -d: Runs the container in detached mode (in the background) and prints the container ID.
- -p 8000:8000: Publishes the container's port to the host. It maps port 8000 on your host machine to port 8000 inside the container (where Uvicorn is listening, as specified by EXPOSE and CMD). The format is host_port:container_port.
- --name ml-api-container: Assigns a recognizable name to the running container, making it easier to manage (e.g., check logs, stop).
- my-ml-api:latest: Specifies the image to run.

You can check if your container is running using:
docker ps
This command lists all running containers. You should see ml-api-container listed, showing it's up and running, and displaying the port mapping 0.0.0.0:8000->8000/tcp.
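If the container doesn't appear here, it probably exited right after starting (for example, because a dependency or the model file is missing). docker ps only shows running containers; add -a to include stopped ones so you can identify the failed container and inspect its logs:

docker ps -a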
Now, test your API. Open your web browser or use a tool like curl to send a request to your prediction endpoint, which is now accessible via your host machine's port 8000:
# Example using curl, assuming your endpoint is /predict
# and accepts JSON input like {"text": "some input"}
curl -X POST "http://localhost:8000/predict" \
-H "Content-Type: application/json" \
-d '{"text": "FastAPI is great for ML models!"}'
Replace the URL path (/predict) and the data (-d '...') with the specifics of your API endpoint and expected input format. You should receive the prediction response from your model, served by the FastAPI application running inside the Docker container.
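If you prefer to test from Python, a small script using the requests library works just as well. This sketch assumes requests is installed and uses the same illustrative /predict endpoint and {"text": ...} payload; adjust both to match your API:

import requests

# Send a sample request to the containerized API. The endpoint path and
# payload shape are assumptions; change them to match your own application.
response = requests.post(
    "http://localhost:8000/predict",
    json={"text": "FastAPI is great for ML models!"},
    timeout=10,
)
print(response.status_code)
print(response.json())

You can also open http://localhost:8000/docs in a browser; FastAPI serves its interactive Swagger UI there by default, which is convenient for exercising the endpoint manually.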
To view the logs generated by the application inside the container (useful for debugging), use the docker logs command followed by the container name:
docker logs ml-api-container
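To keep streaming new log lines as requests arrive, add the -f (follow) flag:

docker logs -f ml-api-container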
To stop the container:
docker stop ml-api-container
To remove the container (after stopping):
docker rm ml-api-container
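As a shortcut, docker rm -f combines both steps by force-removing the running container:

docker rm -f ml-api-container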
You have successfully containerized your FastAPI machine learning application! You now have a self-contained Docker image that includes your application code, dependencies, and ML model. This image can be run consistently across different environments, forming the foundation for reliable deployment.