Now that you have a basic Dockerfile structure for your FastAPI application, a significant aspect to address is how the machine learning model itself becomes part of the container image. For your API to function independently, it needs access to the trained model artifacts (like serialized model files, weights, or configuration files) within the container's filesystem. Let's look at the common strategies for achieving this.
The most straightforward method, especially during development or for simpler projects, is to include the model file directly within your project structure and copy it into the image during the build process.
Imagine your project layout looks something like this:
.
├── app
│   ├── main.py
│   └── ml_model.py
├── models
│   └── model_v1.joblib    # Your serialized model
├── requirements.txt
└── Dockerfile
In this case, your Dockerfile can use the COPY instruction to place the model file into a designated location within the image:
# Start from a Python base image
FROM python:3.10-slim
# Set the working directory
WORKDIR /app
# Copy requirements first to leverage Docker cache
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY ./app /app/app
# --- Add this section to copy the model ---
# Create a directory for models inside the container
RUN mkdir /app/models
# Copy the model file from your host machine into the image
COPY ./models/model_v1.joblib /app/models/model_v1.joblib
# --- End of model copying section ---
# Command to run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]
Inside your FastAPI application (app/ml_model.py or similar), you would then load the model using the path specified in the Dockerfile (/app/models/model_v1.joblib).
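For example, a minimal loading module might look like the sketch below. The predict helper and the assumption of a scikit-learn style model are illustrative, not part of the example project above:

# app/ml_model.py
import joblib

# Must match the destination path used by the Dockerfile's COPY instruction
MODEL_PATH = "/app/models/model_v1.joblib"

# Load once at import time so every request reuses the same model object
model = joblib.load(MODEL_PATH)

def predict(features):
    # scikit-learn style models expect a 2D array of samples
    return model.predict([features])[0]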
Advantages:
- Simple and self-contained: the image carries everything the API needs, and builds are reproducible without network access to external model storage.
- The model version is fixed at build time alongside the code, which is easy to reason about.
Disadvantages:
- Large model files bloat your source repository (often requiring Git LFS) and slow down image builds and pushes.
- Every model update requires rebuilding and redistributing the entire image.
An alternative approach is to download the model from an external source during the Docker image build process. This source could be cloud storage (like AWS S3, Google Cloud Storage, Azure Blob Storage), a dedicated model registry, or even a simple HTTP URL.
You would use the RUN instruction in your Dockerfile along with tools like curl, wget, or cloud-specific command-line tools.
# Start from a Python base image
FROM python:3.10-slim
# Install curl for the download step (python:3.10-slim does not include it)
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates && rm -rf /var/lib/apt/lists/*
# Set the working directory
WORKDIR /app
# Copy requirements first
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application code
COPY ./app /app/app
# --- Add this section to download the model ---
# Define the model URL (could also use ARG for flexibility)
ARG MODEL_URL="https://your-storage-provider.com/models/model_v1.joblib"
# Create a directory for models
RUN mkdir /app/models
# Download the model using curl (-f fails on HTTP errors instead of saving an error page)
RUN curl -fSL -o /app/models/model_v1.joblib "${MODEL_URL}"
# --- End of model downloading section ---
# Command to run the application
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "80"]
You can pass the MODEL_URL as a build argument when building the image:
docker build --build-arg MODEL_URL="<actual_model_url>" -t my-ml-api .
Advantages:
- Keeps large model artifacts out of your source repository; the model lives in dedicated storage such as an object store or model registry.
- Decouples model updates from code changes: you can point a build at a new model version via the build argument without changing the Dockerfile logic significantly.
Disadvantages:
- The build requires network access to the model source, and possibly credentials, and it is less reproducible if the artifact behind the URL changes.
- You may need extra download tools (curl, awscli, gcloud-sdk, etc.) in the image just for the download step, potentially increasing image size unless you use multi-stage builds to discard these tools later.

Docker volumes allow you to mount a directory from the host machine or a managed Docker volume into the container at runtime. While useful for local development (e.g., quickly testing different models without rebuilding the image), using volumes to provide the primary model in a production deployment scenario is generally not recommended.
It breaks the principle of having a self-contained, immutable image. The container's functionality becomes dependent on an external filesystem being correctly mounted at runtime, which complicates deployment orchestration and reproducibility. If the volume isn't mounted correctly, the application will fail. This approach is better suited for injecting configuration or perhaps runtime data, not core application artifacts like the model itself.
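If you do rely on a mounted model during local development, a defensive startup check makes a missing mount fail immediately with a clear error instead of on the first request. The sketch below uses FastAPI's startup event; the path and error message are illustrative:

# app/main.py (excerpt)
import os

from fastapi import FastAPI

MODEL_PATH = "/app/models/model_v1.joblib"

app = FastAPI()

@app.on_event("startup")
def ensure_model_present():
    # Fail fast if the model file is missing, e.g. because a
    # development volume was not mounted correctly
    if not os.path.exists(MODEL_PATH):
        raise RuntimeError(f"Model file not found at {MODEL_PATH}")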
A few practical considerations apply whichever strategy you choose:
- Include a version in the model filename (e.g., model_v1.2.joblib). Updating the model requires changing the filename in the COPY instruction and replacing the file.
- Use build arguments (ARG) to specify the model version or URL, making it easier to parameterize builds in CI/CD pipelines.
- The model path inside the Dockerfile (e.g., /app/models/) must match the path used by your FastAPI application code to load the model. Using environment variables for the model path within the application can add flexibility, as the sketch after this list shows.
- Keep the image small by starting from a slim base image (e.g., python:3.X-slim).
- Clean up package manager caches during the build (apt-get clean, rm -rf /var/lib/apt/lists/* for Debian/Ubuntu; --no-cache-dir for pip).
- If downloading the model requires credentials, use Docker BuildKit secrets (the --secret flag) to avoid embedding sensitive keys in the image layers.
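As a sketch of that environment-variable flexibility (the MODEL_PATH variable name and its default are assumptions, not a fixed convention):

# app/ml_model.py (excerpt)
import os

import joblib

# Allow the model location to be overridden at runtime, falling back
# to the path baked into the image by the Dockerfile
MODEL_PATH = os.getenv("MODEL_PATH", "/app/models/model_v1.joblib")

model = joblib.load(MODEL_PATH)

You could then point a container at a different artifact with docker run -e MODEL_PATH=... alongside an appropriate mount, without rebuilding the image.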
Choosing the right strategy depends on your project's complexity, team workflow, model size, and deployment environment. For many applications, copying the model directly offers simplicity, while downloading during the build provides better decoupling for more complex or frequently updated models managed in separate storage.