Step-by-Step Guide to Deploying Machine Learning Models with FastAPI and Docker
You’ve trained your machine learning model, and it’s performing great on test data. But here’s the truth: a model sitting in a Jupyter notebook isn’t helping anyone. It’s only when you deploy it to production that real users can benefit from your work.
In this article, we’re building a diabetes progression predictor on a sample dataset from scikit-learn. We’ll take it from raw data all the way to a containerized API that’s ready for the cloud.
By coding along to this tutorial, you’ll have:
- A trained Random Forest model that predicts diabetes progression scores
- A REST API, built using FastAPI, that accepts patient data and returns predictions
- A fully containerized application ready for deployment
Let’s get started.
Setting Up Your Development Environment
Before we start coding, let’s get your dev environment ready. You’ll need:
- Python 3.11+ (3.10 works too; the NumPy 2.2.6 pin in requirements.txt needs Python 3.10 or newer)
- Docker installed and running
- Basic familiarity with Python and APIs (I’ll explain the non-trivial parts)
Project Structure
Here’s how we’ll organize everything in the project directory:
    diabetes-predictor/
    │
    ├── app/
    │   ├── __init__.py
    │   └── main.py              # FastAPI application
    │
    ├── models/
    │   └── diabetes_model.pkl   # Trained model (we'll create this)
    │
    ├── train_model.py           # Model training script
    ├── requirements.txt         # Python dependencies
    └── Dockerfile               # Container configuration
Installing Dependencies
Let’s create a clean virtual environment:
    $ python -m venv diabetes-env
    $ source diabetes-env/bin/activate  # Windows: diabetes-env\Scripts\activate
Now install the required libraries:
    $ pip install scikit-learn pandas fastapi uvicorn
Building a Machine Learning Model for Predicting Diabetes Progression
Let’s start by creating our machine learning model. Create train_model.py:
    # train_model.py
    from sklearn.datasets import load_diabetes
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error, r2_score
    import pickle
    import os
We’ve chosen Random Forest because it’s robust, is insensitive to differences in feature scale, and gives us feature importance insights.
Let’s load and explore our diabetes dataset:
    # Load the diabetes dataset
    diabetes = load_diabetes()
    X, y = diabetes.data, diabetes.target

    print(f"Dataset shape: {X.shape}")
    print(f"Features: {diabetes.feature_names}")
    print(f"Target range: {y.min():.1f} to {y.max():.1f}")
The diabetes dataset is a collection of 442 patient records with 10 physiological features. The target is a quantitative measure of disease progression one year after baseline: higher numbers indicate more advanced progression.
Output:
    Dataset shape: (442, 10)
    Features: ['age', 'sex', 'bmi', 'bp', 's1', 's2', 's3', 's4', 's5', 's6']
    Target range: 25.0 to 346.0
Now let’s prepare our data:
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    print(f"Training samples: {X_train.shape[0]}")
    print(f"Test samples: {X_test.shape[0]}")
The 80/20 split gives us enough training data while reserving a solid test set. Using random_state=42 ensures reproducible results.
Output:
    Training samples: 353
    Test samples: 89
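The sample counts above follow directly from the dataset size. As a small worked check, assuming the test share is rounded up and the remainder goes to training (an assumption that matches the output shown), the arithmetic looks like this:

```python
import math

n_samples = 442   # rows in the diabetes dataset
test_size = 0.2   # fraction passed to train_test_split

# Assumption: the test share is rounded up; everything else is training data.
n_test = math.ceil(n_samples * test_size)
n_train = n_samples - n_test

print(f"Training samples: {n_train}")  # 353
print(f"Test samples: {n_test}")       # 89
```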
Time to train our model:
    # Train Random Forest model
    model = RandomForestRegressor(
        n_estimators=100,
        random_state=42,
        max_depth=10
    )

    model.fit(X_train, y_train)
We’ve set max_depth=10 to prevent overfitting on this relatively small dataset. With 100 trees, we get good performance without excessive computation time.
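One reason given earlier for choosing Random Forest is its feature importance insights. Here is a minimal, self-contained sketch of how they could be inspected after fitting, using the same settings as the training script. The `ranked` variable and the print formatting are illustrative choices, not part of the original script, and the values are scikit-learn's impurity-based importances, which are only a rough guide.

```python
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

diabetes = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, test_size=0.2, random_state=42
)

model = RandomForestRegressor(n_estimators=100, random_state=42, max_depth=10)
model.fit(X_train, y_train)

# Pair each feature name with its importance and list the largest first.
ranked = sorted(
    zip(diabetes.feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name:>4}: {importance:.3f}")
```

The importances sum to 1, so each value can be read as a feature's share of the total impurity reduction across the forest.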
Let’s evaluate our model:
    # Make predictions and evaluate
    y_pred = model.predict(X_test)

    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    print(f"Mean Squared Error: {mse:.2f}")
    print(f"R² Score: {r2:.3f}")
The R² score tells us what proportion of the variance in disease progression our model explains: 1.0 is a perfect fit, and 0 is no better than always predicting the mean. Anything above 0.4 is pretty good for this dataset!
Output:
    Mean Squared Error: 2974.20
    R² Score: 0.439
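To make the two metrics less abstract, here is a small sketch that computes them by hand in plain Python. The toy values are made up for illustration and are not from the diabetes dataset; the helper names `mse` and `r2` are likewise just for this example. The last line converts the MSE reported above into a root mean squared error, which is in the same units as the target.

```python
import math

def mse(y_true, y_pred):
    """Mean of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy values chosen for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

print(f"MSE: {mse(y_true, y_pred):.3f}")  # 0.375
print(f"R²: {r2(y_true, y_pred):.3f}")    # 0.949

# RMSE for the model above, derived from the reported MSE of 2974.20.
print(f"RMSE: {math.sqrt(2974.20):.1f}")  # 54.5
```

An RMSE of roughly 54.5 on a target that ranges from 25 to 346 is a helpful sanity check on what an R² of 0.439 means in practice.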
Finally, let’s save our trained model:
    # Create models directory and save model
    os.makedirs('models', exist_ok=True)

    with open('models/diabetes_model.pkl', 'wb') as f:
        pickle.dump(model, f)

    print("Model trained and saved successfully!")
Run this script to train your model:
    $ python train_model.py
You should see output showing your model’s performance and confirmation that it’s been saved.
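Before building an API around the saved file, it can be worth confirming that a pickled model really does come back unchanged. The sketch below is a stand-alone check under a few assumptions: it trains a deliberately small forest, writes it to a temporary directory rather than to models/, and compares predictions before and after the round trip.

```python
import os
import pickle
import tempfile

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True)

# A deliberately small forest so the check runs quickly.
model = RandomForestRegressor(n_estimators=10, random_state=42).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "diabetes_model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly.
same = (model.predict(X[:5]) == restored.predict(X[:5])).all()
print(f"Round trip preserved predictions: {same}")
```

Two caveats apply to pickle in general: only load files you trust, and load them with the same scikit-learn version that wrote them.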
Creating the FastAPI Application
Now for the exciting part: turning our model into a web API.
If you haven’t already, create the app directory and an empty __init__.py file:
    $ mkdir app
    $ touch app/__init__.py
Now create app/main.py with our API code:
    # app/main.py
    from fastapi import FastAPI
    from pydantic import BaseModel
    import pickle
    import numpy as np
    import os
FastAPI uses Pydantic for request validation, meaning it automatically validates incoming data and returns clear error messages if something’s wrong.
Let’s define our input data structure:
    # Define input data schema
    class PatientData(BaseModel):
        age: float
        sex: float
        bmi: float
        bp: float  # blood pressure
        s1: float  # serum measurement 1
        s2: float  # serum measurement 2
        s3: float  # serum measurement 3
        s4: float  # serum measurement 4
        s5: float  # serum measurement 5
        s6: float  # serum measurement 6

        # Pydantic v2 spelling; on Pydantic v1 this was `class Config: schema_extra`
        model_config = {
            "json_schema_extra": {
                "examples": [
                    {
                        "age": 0.05,
                        "sex": 0.05,
                        "bmi": 0.06,
                        "bp": 0.02,
                        "s1": -0.04,
                        "s2": -0.04,
                        "s3": -0.02,
                        "s4": -0.01,
                        "s5": 0.01,
                        "s6": 0.02
                    }
                ]
            }
        }
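The example values help API users understand the expected input format. Note that the diabetes dataset features are already mean-centered and scaled, which is why the example values are small numbers close to zero rather than raw ages or BMI readings.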
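To see what the validation looks like outside of FastAPI, here is a minimal sketch using Pydantic directly. `PatientSketch` is a hypothetical two-field model used only to keep the example short; it is not the schema from this article.

```python
from pydantic import BaseModel, ValidationError

class PatientSketch(BaseModel):
    # Hypothetical two-field model, used only to keep the example short.
    age: float
    bmi: float

# Well-formed input is accepted and exposed as typed attributes.
valid = PatientSketch(age=0.05, bmi=0.06)
print(valid.age, valid.bmi)

try:
    # "high" cannot be converted to a float, so validation fails.
    PatientSketch(age=0.05, bmi="high")
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc)
```

When the same kind of failure happens inside a FastAPI endpoint, the caller receives a 422 response describing the offending field instead of a Python exception.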
Next, we initialize the FastAPI app and load the trained model when the application starts:
    # Initialize FastAPI app
    app = FastAPI(
        title="Diabetes Progression Predictor",
        description="Predicts diabetes progression score from physiological features",
        version="1.0.0"
    )

    # Load the trained model
    model_path = os.path.join("models", "diabetes_model.pkl")
    with open(model_path, 'rb') as f:
        model = pickle.load(f)
Finally, let’s create our prediction endpoint:
    @app.post("/predict")
    def predict_progression(patient: PatientData):
        """
        Predict diabetes progression score
        """
        # Convert input to numpy array (same feature order as in training)
        features = np.array([[
            patient.age, patient.sex, patient.bmi, patient.bp,
            patient.s1, patient.s2, patient.s3, patient.s4,
            patient.s5, patient.s6
        ]])

        # Make prediction; cast the NumPy float to a plain Python float for JSON
        prediction = float(model.predict(features)[0])

        # Return result with additional context
        return {
            "predicted_progression_score": round(prediction, 2),
            "interpretation": get_interpretation(prediction)
        }

    def get_interpretation(score):
        """Provide human-readable interpretation of the score"""
        if score < 100:
            return "Below average progression"
        elif score < 150:
            return "Average progression"
        else:
            return "Above average progression"
The interpretation function helps make our API more user-friendly by providing context for the numerical predictions.
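Because the interpretation depends on two cut-offs, it helps to check how scores right at the boundaries are labeled. The sketch below repeats the function from the endpoint code so it can run on its own; the sample scores are arbitrary, apart from 213.34, which is the prediction for the example patient.

```python
def get_interpretation(score):
    """Same thresholds as the endpoint code above."""
    if score < 100:
        return "Below average progression"
    elif score < 150:
        return "Average progression"
    else:
        return "Above average progression"

# Probe values just below and exactly on each threshold.
for score in (99.99, 100, 149.99, 150, 213.34):
    print(f"{score:>7}: {get_interpretation(score)}")
```

A score of exactly 100 falls into the "Average" band and exactly 150 into "Above average", since both comparisons use a strict less-than.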
Let’s also add a health check endpoint:
    @app.get("/")
    def health_check():
        return {"status": "healthy", "model": "diabetes_progression_v1"}
Testing the API Locally
Before containerizing, let’s test our API locally. Run the following command from your project’s root directory:
    $ uvicorn app.main:app --reload --port 8000
Open your browser to http://localhost:8000/ and you’ll see the health check response, which confirms the FastAPI app is running. To try a prediction with the example data, use the interactive docs at http://localhost:8000/docs.
You can also test with curl:
    curl -X POST "http://localhost:8000/predict" \
      -H "Content-Type: application/json" \
      -d '{
        "age": 0.05,
        "sex": 0.05,
        "bmi": 0.06,
        "bp": 0.02,
        "s1": -0.04,
        "s2": -0.04,
        "s3": -0.02,
        "s4": -0.01,
        "s5": 0.01,
        "s6": 0.02
      }'
This should give you the following result:
    {"predicted_progression_score":213.34, "interpretation":"Above average progression"}
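If you would rather call the API from Python than from curl, the same request can be sketched with only the standard library. The helper names `build_request` and `predict` are illustrative, and the URL assumes the local server started above; building the request is kept separate from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/predict"  # assumes the local server from above

EXAMPLE_PATIENT = {
    "age": 0.05, "sex": 0.05, "bmi": 0.06, "bp": 0.02,
    "s1": -0.04, "s2": -0.04, "s3": -0.02, "s4": -0.01,
    "s5": 0.01, "s6": 0.02,
}

def build_request(patient, url=API_URL):
    """Build a JSON POST request without sending it."""
    body = json.dumps(patient).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(patient, url=API_URL):
    """Send the request and decode the JSON reply (the server must be running)."""
    with urllib.request.urlopen(build_request(patient, url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

request = build_request(EXAMPLE_PATIENT)
print(request.get_method(), request.full_url)
# With the server running: print(predict(EXAMPLE_PATIENT))
```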
Containerizing with Docker
Now let’s package everything into a Docker container. First, create requirements.txt:
    fastapi==0.115.12
    uvicorn==0.34.2
    scikit-learn==1.6.1
    pandas==2.2.3
    numpy==2.2.6
We’ve pinned specific versions to ensure consistency across environments.
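Pins only help if the environment that trained the model matches them. As a rough sketch, the script below compares a set of pins with what is installed in the current environment; the helper names `parse_pins` and `check_pins` are made up for this example, and the requirements text is embedded as a string so the snippet runs on its own.

```python
from importlib import metadata

REQUIREMENTS = """\
fastapi==0.115.12
uvicorn==0.34.2
scikit-learn==1.6.1
pandas==2.2.3
numpy==2.2.6
"""

def parse_pins(text):
    """Return {package: version} for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins

def check_pins(pins):
    """Compare each pin with the version installed in this environment."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (wanted, installed, installed == wanted)
    return report

pins = parse_pins(REQUIREMENTS)
for name, (wanted, installed, ok) in check_pins(pins).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: pinned {wanted}, installed {installed} -> {status}")
```

A mismatch on scikit-learn is the one to watch most closely, because the pickled model is loaded inside the container with whatever version is pinned there.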
Now create the Dockerfile:
    # Use Python 3.11 slim image
    FROM python:3.11-slim

    # Set working directory
    WORKDIR /app

    # Install system dependencies (if needed)
    RUN apt-get update && apt-get install -y \
        && rm -rf /var/lib/apt/lists/*

    # Copy requirements and install Python dependencies
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt

    # Copy application code
    COPY app/ ./app/
    COPY models/ ./models/

    # Expose port
    EXPOSE 8000

    # Run the application
    CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
The slim image keeps our container small, and --no-cache-dir prevents pip from storing cached packages, further reducing size.
Build your Docker image:
    $ docker build -t diabetes-predictor .
Run the container:
    $ docker run -d -p 8000:8000 diabetes-predictor
Your API is now running in a container! Test it the same way as before.
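A container started with -d can take a moment before the API answers. The sketch below polls the health check endpoint until it reports healthy or a timeout passes. The function name `wait_until_healthy` and its default timings are assumptions for illustration, and it relies on the "status" field returned by the health check defined earlier.

```python
import json
import time
import urllib.request

def wait_until_healthy(url, timeout_s=30.0, interval_s=1.0, request_timeout_s=2.0):
    """Poll `url` until it returns {"status": "healthy"} or time runs out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=request_timeout_s) as resp:
                body = json.loads(resp.read().decode("utf-8"))
            if isinstance(body, dict) and body.get("status") == "healthy":
                return True
        except (OSError, ValueError):
            # Connection refused, HTTP errors, timeouts, or a non-JSON reply:
            # the service is not ready yet, so keep polling.
            pass
        time.sleep(interval_s)
    return False

# With the container running: wait_until_healthy("http://localhost:8000/")
```

Polling like this is mainly useful in scripts or CI jobs that start the container and then immediately run requests against it.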
Publishing to Docker Hub
Now that your containerized API is working locally, let’s share it with the world through Docker Hub. This step matters for cloud deployment: most cloud platforms can pull directly from Docker Hub, making deployment seamless.
Setting Up Docker Hub
First, you’ll need a Docker Hub account if you don’t have one:
- Go to hub.docker.com and sign up
- Choose a username you’re happy with. It’ll be part of your image URLs
Logging Into Docker Hub
From your terminal, log into Docker Hub:
    $ docker login
You’ll be prompted for your Docker Hub username and password. Enter them carefully. This creates an authentication token that lets you push images.
Tagging Your Image
Before pushing, we need to tag our image with your Docker Hub username. Docker uses a specific naming convention:
    $ docker tag diabetes-predictor your-username/diabetes-predictor:v1.0
Replace your-username with your actual Docker Hub username. The v1.0 is a version tag. It’s good practice to version your images so you can track changes and roll back if needed.
Let’s also create a latest tag, which many deployment platforms use by default:
    $ docker tag diabetes-predictor your-username/diabetes-predictor:latest
Check your tagged images:
    $ docker images | grep diabetes-predictor
You should see three entries: your original image and the two newly tagged versions.
Pushing to Docker Hub
Now let’s push your image to Docker Hub:
    $ docker push your-username/diabetes-predictor:v1.0
    $ docker push your-username/diabetes-predictor:latest
The first push might take a few minutes as Docker uploads all the layers. Subsequent pushes should be substantially faster.
You can verify everything works by pulling and running your published image:
    # Stop your local container first
    $ docker stop $(docker ps -q --filter ancestor=diabetes-predictor)

    # Pull and run from Docker Hub
    $ docker run -d -p 8000:8000 your-username/diabetes-predictor:latest
Test the API again to make sure everything still works. If it does, your model is now publicly available and ready for cloud deployment.
Wrapping Up
Congratulations! You’ve just built a complete machine learning deployment pipeline:
- Trained a robust Random Forest model on medical data
- Created a working REST API with FastAPI
- Containerized the application with Docker
Your model is now ready for cloud deployment! You could deploy this to AWS ECS, Fargate, Google Cloud, or Azure.
Want to take it further? You can consider adding the following:
- Authentication and rate limiting
- Model monitoring and logging
- Batch prediction endpoints
You now have all the basics to deploy any machine learning model to production. Happy coding!