Table of Contents
- Prerequisites
- Setting Up Your Environment
- Creating a Django Project: Foundations
- Integrating Data Science Models into Django
- Building User Interfaces for Data Input/Output
- Adding Data Visualizations
- Advanced Features: Asynchronous Tasks and APIs
- Deployment: Taking Your App Live
- Best Practices
- Case Study: House Price Prediction App
- References
Prerequisites
Before diving in, ensure you have:
- Basic knowledge of Python (syntax, functions, modules).
- Familiarity with Django basics (models, views, templates, URLs).
- Understanding of data science workflows (model training, evaluation).
- Python 3.8+ installed.
- A code editor (VS Code, PyCharm) and terminal.
Setting Up Your Environment
Step 1: Install Python and Virtual Environment
First, set up a virtual environment to isolate dependencies:
# Create a project folder
mkdir django-data-science-app && cd django-data-science-app
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # Linux/macOS
venv\Scripts\activate # Windows
# Verify activation (terminal prompt shows `(venv)`)
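If your shell does not change its prompt, activation can also be checked from Python itself. The sketch below is a small illustrative helper (the function names are hypothetical, not part of Django): it compares sys.prefix with sys.base_prefix, which differ inside a venv, and checks the Python 3.8+ prerequisite.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def python_is_supported(version=None, minimum=(3, 8)) -> bool:
    """Return True when the given (or running) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("Python 3.8+:", python_is_supported())
```

Saving this as a script and running it with the activated interpreter reports both checks.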
Step 2: Install Dependencies
Install Django and key data science libraries:
pip install django pandas scikit-learn joblib matplotlib plotly
- django: Web framework.
- pandas: Data manipulation.
- scikit-learn: Machine learning models.
- joblib: Serialize/deserialize models.
- matplotlib/plotly: Data visualization.
Creating a Django Project: Foundations
Let’s start by creating a Django project and app. We’ll use a modular structure to separate web logic from data science code.
Step 1: Start a Django Project
django-admin startproject core . # Creates a project named `core` in the current directory
Step 2: Create a Data Science App
Django uses “apps” to organize code. We’ll create an app named predictor for our data science logic:
python manage.py startapp predictor
Step 3: Configure Settings
Add predictor to INSTALLED_APPS in core/settings.py:
# core/settings.py
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "predictor",  # Add your app here
]
Step 4: Define URLs
Map URLs to views. First, update core/urls.py to include the predictor app’s URLs:
# core/urls.py
from django.contrib import admin
from django.urls import path, include
urlpatterns = [
    path("admin/", admin.site.urls),
    path("", include("predictor.urls")),  # Route root URLs to `predictor`
]
Create a urls.py file in the predictor app:
# predictor/urls.py
from django.urls import path
from . import views
urlpatterns = [
    path("", views.home, name="home"),  # Home page
    path("predict/", views.predict, name="predict"),  # Prediction endpoint
]
Integrating Data Science Models into Django
The core of your app will be a trained machine learning model. Here’s how to integrate it into Django.
Step 1: Train and Save a Model
First, train a simple model (e.g., a linear regression model for house price prediction) and save it using joblib.
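Create a models directory in predictor to store model-related code. Because startapp already generated predictor/models.py, add an empty __init__.py so that Python treats the directory as a package and resolves predictor.models to it rather than to the file. Then delete predictor/models.py, or move any Django models it contains into the package:
mkdir -p predictor/models
touch predictor/models/__init__.py # Creates an empty file (Linux/macOS)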
Add a script train_model.py to train and save the model:
# predictor/models/train_model.py
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import joblib
# Load dataset (California Housing Prices)
housing = fetch_california_housing()
X = pd.DataFrame(housing.data, columns=housing.feature_names)
y = housing.target # Median house value (in $100k)
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train model
model = LinearRegression()
model.fit(X_train, y_train)
# Save model to disk
joblib.dump(model, "predictor/models/house_price_model.joblib")
print("Model saved!")
Run the script to train and save the model:
python predictor/models/train_model.py
Step 2: Load the Model in Django
To avoid reloading the model on every request (which is inefficient), load it once per process and reuse it. Create a predictor_service.py that loads the model lazily on the first prediction and serves later predictions from the cached copy:
# predictor/models/predictor_service.py
import os

import joblib
import numpy as np
from django.conf import settings

# Path to the saved model
MODEL_PATH = os.path.join(settings.BASE_DIR, "predictor/models/house_price_model.joblib")

# Module-level variable to cache the model (one copy per process)
model = None


def load_model():
    """Load the model on first use and reuse it afterwards."""
    global model
    if model is None:
        model = joblib.load(MODEL_PATH)
    return model


def predict_price(features):
    """Predict house price using the loaded model."""
    model = load_model()
    # Feature order must match the training columns. The model was fitted on a
    # DataFrame, so scikit-learn may warn about missing feature names here.
    features_array = np.array(features).reshape(1, -1)  # Reshape for single sample
    prediction = model.predict(features_array)
    return round(float(prediction[0]) * 100000, 2)  # Convert from $100k to $
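The load-once behaviour can also be expressed with functools.lru_cache instead of a global variable. The sketch below is an alternative, not the approach used in predictor_service.py. To stay runnable without a trained model, it uses the standard library's pickle and a temporary file as stand-ins for joblib and house_price_model.joblib, and the name get_model is hypothetical.

```python
import functools
import pickle
import tempfile
from pathlib import Path

LOAD_COUNT = 0  # Counts disk reads, only to demonstrate the caching


@functools.lru_cache(maxsize=None)
def get_model(path: str):
    """Read the serialized object at `path` once; later calls reuse it."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    with open(path, "rb") as fh:
        return pickle.load(fh)  # joblib.load(path) in the real service


# Stand-in "model": a plain dict written to a temporary file.
_tmp_dir = tempfile.mkdtemp()
DEMO_PATH = str(Path(_tmp_dir) / "demo_model.pkl")
with open(DEMO_PATH, "wb") as fh:
    pickle.dump({"coef": [1.0, 2.0]}, fh)

first = get_model(DEMO_PATH)
second = get_model(DEMO_PATH)  # Served from the cache, no second disk read
```

One caveat of this design: lru_cache keeps its internal state consistent across threads, but two threads that both make the very first call at the same time may each load the file once.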
Building User Interfaces for Data Input/Output
Users need a way to input data (e.g., house features) and view predictions. We’ll use Django forms and templates for this.
Step 1: Create a Form for Input
Django forms simplify data validation. Create forms.py in the predictor app:
# predictor/forms.py
from django import forms


class HousePriceForm(forms.Form):
    # Fields match the California Housing dataset features
    MedInc = forms.FloatField(
        label="Median Income (in $10k)",
        min_value=0,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    HouseAge = forms.FloatField(
        label="Median House Age (years)",
        min_value=0,
        max_value=100,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    AveRooms = forms.FloatField(
        label="Average Rooms per Household",
        min_value=0,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    AveBedrms = forms.FloatField(
        label="Average Bedrooms per Household",
        min_value=0,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    Population = forms.FloatField(
        label="Block Group Population",
        min_value=0,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    AveOccup = forms.FloatField(
        label="Average Occupants per Household",
        min_value=0,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    Latitude = forms.FloatField(
        label="Latitude",
        min_value=32,
        max_value=42,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
    Longitude = forms.FloatField(
        label="Longitude",
        min_value=-125,
        max_value=-114,
        widget=forms.NumberInput(attrs={"class": "form-control"})
    )
Step 2: Create a View to Handle Predictions
Views process requests, interact with the model, and render templates. Update views.py:
# predictor/views.py
from django.shortcuts import render

from .forms import HousePriceForm
from .models.predictor_service import predict_price


def home(request):
    return render(request, "predictor/home.html")


def predict(request):
    if request.method == "POST":
        form = HousePriceForm(request.POST)
        if form.is_valid():
            # Extract cleaned data from the form
            data = form.cleaned_data
            features = [
                data["MedInc"],
                data["HouseAge"],
                data["AveRooms"],
                data["AveBedrms"],
                data["Population"],
                data["AveOccup"],
                data["Latitude"],
                data["Longitude"],
            ]
            price = predict_price(features)  # Get prediction
            return render(request, "predictor/predict.html", {"form": form, "price": price})
    else:
        form = HousePriceForm()  # Empty form for GET request
    # Reached on GET, or on POST with invalid data (form then carries the errors)
    return render(request, "predictor/predict.html", {"form": form})
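The order of the features list matters: it has to match the column order the model was trained on. One way to make that explicit is to keep the order in a single constant and build the list from it. The sketch below assumes a hypothetical helper named features_from_cleaned_data; the feature names are the form fields from forms.py.

```python
FEATURE_ORDER = (
    "MedInc", "HouseAge", "AveRooms", "AveBedrms",
    "Population", "AveOccup", "Latitude", "Longitude",
)


def features_from_cleaned_data(cleaned_data):
    """Return feature values in training order, failing loudly on gaps."""
    missing = [name for name in FEATURE_ORDER if name not in cleaned_data]
    if missing:
        raise KeyError(f"missing features: {', '.join(missing)}")
    return [float(cleaned_data[name]) for name in FEATURE_ORDER]


# Example: keys arrive in arbitrary order, output is always in FEATURE_ORDER.
example = {
    "Longitude": -122.23, "Latitude": 37.88, "MedInc": 8.3, "HouseAge": 41,
    "AveRooms": 6.9, "AveBedrms": 1.0, "Population": 322, "AveOccup": 2.5,
}
features = features_from_cleaned_data(example)
```

In the view, features = features_from_cleaned_data(form.cleaned_data) could then replace the hand-written list.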
Step 3: Create Templates for the UI
Templates define the HTML structure. Create a templates/predictor directory inside the predictor app and add two files:
home.html (Landing Page):
<!-- predictor/templates/predictor/home.html -->
<!DOCTYPE html>
<html>
<head>
<title>House Price Predictor</title>
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
</head>
<body>
<div class="container mt-5">
<h1>Welcome to House Price Predictor</h1>
<p class="lead">Enter house features below to get a price estimate.</p>
<a href="{% url 'predict' %}" class="btn btn-primary">Go to Predictor</a>
</div>
</body>
</html>
predict.html (Prediction Form/Results):
<!-- predictor/templates/predictor/predict.html -->
<!DOCTYPE html>
<html>
<head>
<title>House Price Predictor</title>
<link href="https://cdn.jsdelivr.net/npm/[email protected]/dist/css/bootstrap.min.css" rel="stylesheet">
</head>
<body>
<div class="container mt-5">
<h2>House Price Prediction</h2>
<form method="post" class="mt-3">
{% csrf_token %} <!-- Security token -->
{{ form.as_p }} <!-- Render form fields -->
<button type="submit" class="btn btn-success">Predict Price</button>
</form>
{% if price %} <!-- Display prediction if available -->
<div class="alert alert-info mt-4">
<h3>Predicted House Price: ${{ price }}</h3>
</div>
{% endif %}
</div>
</body>
</html>
Adding Data Visualizations
Visualizations make results more interpretable. Let’s add a bar chart comparing the predicted price to regional averages using Matplotlib.
Step 1: Generate a Plot in the View
Update the predict view to generate a plot when a prediction is made:
# predictor/views.py
import matplotlib.pyplot as plt
import os
from django.conf import settings
from django.templatetags.static import static


def predict(request):
    if request.method == "POST":
        form = HousePriceForm(request.POST)
        if form.is_valid():
            # ... (previous code to get features and price)
            # Generate visualization
            plt.switch_backend("Agg")  # Required for non-interactive environments
            fig, ax = plt.subplots(figsize=(8, 5))
            labels = ["Predicted Price", "Regional Average ($500k)"]
            values = [price, 500000]  # Regional average is hard-coded for illustration
            ax.bar(labels, values, color=["blue", "orange"])
            ax.set_ylabel("Price ($)")
            ax.set_title("Predicted vs. Regional Average Price")
            # Save plot to static files (this fixed filename is overwritten on every request)
            static_dir = os.path.join(settings.BASE_DIR, "predictor/static/predictor/plots/")
            os.makedirs(static_dir, exist_ok=True)
            plot_path = os.path.join(static_dir, "price_comparison.png")
            fig.savefig(plot_path)
            plt.close(fig)  # Release the figure's memory
            # Pass plot URL to template
            plot_url = static("predictor/plots/price_comparison.png")
            return render(request, "predictor/predict.html", {
                "form": form,
                "price": price,
                "plot_url": plot_url,
            })
    # ... (rest of the view)
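Writing every plot to the same price_comparison.png means concurrent requests overwrite each other's image, and browsers may show a cached copy. One alternative is to render the figure into memory and embed it in the page as a base64 data URI. The sketch below shows only the encoding step, using the standard library; png_to_data_uri is a hypothetical helper, and in the view the bytes would come from saving the figure into an io.BytesIO buffer with fig.savefig(buffer, format="png").

```python
import base64


def png_to_data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a data URI usable in an <img src="..."> tag."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


# The 8-byte PNG signature stands in for a real image here.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
plot_url = png_to_data_uri(PNG_SIGNATURE)
```

A plot_url built this way can be passed to the template in place of the static URL, and nothing is written to disk.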
Step 2: Update Settings for Static Files
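Django needs static file settings in order to serve the plots. Update core/settings.py, using the BASE_DIR that the generated settings file already defines:
# core/settings.py
import os

STATIC_URL = "/static/"
STATIC_ROOT = os.path.join(BASE_DIR, "staticfiles")  # Target directory for collectstatic
# predictor/static/ is discovered automatically by Django's app-directories
# finder, so it does not need an entry in STATICFILES_DIRS.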
Step 3: Display the Plot in the Template
Update predict.html to include the plot:
<!-- Add this below the prediction alert -->
{% if plot_url %}
<div class="mt-4">
<h4>Price Comparison</h4>
<img src="{{ plot_url }}" alt="Price Comparison Plot" class="img-fluid">
</div>
{% endif %}
Advanced Features: Asynchronous Tasks and APIs
For compute-heavy models (e.g., deep learning), use Celery for asynchronous task processing to avoid blocking the web server. For programmatic access, add a REST API with Django REST Framework (DRF).
Example: Async Prediction with Celery
Install Celery and Redis (broker):
pip install celery redis
Define a Celery task to handle predictions:
# predictor/tasks.py
from celery import shared_task

from .models.predictor_service import predict_price


@shared_task
def async_predict_price(features):
    return predict_price(features)
Update the view to use the async task (requires Celery setup, beyond this guide’s scope).
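The shape of such a view is: submit the work, return immediately, then let the client poll for the result. The sketch below illustrates that flow using the standard library's concurrent.futures as a stand-in for a Celery worker, so it runs without a broker; slow_predict, submit_prediction, and poll_prediction are hypothetical names. With Celery, async_predict_price.delay(features) would take the place of the submit step and celery.result.AsyncResult(task_id) the place of the lookup.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
pending = {}  # task id -> Future; Celery keeps this state in its result backend


def slow_predict(features):
    """Stand-in for an expensive model call."""
    time.sleep(0.1)
    return round(sum(features), 2)


def submit_prediction(features):
    """Queue the work and return an id the client can poll with."""
    task_id = str(uuid.uuid4())
    pending[task_id] = executor.submit(slow_predict, features)
    return task_id


def poll_prediction(task_id):
    """Report status without blocking; include the result once finished."""
    future = pending[task_id]
    if not future.done():
        return {"status": "pending"}
    return {"status": "done", "price": future.result()}


task_id = submit_prediction([1.0, 2.5])
while poll_prediction(task_id)["status"] == "pending":
    time.sleep(0.02)  # A browser would poll with a timer instead
result = poll_prediction(task_id)
executor.shutdown()
```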
Example: REST API with DRF
Install DRF:
pip install djangorestframework
Add "rest_framework" to INSTALLED_APPS, then create a serializer and an API view:
# predictor/serializers.py
from rest_framework import serializers


class HousePriceSerializer(serializers.Serializer):
    MedInc = serializers.FloatField()
    HouseAge = serializers.FloatField()
    # ... (other fields)

# predictor/views.py
from rest_framework.decorators import api_view
from rest_framework.response import Response

from .models.predictor_service import predict_price
from .serializers import HousePriceSerializer


@api_view(["POST"])
def api_predict(request):
    serializer = HousePriceSerializer(data=request.data)
    if serializer.is_valid():
        # Relies on the serializer declaring fields in the model's training order
        features = list(serializer.validated_data.values())
        price = predict_price(features)
        return Response({"predicted_price": price})
    return Response(serializer.errors, status=400)
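A client would call this endpoint with a JSON body whose keys match the serializer fields. The sketch below builds such a request with the standard library. The /api/predict/ path and the host are assumptions, since the guide does not define a URL for api_predict, and build_predict_request is a hypothetical helper. The request is only constructed here; urllib.request.urlopen(request) would send it to a running server.

```python
import json
import urllib.request


def build_predict_request(base_url, features):
    """Build a JSON POST request for the prediction endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/predict/",  # Assumed route
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative input values, one per dataset feature.
payload = {
    "MedInc": 8.3, "HouseAge": 41, "AveRooms": 6.9, "AveBedrms": 1.0,
    "Population": 322, "AveOccup": 2.5, "Latitude": 37.88, "Longitude": -122.23,
}
request = build_predict_request("http://localhost:8000", payload)
```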
Deployment: Taking Your App Live
To share your app, deploy it to a cloud platform like Heroku or AWS. Here’s a quick Heroku deployment guide:
Step 1: Prepare Deployment Files
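- Procfile: Specifies the web server. Install gunicorn first (pip install gunicorn) so that it is included in requirements.txt.
web: gunicorn core.wsgi --log-file -
- requirements.txt: Lists dependencies.
pip freeze > requirements.txt
- runtime.txt: Specifies Python version.
python-3.9.7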
Step 2: Deploy to Heroku
heroku create my-ds-app
git add . && git commit -m "Initial deploy"
git push heroku main
heroku run python manage.py migrate
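A deployed app should generally not run with the development defaults in core/settings.py. A common pattern, sketched below, is to read those values from environment variables, which is how Heroku exposes its config vars. The variable names DJANGO_DEBUG and DJANGO_ALLOWED_HOSTS and both helper functions are assumptions for illustration; DEBUG and ALLOWED_HOSTS are the standard Django settings they feed.

```python
import os


def env_bool(name, default=False, environ=os.environ):
    """Interpret an environment variable as a boolean flag."""
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def env_list(name, default=(), environ=os.environ):
    """Split a comma-separated environment variable into a list."""
    value = environ.get(name, "")
    items = [item.strip() for item in value.split(",") if item.strip()]
    return items or list(default)


# In core/settings.py these would replace the hard-coded values.
DEBUG = env_bool("DJANGO_DEBUG", default=False)
ALLOWED_HOSTS = env_list("DJANGO_ALLOWED_HOSTS", default=["localhost"])
```

Defaulting DEBUG to False means a missing variable fails safe rather than exposing debug pages.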
Best Practices
- Separation of Concerns: Keep data science code in a models/ or ml/ subdirectory, not in views.
- Model Versioning: Use tools like DVC or MLflow to track model versions.
- Testing: Write unit tests for models (e.g., test_predict_price()) and Django views.
- Security: Sanitize input, use HTTPS, and encrypt sensitive data.
- Performance: Cache predictions with Django’s cache_page or Redis for repeated inputs.
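For the caching suggestion, repeated inputs need a stable cache key. The sketch below derives one by hashing the feature values; prediction_cache_key and cached_predict are hypothetical names, and a plain dict stands in for the cache backend. With Django's cache framework, django.core.cache.cache.get(key) and cache.set(key, value, timeout) would replace the dict operations.

```python
import hashlib
import json


def prediction_cache_key(features, prefix="house-price"):
    """Build a deterministic key from the rounded feature values."""
    normalized = [round(float(value), 6) for value in features]
    digest = hashlib.sha256(json.dumps(normalized).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"


def cached_predict(features, predict, cache):
    """Return a cached prediction, computing and storing it on a miss."""
    key = prediction_cache_key(features)
    if key not in cache:
        cache[key] = predict(features)
    return cache[key]


calls = []


def fake_predict(features):
    """Stand-in for predict_price that records each invocation."""
    calls.append(list(features))
    return 123456.78


cache = {}
first = cached_predict([8.3, 41, 6.9], fake_predict, cache)
second = cached_predict([8.3, 41.0, 6.9], fake_predict, cache)  # Cache hit
```

Normalizing values to floats before hashing makes 41 and 41.0 map to the same key, so equivalent inputs share one cached prediction.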
Case Study: House Price Prediction App
We’ve built a complete app that:
- Trains a linear regression model on housing data.
- Lets users input house features via a form.
- Returns a predicted price and a comparison plot.
To run it locally:
python manage.py runserver
Visit http://localhost:8000 to use the app!
References
- Django Documentation
- Scikit-learn Documentation
- Django for Data Scientists (Book)
- Heroku Django Deployment Guide
- Celery Documentation
By combining Django and Python’s data science tools, you can build powerful, user-centric applications that bridge the gap between models and end-users. Start small, iterate, and scale—your data science app is just a few lines of code away!