Engineer: Michael Maio
Last updated: 9/7/2025
This repo contains a working machine learning pipeline that addresses a hypothetical scenario a software company might face.
Problem: Data center power consumption is growing, causing a gradual, year-over-year increase in the hourly peak kilowatt load reported by a sensor on the local transformer.
Question: Assuming the power consumption trend remains unchanged, how long before the transformer becomes overloaded?
Solution: Build a machine learning pipeline that can process recent trend data and forecast when the transformer may eventually become overloaded, informing the necessary schedule for a preventative upgrade.
This machine learning pipeline uses:
1. Python for the scripting.
2. Docker containers to encapsulate training, promotion, and prediction jobs.
3. MLflow for model management.
4. YAML for job management.
5. GitHub Actions to trigger a pipeline deployment.
6. Terraform to create and update AzureML infrastructure from code.
7. A managed identity to keep everything secure.
Python, Docker, and MLflow allow the entire pipeline to be run locally for quick feedback on changes before deploying to the cloud. No Azure required. Terraform allows other cloud providers, such as AWS or GCP, to be swapped in as needed.
This is the starting point: a high-level view of the experiment in Azure AI’s Machine Learning Studio.
Drilling into the experiment shows a list of its jobs. Each job represents a different deployment of the pipeline that trains the model, promotes the model (if it passed testing), and uses the model to make predictions.
Drilling into the latest job reveals a list of sub-jobs and how they are wired together. Below you can see that the sub-job that trains the model passes a “trained_model” output to the job that promotes the model, which in turn passes a “promoted_model” to the job that uses the model to produce “predictions”.
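The train → promote → predict wiring described above can be sketched as three chained Python functions, a minimal local stand-in (all names and values below are made-up placeholders, not the repo's actual code; in the real pipeline this wiring is declared in the job YAML and executed by AzureML):

```python
def train():
    # Would fit the forecasting model and log it with MLflow;
    # here it just returns a placeholder model plus its test metric.
    return {"name": "trained_model", "rmse": 0.79}

def promote(trained_model):
    # Pass the model through only if it tested well (threshold is assumed).
    return trained_model if trained_model["rmse"] < 2.0 else None

def predict(promoted_model):
    # Would load the promoted model and forecast future transformer load.
    return {"predictions": "hourly kW forecast for the next 5 years"}

model = train()
promoted = promote(model)
predictions = predict(promoted) if promoted else None
```

Keeping the sub-jobs as independent, composable steps is what lets the same pipeline run either locally or in the cloud.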
You can drill into each sub-job to view all kinds of details about it. Below you can see that the first sub-job, “Train Transformer Load Model”, did the following:
- It output the model twice: once when MLflow logged it and once when passing it along to the promotion job.
- It applied some informative tags.
- It reported the “rmse” (Root Mean Squared Error) metric, indicating how well the model performed during testing.
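For reference, the “rmse” metric reported by the training sub-job can be computed like this; the hold-out data below is made up for illustration:

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical hold-out data: observed vs. forecast hourly peak load (kW).
actual = [80.0, 82.5, 81.0, 84.0]
predicted = [79.0, 83.0, 82.0, 83.5]
print(round(rmse(actual, predicted), 3))
```

In the training job, the resulting value is what gets logged to MLflow as the “rmse” metric.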
You can drill into one of the model links to get more information on the model.
And drill into its artifacts.
Moving on to the “Promote Transformer Load Model” sub-job, you can see that it output the “promoted_model”. This means the model passed testing during training: its Root Mean Squared Error was low enough for the model to be useful in making predictions.
If you view the AzureML model registry for the workspace, you can see that the promotion sub-job registered the model since it passed testing.
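The promotion decision described above amounts to a threshold check on the test metric before the model is registered. A minimal sketch, with an assumed threshold (the repo's actual threshold and registration code are not shown here):

```python
RMSE_THRESHOLD_KW = 2.0  # assumed acceptance threshold, in kilowatts

def should_promote(rmse: float, threshold: float = RMSE_THRESHOLD_KW) -> bool:
    """Gate the model: only a sufficiently accurate model moves forward."""
    return rmse < threshold

print(should_promote(0.79))  # low error: promote
print(should_promote(5.3))   # high error: do not promote
```

When the gate passes, this is the point at which the real job registers the model in the AzureML model registry and emits the “promoted_model” output.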
Moving on to the “Predict Transformer Overload” sub-job, you can see that it created the following:
1. A tag reporting that the transformer is predicted to hit its first overload at 11pm on November 26th, 2027.
2. A metric predicting that the maximum load over the entire 5-year period will be about 98 kilowatts.
3. A metric predicting that the transformer will overload more than 4,623 times in the next 5 years given current usage trends.
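One way figures like these could be produced is by extrapolating the load trend hour by hour and counting the hours that exceed the transformer's rating. A rough sketch with made-up numbers (not the repo's actual model, capacity, or trend values):

```python
HOURS_PER_YEAR = 8760
CAPACITY_KW = 95.0        # assumed transformer rating
BASE_LOAD_KW = 85.0       # assumed current hourly peak load
GROWTH_KW_PER_YEAR = 2.5  # assumed year-over-year growth trend

def forecast(hours=5 * HOURS_PER_YEAR):
    """Linear extrapolation of hourly peak load over the forecast horizon."""
    return [BASE_LOAD_KW + GROWTH_KW_PER_YEAR * (h / HOURS_PER_YEAR)
            for h in range(hours)]

loads = forecast()
overloads = [h for h, kw in enumerate(loads) if kw > CAPACITY_KW]
print(len(overloads))        # predicted overload count over 5 years
print(overloads[0])          # first hour index predicted to overload
print(round(max(loads), 1))  # predicted maximum load (kW)
```

The first overload hour maps to the tag, the maximum load and overload count map to the two metrics reported by the sub-job.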
You can also drill into the “predictions” output and see the files that the prediction job uploaded, including:
1. The predicted transformer load in kilowatts for each hour during the next 5 years.
2. The number of times the transformer is predicted to overload during that period.
3. A chart of the predicted loads.