MLOps and monitoring platform for a large number of ML models

What is it?


An MLOps and monitoring platform for ML applications that rely on a large number of models. The MLOps platform is built to automate the training, retraining, and serving of those models at scale. The model monitoring dashboard is tailored to the use case and business perspective: rather than tracking individual models in isolation, it tracks the multiple models behind each building in the application, along with data quality and the infrastructure serving the application.

Why is it necessary?

Each building has its own unique characteristics, and the relationship between weather, energy consumption, and different types of consumption varies for each one. To address this, we use lightweight, trained models tailored to each building, which are continually updated to reflect changing dynamics. Our platform automates the creation, training, and retraining of these models for each building. Additionally, our monitoring dashboard identifies potential problem sources, whether they stem from the data, model, cloud infrastructure, or the building itself.
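As a rough illustration of the per-building approach described above, the sketch below keeps one small model per building and refits it when its error on recent data grows too large. It is not the platform's actual code: the `BuildingModels` class, the straight-line fit of consumption against outdoor temperature, and the `max_error` threshold are all assumptions chosen to keep the example short.

```python
from statistics import mean


def fit_line(temps, loads):
    """Ordinary least squares for load = slope * temp + intercept."""
    t_mean, l_mean = mean(temps), mean(loads)
    var = sum((t - t_mean) ** 2 for t in temps)
    if var == 0:
        raise ValueError("temperatures must not all be equal")
    cov = sum((t - t_mean) * (l - l_mean) for t, l in zip(temps, loads))
    slope = cov / var
    return slope, l_mean - slope * t_mean


class BuildingModels:
    """Hypothetical registry holding one lightweight model per building."""

    def __init__(self):
        self._models = {}  # building_id -> (slope, intercept)

    def train(self, building_id, temps, loads):
        self._models[building_id] = fit_line(temps, loads)

    def predict(self, building_id, temp):
        slope, intercept = self._models[building_id]
        return slope * temp + intercept

    def mean_abs_error(self, building_id, temps, loads):
        return mean(
            abs(self.predict(building_id, t) - l) for t, l in zip(temps, loads)
        )

    def retrain_if_drifted(self, building_id, temps, loads, max_error):
        """Refit on recent data when the error exceeds the assumed threshold."""
        if self.mean_abs_error(building_id, temps, loads) > max_error:
            self.train(building_id, temps, loads)
            return True
        return False


registry = BuildingModels()
registry.train("building-A", [0, 10, 20], [100, 80, 60])
registry.train("building-B", [0, 10, 20], [50, 50, 50])

# Building A's consumption pattern shifts, so only its model is refitted.
retrained = registry.retrain_if_drifted(
    "building-A", [0, 10, 20], [130, 100, 70], max_error=5.0
)
```

Keying every model by building ID keeps one building's changing dynamics from affecting any other building's model, which is the main reason to prefer many small models over a single shared one here.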

How does it work?

Our MLOps platform, which manages a large number of models for a single application, is built on Azure components such as Container Apps, Functions, Monitor, Data Lakes, and SQL Server. It also integrates with MLflow for model lifecycle management and with Grafana for the performance monitoring dashboard. The platform handles the entire machine learning lifecycle: training tasks are submitted through the Training APIs, scheduled as jobs with Hangfire, and processed in order through job queues. It retrieves the scripts and data needed for training, integrates with various data resources, and serves external requests efficiently and at scale.
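The flow of submitting tasks and processing them in order can be pictured with a small in-memory sketch. This is only a conceptual illustration under stated assumptions: the real platform schedules jobs with Hangfire on Azure, whereas the `TrainingJob` and `TrainingQueue` names, the status strings, and the `train_fn` callback below are hypothetical stand-ins written with the Python standard library.

```python
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass
class TrainingJob:
    building_id: str
    model_name: str
    status: str = "queued"


@dataclass
class TrainingQueue:
    """Hypothetical first-in, first-out queue of training jobs."""

    jobs: deque = field(default_factory=deque)
    history: list = field(default_factory=list)

    def submit(self, building_id, model_name):
        job = TrainingJob(building_id, model_name)
        self.jobs.append(job)
        return job

    def run_all(self, train_fn):
        """Process jobs in submission order; one failure does not stop the rest."""
        while self.jobs:
            job = self.jobs.popleft()
            job.status = "running"
            try:
                train_fn(job)
                job.status = "succeeded"
            except Exception:
                job.status = "failed"
            self.history.append(job)
        return self.history

    def status_counts(self):
        """Summary of finished jobs, similar in spirit to a status panel."""
        return Counter(job.status for job in self.history)


def train_fn(job):
    # Stand-in for fetching scripts and data, then training the model.
    if job.building_id == "building-B":
        raise RuntimeError("no recent data for this building")


queue = TrainingQueue()
queue.submit("building-A", "heating")
queue.submit("building-B", "heating")
queue.submit("building-A", "electricity")
finished = queue.run_all(train_fn)
counts = queue.status_counts()
```

Recording a status for every job, including failed ones, is what makes a training-status view possible: a failed job for one building is visible without blocking the jobs queued behind it.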

Granlund's MLOps dashboard – Building Information Detail (image: Granlund)
Granlund's MLOps platform – Training status monitoring (image: Granlund)