MLOps Testing Methodology
What is it?
The IML4E MLOps testing methodology aims to provide a schema for systematically applying testing to MLOps processes, increasing the quality of ML-based applications while keeping the testing effort efficient through targeted testing. It is a comprehensive framework that incorporates testing in all phases of developing, integrating, and operating ML-based systems, combining classical software engineering with data science activities to ensure quality and reliability.
Why is it necessary?
The IML4E MLOps testing methodology is a tool for systematically addressing the quality assurance challenges specific to ML systems. It helps ensure that both the software and the ML components meet the required performance, reliability, and compliance standards throughout their lifecycle.
How does it work?
The methodology divides the development lifecycle into several phases and associates test items, acceptance criteria, and test methods with each phase. The phases are:
- Business Understanding and Inception: Identifying objectives, requirements, and understanding the data context.
- Experimentation and Training Pipeline Development: Evaluating data and modeling approaches, building PoC systems, and developing the training pipeline.
- Training: Creating and validating models using the developed pipeline.
- System Development and Integration: Integrating the ML model into the operational software environment.
- Operation and Monitoring: Monitoring the system in its operational environment to ensure ongoing performance and compliance.
Besides the major outcomes of each phase, the methodology identifies additional test items that each contribute to the quality of the overall system. Each phase includes specific testing activities, such as data, model, and component testing. These activities draw on different test methods and kinds of testing, including dynamic and static testing, monitoring, and reviews, so that the final system meets the defined criteria and maintains its quality throughout its lifecycle.
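As an illustration only, the association of test items, acceptance criteria, and test methods with phases could be represented as a small data structure. The following Python sketch is not part of the IML4E methodology: the class names, the example test items, and the thresholds (5% missing values, 0.90 accuracy, 200 ms latency) are all hypothetical assumptions chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Phase(Enum):
    # Phase names taken from the list above.
    INCEPTION = "Business Understanding and Inception"
    EXPERIMENTATION = "Experimentation and Training Pipeline Development"
    TRAINING = "Training"
    INTEGRATION = "System Development and Integration"
    OPERATION = "Operation and Monitoring"


class Method(Enum):
    # Kinds of testing named in the text.
    STATIC = "static testing"
    DYNAMIC = "dynamic testing"
    MONITORING = "monitoring"
    REVIEW = "review"


@dataclass(frozen=True)
class ItemUnderTest:
    """Hypothetical record: one test item with its phase, method, and criterion."""
    name: str
    phase: Phase
    method: Method
    criterion: Callable[[float], bool]  # acceptance criterion over one measured value


def evaluate(plan: List[ItemUnderTest],
             measurements: Dict[str, float]) -> Dict[Phase, List[str]]:
    """Return, per phase, the items that failed their criterion or were not measured."""
    failures: Dict[Phase, List[str]] = {phase: [] for phase in Phase}
    for item in plan:
        value = measurements.get(item.name)
        if value is None or not item.criterion(value):
            failures[item.phase].append(item.name)
    return failures


# Hypothetical plan; items and thresholds are assumptions for illustration.
PLAN = [
    ItemUnderTest("training_data_missing_ratio", Phase.EXPERIMENTATION,
                  Method.STATIC, lambda v: v <= 0.05),
    ItemUnderTest("validation_accuracy", Phase.TRAINING,
                  Method.DYNAMIC, lambda v: v >= 0.90),
    ItemUnderTest("p95_latency_ms", Phase.OPERATION,
                  Method.MONITORING, lambda v: v <= 200.0),
]

# Latency has not been measured yet, so it is reported as failing.
measurements = {"training_data_missing_ratio": 0.02, "validation_accuracy": 0.87}
failures = evaluate(PLAN, measurements)
```

Treating an unmeasured item as a failure is a design choice of this sketch: it makes gaps in test coverage visible per phase instead of silently passing them.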