ML-Py-Stevedore

What is it?

ML-Py-Stevedore is an ML-framework-agnostic wrapper class and generic API, plus Podman/Colima/Docker build automation, for Python ML models. A specific use case is a set of machine learning models that may be composed, so that they can be tested and packaged together. This by no means excludes use with a single model.

Why is it necessary?

ML-Py-Stevedore offers a reusable ML REST API design plus surrounding services for fast containerization of Python ML models. It provides ready-made standard services, clearly stated requirements for the functionality of the model, and build and test automation.

How does it work?

The key technologies are HTTP, containers, JSON, Python, Make, FastAPI and JSON Schema.

The assumptions are:

  • Request payload is presented as JSON in the HTTP request body.
  • For each model, the user supplies
    • a JSON schema for prediction and scoring inputs,
    • a couple of key methods that are missing in the Predictor base class, and
    • test inputs for predict and score, so that model availability is tested by genuinely running the model.

Since the HTTP framework that serves the API is FastAPI, Swagger documentation of the API is also served, down to the user-defined JSON schemas.

The user derives a custom class from the Predictor class to wrap the model, e.g.: 

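    import numpy as np

    # Predictor is the base class provided by ML-Py-Stevedore.
    class LogReg(Predictor):

        def convert_prediction_input(self, x):
            # Convert the JSON list of rows to a float array.
            return np.asarray(x, dtype=float)

        def convert_score_input(self, in_object):
            # Split the list of {"x": ..., "y": ...} items into features and labels.
            X = [item["x"] for item in in_object]
            y = [item["y"] for item in in_object]
            return np.asarray(X, dtype=float), np.array(y)

        def run_scores(self, X, y):
            return self.model.score(X, y)

        def run_predict(self, X):
            return self.model.predict(X)

        def convert_output(self, res):
            # Convert the result array back to a JSON-serializable list.
            return res.tolist()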
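To show how the five user-supplied methods could fit together, here is a minimal, dependency-free sketch. SketchPredictor is a hypothetical stand-in written for this example, not the real Predictor base class, whose constructor and call order may differ. ThresholdModel is a toy model with the same predict/score shape as the model used above.

```python
class SketchPredictor:
    """Hypothetical stand-in for Predictor; the real interface may differ."""

    def __init__(self, model):
        self.model = model

    def predict(self, payload):
        # Assumed flow: convert input, run the model, convert output.
        X = self.convert_prediction_input(payload)
        return self.convert_output(self.run_predict(X))

    def score(self, payload):
        # Assumed flow: convert input into features and labels, then score.
        X, y = self.convert_score_input(payload)
        return self.run_scores(X, y)


class ThresholdModel:
    """Toy model: predicts 1 when the feature sum is positive, else 0."""

    def predict(self, X):
        return [1 if sum(row) > 0 else 0 for row in X]

    def score(self, X, y):
        # Accuracy: fraction of predictions that match the labels.
        preds = self.predict(X)
        return sum(p == t for p, t in zip(preds, y)) / len(y)


class Threshold(SketchPredictor):

    def convert_prediction_input(self, x):
        return [[float(v) for v in row] for row in x]

    def convert_score_input(self, in_object):
        X = [item["x"] for item in in_object]
        y = [item["y"] for item in in_object]
        return self.convert_prediction_input(X), y

    def run_scores(self, X, y):
        return self.model.score(X, y)

    def run_predict(self, X):
        return self.model.predict(X)

    def convert_output(self, res):
        return list(res)


wrapped = Threshold(ThresholdModel())
predictions = wrapped.predict([[1, 2], [-3, 1]])
accuracy = wrapped.score([{"x": [1, 2], "y": 1}, {"x": [-3, 1], "y": 1}])
```

Under these assumptions the conversion methods isolate all JSON handling, so the run methods only ever see data in the model's native format.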

The API services support monitoring the models, querying information needed to use them, and calling model prediction. The services are:

  • health services such as
    • livez
    • healthz
    • readyz
    • score
    • variants of the above 
  • catalogues
    • list the models being served
    • version of each model
    • created returns the creation time
    • predict_schema and score_schema return the JSON schemas for prediction and scoring input 
  • predict - of course.
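A client could address these services with plain HTTP requests. The sketch below only builds the requests and does not send them, so it runs without a server. The base URL, the exact paths and the model name "logreg" are hypothetical; the real routes are listed in the Swagger documentation served by the API.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # hypothetical host and port


def build_get(path):
    """Build a GET request for a health or catalogue service."""
    return Request(f"{BASE_URL}{path}", method="GET")


def build_post(path, payload):
    """Build a POST request carrying the payload as JSON in the body."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{BASE_URL}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical paths; check the served API documentation for the real ones.
liveness = build_get("/livez")
prediction = build_post("/predict/logreg", [[0.1, 1.5]])

# With a server running, urllib.request.urlopen(prediction) would send the call.
```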