- No experience with Open Source or with customizing solutions in-house;
- A large and complex project that needs vendor support;
- There is a budget for licensed software.
Open Source is the better choice when:
- You need to adapt the solution fully to the needs of the company;
- You need access to as many best practices as possible;
- You need to eliminate vendor lock-in risks and be able to change the solution provider at any time;
- You want to use the newest features, which often appear faster in Open Source.
What Tools Exist
Open-Source solutions cover the entire stack of ML tasks at every stage:
- collection, preparation, and labeling of data;
- model training;
- evaluation of model quality;
- deployment of the model to production.
There are more than 50 MLOps tools in total. The MyMLOps team even created a builder with which you can assemble your own MLOps stack from Open-Source solutions.
Next, we will look at a few comprehensive solutions.
- Metaflow
Originally developed by Netflix, Metaflow was open-sourced in 2019 to help data scientists and machine learning engineers manage ML projects. It is Python-friendly and also supports R.
Metaflow can be used in various machine-learning projects. The service automatically tracks and versions experiments and data. In addition, the following features are built into the platform:
- management of external dependencies;
- management of computing resources;
- reproducing and resuming workflow runs;
- switching between local and remote execution modes;
- running steps in containers.
Metaflow acts as a layer between the data scientist and the underlying infrastructure, such as Kubernetes. An engineer can use Metaflow as a Python library to describe the steps for working with an ML model as a DAG (directed acyclic graph).
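To make the idea of describing steps as a DAG more concrete, here is a minimal plain-Python sketch. It deliberately does not use Metaflow's API: the step names, the `run_pipeline` helper, and the toy "models" are all hypothetical, and the standard-library `graphlib` module stands in for a real scheduler.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical names).
PIPELINE = {
    "load_data": set(),
    "train_mean": {"load_data"},
    "train_max": {"load_data"},
    "evaluate": {"train_mean", "train_max"},
    "deploy": {"evaluate"},
}

# Toy step implementations; each receives the outputs of its upstream steps.
STEPS = {
    "load_data": lambda inputs: [1.0, 2.0, 3.0, 4.0],
    "train_mean": lambda inputs: sum(inputs["load_data"]) / len(inputs["load_data"]),
    "train_max": lambda inputs: max(inputs["load_data"]),
    "evaluate": lambda inputs: min(inputs, key=lambda name: inputs[name]),
    "deploy": lambda inputs: f"deployed {inputs['evaluate']}",
}


def run_pipeline(dag, steps):
    """Run every step once, after all of its dependencies have finished."""
    results = {}
    for name in TopologicalSorter(dag).static_order():
        # Collect the outputs of the steps this one depends on.
        inputs = {dep: results[dep] for dep in dag[name]}
        results[name] = steps[name](inputs)
    return results


results = run_pipeline(PIPELINE, STEPS)
print(results["deploy"])
```

The two training steps depend only on `load_data`, so a real orchestrator would be free to run them in parallel or on different machines. The graph, not the order of the code, determines what runs when.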
- MLflow
MLflow helps manage the main stages of the machine learning life cycle. It is typically used for experiment tracking, but it is also suitable for reproducing runs, deploying models, and maintaining a model registry.
MLflow has four main components:
- MLflow Tracking – recording and querying of experiments: code, data, configurations, and results;
- MLflow Projects – packaging of data science code in a reusable, reproducible format;
- MLflow Models – packaging and deployment of ML models in various serving environments;
- MLflow Model Registry – a central model repository providing versioning, stage transitions, annotations, and model management.
The platform integrates with machine learning libraries, including TensorFlow and PyTorch, and lets you manage experiments and metadata through the CLI and the Python, R, Java, and REST APIs.
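The registry concepts mentioned above (versioning, stage transitions, annotations) can be illustrated with a small in-memory sketch. This is not MLflow's API: `ToyRegistry`, `ModelVersion`, the model name, and the artifact paths are hypothetical, and archiving the previous production version on promotion is a design choice of this sketch.

```python
from dataclasses import dataclass, field

STAGES = ("None", "Staging", "Production", "Archived")


@dataclass
class ModelVersion:
    version: int
    artifact_uri: str
    stage: str = "None"
    description: str = ""  # free-text annotation


@dataclass
class ToyRegistry:
    models: dict = field(default_factory=dict)

    def register(self, name, artifact_uri, description=""):
        """Store a new version; version numbers start at 1 and only grow."""
        versions = self.models.setdefault(name, [])
        mv = ModelVersion(len(versions) + 1, artifact_uri, description=description)
        versions.append(mv)
        return mv

    def transition(self, name, version, stage):
        """Move one version to a new stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        if stage == "Production":
            # Sketch-level rule: keep a single production version per model.
            for mv in self.models[name]:
                if mv.stage == "Production":
                    mv.stage = "Archived"
        self.models[name][version - 1].stage = stage

    def latest(self, name, stage):
        """Return the newest version in the given stage, or None."""
        candidates = [mv for mv in self.models[name] if mv.stage == stage]
        return candidates[-1] if candidates else None


registry = ToyRegistry()
registry.register("churn", "runs/1/model", "baseline")
registry.register("churn", "runs/2/model", "more features")
registry.transition("churn", 1, "Production")
registry.transition("churn", 2, "Production")
prod = registry.latest("churn", "Production")
print(prod.version, prod.description)
```

A serving system that always asks the registry for the latest production version picks up a new model as soon as it is promoted, without any change to the serving code.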
- Kubeflow
Kubeflow makes it easy for developers to organize and deploy ML workflows. With it, you can deploy ML models on Kubernetes, and it can also be used for the following tasks:
- data preparation;
- training and optimization of the ML model;
- serving predictions;
- monitoring model performance in production.
The solution supports JupyterLab, RStudio, and Visual Studio Code. It also supports hyperparameter tuning and neural architecture search. One of the advantages of Kubeflow is that it has a UI for managing and tracking experiments, jobs, and runs. In addition, it has a built-in engine for orchestrating multi-stage machine learning workflows.
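As a rough illustration of what hyperparameter tuning automates, here is a minimal grid search in plain Python. It does not use any Kubeflow API. The `validation_loss` function is a hypothetical stand-in for training and evaluating a model, and the search space values are made up for the example.

```python
from itertools import product


def validation_loss(learning_rate, depth):
    # Stand-in objective (hypothetical): a real trial would train a model
    # with these hyperparameters and return its validation loss.
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 4) ** 2


SEARCH_SPACE = {
    "learning_rate": [0.01, 0.1, 0.5],
    "depth": [2, 4, 8],
}


def grid_search(objective, space):
    """Evaluate every combination in the space and return the best trial."""
    names = list(space)
    trials = []
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        trials.append((objective(**params), params))
    # Compare trials by loss only.
    return min(trials, key=lambda trial: trial[0])


best_loss, best_params = grid_search(validation_loss, SEARCH_SPACE)
print(best_params)
```

On a cluster, each trial would typically run as its own job, which is where an orchestration layer with experiment tracking becomes useful.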
Despite the wide functionality of Kubeflow, a 2022 survey showed that, because of its complexity, teams tend to use only individual components rather than the entire platform. In addition, many adopters find the documentation and training materials insufficient and have to figure the tool out on their own.