Open Access December 27, 2023

MLOps Frameworks for Reliable Model Deployment in Cloud Data Platforms

Abstract: Machine learning operations (MLOps) comprises the practices, methods, and tooling that facilitate the deployment of reliable ML models in production environments. While many aspects of cloud data platforms are designed to enable reliability, only some managed ML services support the MLOps goals of continuous integration, continuous delivery, data lineage tracking and the associated reproducibility, governance, and security. Furthermore, reliability encompasses not only the fulfillment of service-level objectives, but also systematic monitoring, alerting, and incident response automation. Architectural patterns are proposed to enable reliable deployment in cloud data platforms, focusing on the implementation of continuous integration and testing pipelines for ML models and the formulation of continuous delivery and rollout strategies. Continuous integration pipelines reduce the risk of regressions and ensure sufficient model performance at the time of deployment, while continuous delivery pipelines enable rapid updates to production models within acceptable risk profiles. The landscape of publicly available MLOps frameworks, tools, and services is also examined, emphasizing the strengths and weaknesses of established and emerging solutions in containerization, orchestration, model serving, and inference. Containerization and orchestration contribute to building reliable deployment pipelines in cloud data platforms, whether through general-purpose tools (e.g., Docker and Kubernetes) or through solutions tailored for ML workloads. Containerized serving frameworks designed for high-throughput, low-latency inference can benefit a wide range of business applications, while auto-scaling and model versioning capabilities make cloud-native ML services easier to use.
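The abstract's claim that continuous integration pipelines reduce regression risk and ensure sufficient model performance at deployment time can be illustrated with a small quality gate. The following is a minimal sketch, not the article's method: the names `GateResult` and `evaluate_gate`, the metric names, and the thresholds are all hypothetical, and every metric is assumed to be "higher is better".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    """Outcome of a hypothetical pre-deployment quality gate."""
    passed: bool
    reasons: list[str]


def evaluate_gate(
    candidate: dict[str, float],
    baseline: dict[str, float],
    min_scores: dict[str, float],
    max_regression: float = 0.01,
) -> GateResult:
    """Check a candidate model's metrics before promotion.

    Assumption: every metric is higher-is-better. A metric fails if it is
    missing, falls below its absolute floor, or drops more than
    `max_regression` below the currently deployed baseline.
    """
    reasons: list[str] = []
    for metric, floor in min_scores.items():
        score = candidate.get(metric)
        if score is None:
            reasons.append(f"{metric}: missing from candidate metrics")
            continue
        if score < floor:
            reasons.append(f"{metric}: {score:.3f} is below the floor {floor:.3f}")
        base = baseline.get(metric)
        if base is not None and base - score > max_regression:
            reasons.append(f"{metric}: regressed from {base:.3f} to {score:.3f}")
    return GateResult(passed=not reasons, reasons=reasons)


if __name__ == "__main__":
    result = evaluate_gate(
        candidate={"accuracy": 0.91, "f1": 0.88},
        baseline={"accuracy": 0.90, "f1": 0.90},
        min_scores={"accuracy": 0.85, "f1": 0.80},
    )
    # A CI job would typically fail the build when the gate does not pass.
    print("PASS" if result.passed else "FAIL", result.reasons)
```

In a pipeline, a check of this kind would run after evaluation on a held-out dataset, and a failing result would stop the delivery stage from promoting the candidate model.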
Review Article
