Filter options

Publication Date
From
to
Subjects
Journals
Article Types
Countries / Territories
Open Access November 24, 2022

Bridging Traditional ETL Pipelines with AI Enhanced Data Workflows: Foundations of Intelligent Automation in Data Engineering

Abstract Machine Learning (ML) and Artificial Intelligence (AI) are having an increasingly transformative impact on all industries and are already used in many mission-critical use cases in production, bringing considerable value. Data engineering, which combines ETL pipelines with other workflows managing data and machine learning operations, is also significantly impacted. The Intelligent Data [...] Read more.
Machine Learning (ML) and Artificial Intelligence (AI) are having an increasingly transformative impact on all industries and are already used in many mission-critical use cases in production, bringing considerable value. Data engineering, which combines ETL pipelines with other workflows managing data and machine learning operations, is also significantly impacted. The Intelligent Data Engineering and Automation framework offers the groundwork for intelligent automation processes. However, ML/AI are not the only disruptive forces; new Big Data technologies inspired by Web2.0 companies are also reshaping the Internet. Companies having the largest Big Data footprints not only provide applications with a Big Data operational model but also source their competitive advantage from data in the form of AI services and, consequently, impact the cost/performance equilibrium of ETL pipelines. All these technologies and reasons help explain why the traditional ETL pipeline design should adapt to current and emerging technologies and may be enhanced through artificial intelligence.
Figures
PreviousNext
Article
Open Access December 27, 2023

MLOps Frameworks for Reliable Model Deployment in Cloud Data Platforms

Abstract Machine learning operations (MLOps) comprises the practices, methods, and tooling that facilitate the deployment of reliable ML models in production environments. While many aspects of cloud data platforms are designed to enable reliability, only some managed ML services support the MLOps goals of continuous integration, continuous delivery, data lineage tracking, associated reproducibility, [...] Read more.
Machine learning operations (MLOps) comprises the practices, methods, and tooling that facilitate the deployment of reliable ML models in production environments. While many aspects of cloud data platforms are designed to enable reliability, only some managed ML services support the MLOps goals of continuous integration, continuous delivery, data lineage tracking, associated reproducibility, governance, and security. Furthermore, reliability encompasses not only the fulfillment of service-level objectives, but also systematic monitoring, alerting, and incident response automation. Architectural patterns are proposed to enable reliable deployment in cloud data platforms, focusing on the implementation of continuous integration and testing pipelines for ML models and the formulation of continuous delivery and rollout strategies. Continuous integration pipelines reduce the risk of regressions and ensure sufficient model performance at the time of deployment, while continuous delivery pipelines enable rapid updates to production models within acceptable risk profiles. The landscape of publicly available MLOps frameworks, tools, and services is also examined, emphasizing the pros and cons of established and rising solutions in containerization, orchestration, model serving, and inference. Containerization and orchestration contributes to the building of reliable deployment pipelines in cloud data platforms, whether general-purpose tools (e.g. Docker and Kubernetes) or solutions tailored for ML workloads. Containerized serving frameworks designed for high-throughput, low-latency inference can benefit a wide range of business applications, while auto-scaling and model versioning capabilities enhance the ease of use of cloud-native ML services.
Figures
PreviousNext
Review Article

Query parameters

Keyword:  MLOps

View options

Citations of

Views of

Downloads of