Operationalizing MLOps with Databricks Pipelines: Scalable Machine Learning in Cloud Environments

Authors

  • Uttama Reddy Sanepalli, Fidelity Investments, NC, USA

DOI:

https://doi.org/10.32628/CSEIT25113573

Keywords:

Databricks Pipelines, MLOps, Machine Learning Deployment, Model Governance, MLflow, Delta Lake, Cloud Computing, Model Monitoring, Data Engineering, Scalable AI Systems

Abstract

The operationalization of machine learning models at scale remains a central challenge for data-driven enterprises due to complexities in deployment automation, governance, and post-deployment performance management. This article presents a structured MLOps framework leveraging Databricks pipelines to enable scalable, secure, and continuously monitored machine learning model deployment in cloud environments. The proposed approach integrates Delta Lake and Delta Live Tables for reliable data ingestion and transformation, MLflow for experiment tracking and model lifecycle management, and Databricks Model Serving for real-time and batch inference. The study demonstrates how automated pipelines support end-to-end orchestration of data preprocessing, distributed model training, hyperparameter optimization, deployment, and continuous performance monitoring. Built-in observability mechanisms and drift detection techniques are employed to ensure sustained model accuracy and reliability in production. By utilizing cloud-native infrastructure across AWS and Azure, the framework enhances scalability, fault tolerance, and operational efficiency while reducing manual intervention. The results highlight measurable improvements in deployment speed, model governance, and system reliability, underscoring the effectiveness of Databricks-based MLOps pipelines for enterprise-grade machine learning systems.
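The abstract refers to drift detection without naming a specific technique. As a minimal sketch of one common option, the Population Stability Index (PSI) compares how a feature is distributed in a baseline (training) sample and in a current (production) sample. Everything below is an illustrative assumption and not the article's implementation: the function names, the ten equal-width bins, and the 0.2 alert threshold (a widely used rule of thumb) are all hypothetical choices.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Return the PSI between a baseline sample and a current sample.

    Bin edges are equal-width and derived from the baseline only
    (an assumption of this sketch; quantile bins are also common).
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline must contain at least two distinct values")
    width = (hi - lo) / bins

    def bin_fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = int((v - lo) / width)
            idx = max(0, min(bins - 1, idx))
            counts[idx] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / total, eps) for c in counts]

    expected = bin_fractions(baseline)
    actual = bin_fractions(current)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def has_drifted(
    baseline: Sequence[float],
    current: Sequence[float],
    threshold: float = 0.2,  # hypothetical alert threshold, not from the article
) -> bool:
    """Flag drift when the PSI exceeds the chosen threshold."""
    return population_stability_index(baseline, current) > threshold
```

In a monitoring pipeline, a check of this kind would typically run on a schedule against recent inference inputs, with a positive result triggering an alert or a retraining job. Identical samples yield a PSI of zero, and the value grows as the two distributions diverge.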

Published

25-12-2024

Section

Research Articles

How to Cite

[1]
Uttama Reddy Sanepalli, “Operationalizing MLOps with Databricks Pipelines: Scalable Machine Learning in Cloud Environments”, Int. J. Sci. Res. Comput. Sci. Eng. Inf. Technol, vol. 10, no. 6, pp. 2544–2552, Dec. 2024, doi: 10.32628/CSEIT25113573.