Workshop on MLOps Systems

To be held along with the Third Conference on Machine Learning and Systems (MLSys)

Austin, TX

March 4, 2020

About

Putting ML into production is complex: the actual machine-learning code is only a small part of a larger system and its lifecycle. The evolving field that addresses this complexity is known as MLOps. Informally, MLOps refers to the collaboration between data scientists and operations engineers (e.g., SREs) to manage the lifecycle of ML within an organization. This space is new and has yet to be explored from a research perspective.

In this workshop we aim to cover research problems in MLOps, including the systems and ML challenges involved in this process. We will also cover software engineering questions, including the specification, testing, and verification of ML software systems. The workshop will bring together a wide variety of experts from industry and academia, covering personas ranging from data scientists to machine learning engineers.

Call for Papers

We solicit short/position papers (2 pages plus references) that will be presented as posters. The page limit includes figures, tables, and appendices but excludes references. Please use the standard LaTeX or Word ACM templates. All submissions must be made via EasyChair.

Topics that are relevant to this workshop include:

  • ML model specification

  • ML model management

  • ML model or concept drift/change detection (see the sketch after this list)

  • Specification and verification of ML training pipelines

  • ML model monitoring

  • ML model serving techniques and systems

  • ML experiment tracking and management

  • System design for metadata management systems

  • Audits, assurance, security, and compliance for MLOps

  • ML CI/CD

  • Scheduling and cost optimization of ML workflows

  • Debugging ML workflows and pipelines

  • Hyperparameter tuning systems

  • MLOps for federated, split, and distributed learning
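
To give a concrete flavor of the drift-detection topic above, the following is a minimal Python sketch that flags distribution shift in a single feature using a two-sample Kolmogorov-Smirnov test. The function and variable names are illustrative assumptions, not drawn from any particular MLOps system.

    # Minimal sketch of feature-level drift detection, using a
    # two-sample Kolmogorov-Smirnov test. All names here are
    # illustrative, not the API of any specific MLOps tool.
    import numpy as np
    from scipy.stats import ks_2samp

    def detect_drift(reference: np.ndarray, live: np.ndarray, alpha: float = 0.01) -> bool:
        """Flag drift when the live feature distribution differs from the
        reference (training-time) distribution at significance level alpha."""
        _statistic, p_value = ks_2samp(reference, live)
        return p_value < alpha

    rng = np.random.default_rng(0)
    train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)    # training-time sample
    serving_feature = rng.normal(loc=0.5, scale=1.0, size=5000)  # shifted live sample

    print(detect_drift(train_feature, serving_feature))  # True: the mean has shifted

In a production pipeline this check would typically run per feature on a monitoring schedule, with the alpha threshold and reference window chosen per deployment.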



Important dates


Paper submission deadline: January 15, 2020
Author notification: January 27, 2020
Camera-ready papers due: February 21, 2020
Workshop date: March 4, 2020
All deadlines are at midnight Anywhere on Earth (AoE) and are firm.

Program

Preliminary Schedule

Time | Title | Speaker(s)
9:00-9:05 | Introductions | Ce, Debo, Matei
9:05-9:35 | Invited Talk: Automating data quality ops | Theodoros (Theo) Rekatsinas, U. Wisconsin (Madison)
9:35-10:05 | Invited Talk: Overton | Chris Re, Stanford/Apple
10:05-10:30 | Break and poster setup
10:30-11:00 | Invited Talk: Elements of Learning Systems | Tianqi Chen, U. Washington
11:00-11:30 | Invited Talk: Building a Trusted Process: How MLOps can Enable Responsible AI Development | Sarah Bird, Microsoft
11:30-12:30 | Lightning Talks (poster introductions) | Various
12:30-14:00 | Lunch break and posters
14:00-14:30 | Invited Talk: ML Prediction Serving | Joseph Gonzalez, UC Berkeley
14:30-15:00 | Invited Talk: ML Engineering: Emergence, Evolution, and Lessons Learned from Google's Production ML Pipelines | Alkis (Neoklis) Polyzotis, Google
15:00-15:30 | Coffee break and posters
15:30-16:00 | Invited Talk: TFX, Kubeflow, and MLflow: An incomplete history of ML infrastructure and the emergence of MLOps | Clemens Mewald, Databricks
16:00-16:30 | Invited Talk: Model Versioning: A keystone for Robust MLOps | Manasi Vartak, Verta.ai
16:30-17:00 | Wrap-up | Ce, Debo, Matei

Accepted Posters

  • sensAI: Fast ConvNets Serving on Live Data via Class Parallelism. Guanhua Wang, Zhuang Liu, Siyuan Zhuang, Brandon Hsieh, Joseph Gonzalez and Ion Stoica.
  • Towards Automated ML Model Monitoring: Measure, Improve and Quantify Data Quality. Tammo Rukat, Dustin Lange, Sebastian Schelter and Felix Biessmann.
  • Towards Automating the AI Operations Lifecycle. Matthew Arnold, Jeff Boston, Michael Desmond, Evelyn Duesterwald, Benjamin Elder, Anupama Murthi, Jiri Navratil and Darrell Reimer.
  • Efficient Scheduling of DNN Training on Multitenant Clusters. Deepak Narayanan, Keshav Santhanam, Amar Phanishayee and Matei Zaharia.
  • Towards Complaint-driven ML Workflow Debugging. Weiyuan Wu, Lampros Flokas, Eugene Wu and Jiannan Wang.
  • PerfGuard: Deploying ML-for-Systems without Performance Regressions. H M Sajjad Hossain, Lucas Rosenblatt, Gilbert Antonius, Irene Shaffer, Remmelt Ammerlaan, Abhishek Roy, Markus Weimer, Hiren Patel, Marc Friedman, Shi Qiao, Peter Orenberg, Soundarajan Srinivasan and Alekh Jindal.
  • Implicit Provenance for Machine Learning Artifacts. Alexandru A. Ormenisan, Mahmoud Ismail, Seif Haridi and Jim Dowling.
  • Addressing the Memory Bottleneck in AI Model-Training. David Ojika, Bhavesh Patel, G Anthony Reina, Trent Boyer, Chad Martin and Prashant Shah.
  • Simulating Performance of ML Systems with Offline Profiling. Hongming Huang, Peng Cheng, Hong Xu and Yongqiang Xiong.
  • A Viz Recommendation System: ML Lifecycle at Tableau. Kazem Jahanbakhsh, Eric Borchu, Mya Warren, Xiang-Bo Mao and Yogesh Sood.
  • CodeReef: an open portal for cross-platform MLOps and reproducible benchmarking. Grigori Fursin, Herve Guillou and Nicolas Essayan.
  • Towards split learning at scale: System design. Iker Rodríguez, Eduardo Muñagorri, Alberto Roman, Abhishek Singh, Praneeth Vepakomma and Ramesh Raskar.
  • MLBox: Towards Reproducible ML. Victor Bittorf, Xinyuan Huang, Peter Mattson, Debojyoti Dutta, David Aronchick, Emad Barsoum, Sarah Bird, Sergey Serebryakov, Natalia Vassilieva, Tom St. John, Grigori Fursin, Srini Bala, Sivanagaraju Yarramaneni, Alka Roy, David Kanter and Elvira Dzhuraeva.
  • Conversational Applications and Natural Language Understanding Services at Scale. Minh Tue Vo Thanh and Vijay Ramakrishnan.
  • Towards Distribution Transparency for Supervised ML With Oblivious Training Functions. Moritz Meister, Sina Sheikholeslami, Robin Andersson, Alexandru Ormenisan and Jim Dowling.
  • Tools for machine learning experiment management. Vlad Velici and Adam Prügel-Bennett.
  • MLPM: Machine Learning Package Manager. Xiaozhe Yao.
  • Common Problems with Creating Machine Learning Pipelines from Existing Code. Katie O’Leary and Makoto Uchida.

Organization


Organizing Committee

  • Debo Dutta, Cisco Systems

  • Ce Zhang, ETH Zurich

  • Matei Zaharia, Stanford University & Databricks


Program Committee

  • Debo Dutta, Cisco Systems

  • Ce Zhang, ETH Zurich

  • Matei Zaharia, Stanford University & Databricks