Exploration of Workflow Management Systems Emerging Features from Users' Perspectives
Event Type
Extreme Scale Computing
Scalable Computing
Scientific Workflows
Time
Sunday, 17 November 2019
12:15pm - 12:30pm
Description
New workflow applications focused on data analytics and machine learning have recently emerged. This emergence has changed the workflow management landscape, prompting the development of new data-oriented workflow management systems (WMSs) as opposed to the earlier standard of task-oriented WMSs. In this paper, we summarize three general workflow use-cases and explore the unique requirements of each, in order to understand, from the user's perspective, how WMSs from both workflow management models (task-driven and data-driven) meet those requirements. We analyze the applicability of the two models by describing each one and by examining the variations of WMSs that fall under the task-driven model. To illustrate the strengths and weaknesses of each model, we summarize the key features of four production-ready WMSs: Pegasus, Makeflow, Apache Airflow, and Pachyderm. Three of them (Pegasus, Makeflow, and Apache Airflow) belong to the task-driven model, and one (Pachyderm) belongs to the data-driven model. To deepen our analysis, we implement three real-world use-cases that highlight the specifications and features of each WMS and demonstrate how each workflow management model operates with different applications. We present our final assessment of each WMS based on four factors: usability, performance, ease of deployment, and relevance. The purpose of this work is to offer insights, from the user's perspective, into the research challenges that WMSs currently face due to the evolving workflow landscape.