Scientific workflows provide the means to define, execute and reproduce computational experiments. However, reusing existing workflows still poses challenges for workflow designers. Workflows are often too large and too specific to reuse in their entirety, so reuse is more likely to happen for fragments of workflows. These fragments may be identified manually by users as sub-workflows, or detected automatically. In this paper we present the FragFlow approach, which detects workflow fragments automatically by analyzing existing workflow corpora with graph mining algorithms. FragFlow detects the most common workflow fragments, links them to the original workflows and visualizes them. We evaluate our approach by comparing FragFlow results against user-defined sub-workflows from three different corpora of the LONI Pipeline system. Based on this evaluation, we discuss how automated workflow fragment detection could facilitate workflow reuse.
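To make the idea of detecting common fragments concrete, the following is a minimal sketch and not FragFlow's actual algorithm. It assumes each workflow is reduced to a set of directed edges between step labels, and it counts only very small fragments (single edges and two-edge chains) by the number of workflows that contain them. Full graph mining algorithms search for larger subgraphs. The corpus, step names, and function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy corpus: workflow id -> set of directed edges between
# step labels (producer, consumer). All names are illustrative assumptions.
CORPUS = {
    "wf1": {("align", "skull_strip"), ("skull_strip", "segment"), ("segment", "report")},
    "wf2": {("align", "skull_strip"), ("skull_strip", "segment")},
    "wf3": {("convert", "align"), ("align", "skull_strip")},
}


def _filter_by_support(occurrences, min_support):
    """Keep fragments found in at least min_support workflows."""
    return {
        fragment: sorted(workflow_ids)
        for fragment, workflow_ids in occurrences.items()
        if len(workflow_ids) >= min_support
    }


def frequent_edge_fragments(corpus, min_support):
    """Map each frequent single-edge fragment to the workflows containing it."""
    occurrences = defaultdict(set)
    for workflow_id, edges in corpus.items():
        for edge in edges:
            occurrences[edge].add(workflow_id)
    return _filter_by_support(occurrences, min_support)


def frequent_chain_fragments(corpus, min_support):
    """Map each frequent two-edge chain (a -> b -> c) to its workflows."""
    occurrences = defaultdict(set)
    for workflow_id, edges in corpus.items():
        for source, middle in edges:
            for start, target in edges:
                if middle == start:
                    occurrences[(source, middle, target)].add(workflow_id)
    return _filter_by_support(occurrences, min_support)


if __name__ == "__main__":
    for fragment, workflows in frequent_chain_fragments(CORPUS, 2).items():
        print(" -> ".join(fragment), "appears in", workflows)
```

Returning the list of workflow ids alongside each fragment mirrors, in a simplified way, the idea of linking detected fragments back to the workflows they came from.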
Daniel Garijo, Oscar Corcho, Yolanda Gil, Boris A. Gutman, Ivo D. Dinov, Paul Thompson and Arthur W. Toga, 2014. FragFlow: Automated Fragment Detection in Scientific Workflows. Proceedings of the IEEE Conference on e-Science, Guarujá, Brazil, October 20-24, 2014. doi:10.1109/eScience.2014.32.

This material is based upon work supported by the National Science Foundation under Grant Nos. 1343800 and 1440323. Opinions, findings, conclusions or recommendations expressed are those of the authors and do not reflect the views of the NSF.