Finalized: Saturday, November 1, 2014
Author(s): Leonard, L., and C. J. Duffy
The prototype discussed in this article retrieves Essential Terrestrial Variable (ETV) web services and uses data-model workflows to transform ETV data for hydrological models in a distributed computing environment. The ETV workflow is a service layer over hundreds of terabytes of national datasets bundled for fast data access in support of watershed modeling at the United States Geological Survey (USGS) Hydrological Unit Code (HUC) level-12 scale. The ETV data have been proposed as the Essential Terrestrial Data necessary to construct watershed models anywhere in the continental USA (CONUS) (Leonard and Duffy, 2013). Here, we present the hardware and software system designs that support the ETV, data-model, and model workflows using High Performance Computing (HPC) and service-oriented architecture. This infrastructure design is an important contribution to both how and where the workflows operate. We describe in detail how these workflow services operate in a distributed manner for modeling CONUS HUC-12 catchments, using the Penn State Integrated Hydrological Model (PIHM) as an example. The prototype is evaluated by generating data-model workflows for every CONUS HUC-12 (∼100 km²) and creating a repository of workflow provenance for each one, which researchers can use as a starting point for a new hydrological model study. The concept of provenance for data-model workflows developed here assures reproducibility of model simulations (e.g., reanalysis) from ETV datasets without storing model results, which we have shown would require many petabytes of storage.
Leonard, L., and C. J. Duffy, 2014: Automating data-model workflows at a level 12 HUC scale: Watershed modeling in a distributed computing environment. Environmental Modelling & Software, 61, 174–190, doi:10.1016/j.envsoft.2014.07.015.
This material is based upon work supported by the National Science Foundation under Grant No. 1440332. Opinions, findings, conclusions or recommendations expressed are those of the authors and do not reflect the views of the NSF.