Building Block on Integrating Discrete and Continuous Data:
Bridging the “digital divide” between hydrologic and atmospheric sciences
Concept map of potential connections with other EarthCube Building Blocks (April 2013)
While we seek to understand similar phenomena in atmospheric data and water resource data, such as precipitation and evaporation, the information and tools for studying these phenomena are quite different. In the hydrological science studied at CUAHSI, there is an emphasis on time series of data collected at point locations, and a study typically involves a narrow region of space, a deep horizon of time, and synthesis of observed and modeled information for a few variables over this domain of space and time. In the atmospheric science information conveyed by Unidata, the focus is on near-term synoptic scale observations and modeling of a large number of interrelated variables conveyed as multidimensional arrays covering a large region of space for a short interval of time.
Fundamentally, these classes of information can be described as spatially discrete or spatially continuous. Spatially discrete data are collections of points, lines, and areas that represent geographical features and their locations over a study domain. Spatially continuous data are raster grids and multidimensional arrays distributed regularly through a study domain. Time series are more readily shared in hydrology, while gridded arrays are more readily shared in atmospheric science. There are also semantic differences in how the same variables are described in the two sciences: precipitation, for example, can have different names and meanings depending on the information source. We are seeking to develop a common information model and tools for unambiguous, interoperable exchange of data across these domains.
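To make the discrete/continuous distinction concrete, the following is a minimal sketch, not the project's actual tooling. It assumes a tiny regular latitude/longitude grid held in nested lists and a hypothetical gauge location, and it uses nearest-neighbor lookup (an assumption chosen for simplicity) to pull a point time series out of a gridded array. The names `PointSeries`, `Grid`, and `sample_grid_at_point` are illustrative only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PointSeries:
    """Spatially discrete data: one variable at one location over time."""
    site_id: str
    lat: float
    lon: float
    times: List[str]
    values: List[float]


@dataclass
class Grid:
    """Spatially continuous data: values[t][i][j] on a regular lat/lon grid."""
    lats: List[float]
    lons: List[float]
    times: List[str]
    values: List[List[List[float]]]


def nearest_index(axis: List[float], target: float) -> int:
    """Index of the axis value closest to target."""
    return min(range(len(axis)), key=lambda k: abs(axis[k] - target))


def sample_grid_at_point(grid: Grid, site_id: str, lat: float, lon: float) -> PointSeries:
    """Extract the time series of the grid cell nearest to (lat, lon)."""
    i = nearest_index(grid.lats, lat)
    j = nearest_index(grid.lons, lon)
    series = [grid.values[t][i][j] for t in range(len(grid.times))]
    return PointSeries(site_id, grid.lats[i], grid.lons[j], list(grid.times), series)


# Made-up precipitation values on a 2 x 2 grid for two time steps.
grid = Grid(
    lats=[30.0, 30.5],
    lons=[-98.0, -97.5],
    times=["2013-04-01T00:00Z", "2013-04-01T01:00Z"],
    values=[
        [[0.0, 1.2], [0.4, 2.0]],
        [[0.1, 0.8], [0.0, 3.5]],
    ],
)

gauge = sample_grid_at_point(grid, "hypothetical_gauge", 30.4, -97.6)
print(gauge.values)  # [2.0, 3.5]
```

The grid carries every cell for a short time window, while the extracted series carries one location through time; a real workflow would likely use interpolation or area averaging rather than a nearest-cell lookup.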
The first step is to develop a common information model that represents both discrete and continuous data at various temporal scales. In hydrology, one time series represents one variable, whereas a single gridded array can describe many variables and time sampling schemes. CUAHSI has developed an XML-based data structure called WaterML, which describes the type of observation and then presents the time series data for one variable. Unidata works primarily with netCDF, which implements the multidimensional array structure. Because the multidimensional array approach is the more flexible of the two, we are investigating the best ways to express WaterML content in netCDF’s information model, using the Climate & Forecasting (CF) Conventions version 1.6 with Discrete Sampling Geometries.
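As a rough illustration of what placing a single-variable time series into a CF-style layout could look like, here is a minimal sketch. It does not use a netCDF library: a plain dictionary stands in for the file's dimensions, variables, and attributes. The function name `to_cf_timeseries`, the dictionary layout, the site identifier, and the sample values are all assumptions made for illustration. The attribute names (`Conventions`, `featureType`, `cf_role`, `standard_name`, `coordinates`) follow the CF 1.6 Discrete Sampling Geometries conventions for a single time series.

```python
def to_cf_timeseries(site_id, lat, lon, time_offsets, time_units,
                     values, var_name, standard_name, units):
    """Arrange one point time series as a CF-1.6 'timeSeries' feature.

    Returns a plain dict that mimics a netCDF dataset; it is a stand-in
    for illustration, not a real file or a real library object.
    """
    if len(time_offsets) != len(values):
        raise ValueError("time_offsets and values must have the same length")
    return {
        "attributes": {"Conventions": "CF-1.6", "featureType": "timeSeries"},
        "dimensions": {"time": len(values)},
        "variables": {
            # Identifies the station this series belongs to.
            "station_name": {"dims": (), "data": site_id,
                             "attrs": {"cf_role": "timeseries_id"}},
            "time": {"dims": ("time",), "data": list(time_offsets),
                     "attrs": {"standard_name": "time", "units": time_units}},
            "lat": {"dims": (), "data": lat,
                    "attrs": {"standard_name": "latitude",
                              "units": "degrees_north"}},
            "lon": {"dims": (), "data": lon,
                    "attrs": {"standard_name": "longitude",
                              "units": "degrees_east"}},
            # The observed variable, tied to its location via 'coordinates'.
            var_name: {"dims": ("time",), "data": list(values),
                       "attrs": {"standard_name": standard_name,
                                 "units": units,
                                 "coordinates": "lat lon station_name"}},
        },
    }


# Hypothetical gauge record: hourly precipitation totals (made-up values).
dataset = to_cf_timeseries(
    site_id="hypothetical_gauge",
    lat=30.5, lon=-97.5,
    time_offsets=[0, 1, 2],
    time_units="hours since 2013-04-01 00:00:00",
    values=[0.0, 2.0, 3.5],
    var_name="precip",
    standard_name="precipitation_amount",
    units="kg m-2",
)
print(sorted(dataset["variables"]))
```

Using a CF standard name such as `precipitation_amount` for the variable is one way to address the naming differences between sources, since the source-specific name (`precip` here) stays local while the standard name carries the shared meaning.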
As this work progresses, we are also developing tools to better visualize these relationships in space and time, which should help other geoscience domains take advantage of this approach.
Scientists in hydrology, hydraulics, atmospheric science, meteorology, and informatics are involved in this project, helping to ensure that we are solving the right problems and producing correct results. Our demonstration project is the National Flood Interoperability Experiment, sponsored by USGS, the NOAA National Weather Service, the US Army Corps of Engineers, and FEMA. We have state-level engagement with the Iowa Flood Center and the Texas Division of Emergency Management, along with 20 participating institutions.
Benefits to Scientists
Besides its direct benefits to climate, weather, and hydrology scientists, this work can be applied in other domains that work with multidimensional spatial and temporal data, such as seismic analysis, soil chemistry changes over deep time, and oceanography.