
Many challenges hinder the seamless integration of models with data and compel scientists to perform the integration manually. The primary challenges stem from the knowledge latency between model and data resources; others derive from inadequate adoption and exploitation of information technologies. Knowledge latency challenges increase exponentially when a user aims to integrate long-tail data (data collected by individual researchers or small research groups) with long-tail models (models developed by individuals or small modeling communities). We focus on these long-tail resources because, despite their often-narrow scope, they have significant impacts in scientific studies and present an opportunity to address critical gaps through automated integration. The goal of this research is to develop a framework rooted in semantic techniques and approaches to support the integration of long-tail models and data.

Science Challenges 

Incorporating semantics into the data and model life cycle to advance:

  • Data-model integration: overcoming the semantic heterogeneity of the rapidly growing data and model collections will allow their seamless integration.
  • Data discovery: semantics will narrow the data discovery gap on the web, which is widening rapidly and limits the reusability and interoperability of data.
  • Data synthesis: linking data based on their information profile will minimize the complexity of data synthesis.
  • Model-model coupling: ensuring the semantic consistency of quantities exchanged between models, and providing tools for aligning their information profiles, is essential for cross-disciplinary model coupling.

  

Science Drivers

The GeoSemantics framework will directly augment the multidisciplinary interaction between different geoscience communities. We are building on two existing technologies: (1) SEAD (Sustainable Environmental Actionable Data), and (2) CSDMS (Community Surface Dynamics Modeling System). We are also collaborating with ongoing EarthCube initiatives, including GeoSoft, ESB (Earth System Bridge), SEN (Sediment Experimentalist Network), and eWELL (Workforce Education and Learning Library). We are building a flexible information system that is capable of increasing the interoperability of scientific data and models by:

  • creating standard tools for associating descriptive information with data and models;
  • allowing crosswalks between Controlled Vocabularies;
  • providing a low-barrier technology that lets scientists who are not experts in information systems contribute their information or update existing information.
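
As a rough illustration of what a crosswalk between Controlled Vocabularies involves, the Python sketch below maps a term from one vocabulary to another through a shared concept identifier. The vocabulary names (`vocab_a`, `vocab_b`), the terms, the concept identifiers, and the `crosswalk` function are all hypothetical placeholders; they are not taken from the GeoSemantics framework or from any published standard.

```python
# Hypothetical sketch: a crosswalk between two controlled vocabularies.
# All vocabulary names, terms, and concept identifiers are illustrative.

# Each vocabulary maps its own term to a shared concept identifier.
VOCABULARIES = {
    "vocab_a": {
        "precip_rate": "concept:precipitation_rate",
        "air_temp": "concept:air_temperature",
    },
    "vocab_b": {
        "rainfall_flux": "concept:precipitation_rate",
        "temperature_of_air": "concept:air_temperature",
        "soil_moisture": "concept:soil_water_content",
    },
}


def crosswalk(term, source, target):
    """Return the target-vocabulary term for `term`, or None if unmapped."""
    concept = VOCABULARIES[source].get(term)
    if concept is None:
        return None
    # Look for a target term that points at the same shared concept.
    for target_term, target_concept in VOCABULARIES[target].items():
        if target_concept == concept:
            return target_term
    return None


if __name__ == "__main__":
    print(crosswalk("precip_rate", "vocab_a", "vocab_b"))
    print(crosswalk("soil_moisture", "vocab_b", "vocab_a"))
```

Routing every mapping through a shared concept, rather than storing direct term-to-term pairs, means a newly contributed vocabulary only needs to be linked to the concepts once to become reachable from every other vocabulary.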

Project Summary 

  • Motivations: Driving motivations for advancing the interoperability of data and models are: (i) increasing the productivity of scientists and research groups, (ii) repurposing quality ed resources for new research objectives, (iii) providing flexibility for interdisciplinary research and collaboration, and (iv) enhancing the quality of available resources.
  • Vision: Support semantic interoperability between the rapidly growing long-tail model and data resources by using the Linked Data and micro-web services approaches.
  • Goals: Develop a decentralized knowledge-based platform that allows semantically heterogeneous systems to interact with minimal human intervention.
  • Design Concept: We are building a collaborative knowledge management system that ingests the available standards, supports the formalization of semantic definitions for physical processes across geoscience communities, and provides web services for semantic mediation and matching between resources, including data and models.
  • Keywords: Data Discovery, Model-Data Integration, Semantic Interoperability, Linked Data, Micro-services, Ontologies, Metadata

Technical Approach 

The GeoSemantics framework is a decentralized framework that combines Linked Data and RESTful web services to annotate, connect, integrate, and reason about the integration of geoscience resources. The framework supports the semantic enrichment of web resources and semantic mediation among heterogeneous geoscience resources, such as models and data.

  • It uses a micro-service architecture to close the semantic loop among data, models, and Controlled Vocabularies (CVs).
  • It provides three sets of micro-services:
  1. Knowledge Integration Services (KIS), which ingest, register, and check in Controlled Vocabularies and W3C standards to the framework’s knowledge base;
  2. Semantic Annotation Services (SAS), which annotate resources with their spatiotemporal context, variables, and provenance relationships, either by running automatic extractors selected by the data file’s MIME type (e.g., GeoTIFF and CSV types) or by providing an interactive interface for manual annotation;
  3. Resource Alignment Service (RAS), which is a scientific workflow that aligns the attributes associated with two geo-resources to ensure their semantic consistency before integration.
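
To make the division of labor between annotation and alignment more concrete, here is a minimal Python sketch. It assumes a registry that picks an extractor from a MIME type (falling back to manual annotation when none matches) and a simple alignment step that compares the variables a model requires with those a dataset provides. The function names (`extract_csv`, `annotate`, `align`), the annotation layout, and the sample variables are hypothetical and do not reflect the actual SAS or RAS interfaces; a GeoTIFF extractor is omitted because it would need a raster library.

```python
# Hypothetical sketch of MIME-type-driven annotation followed by alignment.
# Names and data layouts are assumptions, not the framework's real API.
import csv
import io


def extract_csv(content):
    """Treat the first CSV row as the list of variable names."""
    reader = csv.reader(io.StringIO(content))
    header = next(reader, [])
    return {"variables": [name.strip() for name in header]}


# Registry mapping MIME types to automatic extractor functions.
EXTRACTORS = {
    "text/csv": extract_csv,
}


def annotate(mime_type, content):
    """Run the matching extractor, or flag the resource for manual annotation."""
    extractor = EXTRACTORS.get(mime_type)
    if extractor is None:
        return {"mime_type": mime_type, "variables": [], "needs_manual_annotation": True}
    annotation = extractor(content)
    annotation["mime_type"] = mime_type
    annotation["needs_manual_annotation"] = False
    return annotation


def align(required, provided):
    """Split the variables a model requires into matched and missing ones."""
    provided_set = set(provided)
    return {
        "matched": [name for name in required if name in provided_set],
        "missing": [name for name in required if name not in provided_set],
    }


if __name__ == "__main__":
    dataset = annotate("text/csv", "time,discharge,stage\n0,1.5,0.3\n")
    print(align(["discharge", "precipitation"], dataset["variables"]))
```

In this sketch the alignment compares raw variable names; a fuller version would first translate both sides into a common vocabulary so that differently named but equivalent quantities are recognized as a match.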

 

Benefits to Scientists 

  • Advances the interoperability of model and data resources using semantic annotations.
  • Allows crosswalks between standard names using a collaborative knowledge management system.
  • Augments semantic mediation and matching between models and data with minimal human intervention.

Resources: http://hcgs.ncsa.illinois.edu/ 

 

Publications:

  • Elag, M. M., Kumar, P., Marini, L., Myers, J. D., Hedstrom, M., and Plale, A. B. (2014) Characterization of Emergent Data Networks among Long-Tail Data, Abstract Vol. 16, EGU2014-7844-1, EGU General Assembly 2014.
  • Elag, M. M., Kumar, P., Marini, L., Lui, R., and Jiang, P. (2015) Geo-Semantic Framework for Integrating Long-Tail Data and Model Resources for Advancing Earth System Science, EarthCube All-hands meeting, Arlington, VA, 26-28 May, 2015.
  • Elag, M. M., and Kumar, P. (2015) GeoSemantic Framework for Integrating Data and Model Resources For Advancing Earth System Science, 3rd CUAHSI Conference on Hydroinformatics meeting, Tuscaloosa, Alabama, 14-17 July, 2015.
  • Jiang, P., Elag, M. M., and Kumar, P. (2015) Geosemantic Resource Alignment Service, CSDMS annual meeting, CO, 26-28 May, 2015.
  • Elag, M. M., and Kumar, P. (2015) Semantic Annotation Framework for Long-tail Resources, Research Data Alliance Fourth Plenary Meeting, San Diego, CA, 6-10 March, 2015.

 

Webinars:

1- EC-Geosemantics Webinar