Advancing netCDF-CF for the Geoscience Community

This project aims to enhance netCDF-CF and increase its use in the geoscience community. To this end, the project team will organize community workshops focused on developing, refining, and reviewing enhancements to the standard. The team will add support for additional data types to netCDF-CF, broadening the range of geoscience domains whose data can be faithfully represented. This work will extend the standard’s mechanisms for representing spatial extents, develop extensions to support satellite data, and integrate an existing specification for radar data into the standard. It will also extend the standard to use advanced features of netCDF, creating enhancements to allow use of hierarchical data structures, support discrete samples, and take advantage of the data format’s structured metadata facilities.

Work on this project will take place in the context of the existing netCDF-CF community governing structure. The envisioned extensions will be developed in collaboration with other members of the netCDF-CF community, and can be approved for inclusion in the standard by following that community’s consensus-based modification process.

Project duration: September 2015 to August 2017 (extended to August 2018) 

First Workshop: May 24-26, 2016, Boulder CO (see Breakout Notes for results)

Second Workshop: Sept 6-8, 2017, Boulder CO (details forthcoming)

Overview

The Climate and Forecast (CF) metadata convention for netCDF (referred to here as netCDF-CF) is a community-developed standard first released in 2003. The netCDF-CF conventions were originally developed to represent climate and forecast model output encoded in the netCDF binary format, with the specific goal of facilitating comparison of output from different models. Subsequent development of the convention has broadened its scope to include observational data and derived products.

Storing data in compliance with the netCDF-CF standard benefits the geoscience community in several ways. Files are self-documenting, which means that researchers using the data know exactly which physical or derived quantities are stored. Units and names of quantities are standardized, reducing uncertainty when comparing data from different sources. And researchers can use common software tools and libraries, increasing their ability to share techniques and compare results.
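As a rough illustration of how standardized names and units reduce uncertainty when comparing data from different sources, the sketch below uses plain Python dictionaries as hypothetical stand-ins for netCDF variables. The attribute names `standard_name` and `units` come from the CF conventions; the variable names (`tas`, `temp_2m`, `pr`) and the helper `directly_comparable` are assumptions made for this example and are not part of any netCDF library.

```python
# Hypothetical in-memory stand-ins for netCDF variables from different sources.
# Only the attribute names (standard_name, units) come from the CF conventions;
# everything else here is illustrative.
model_var = {
    "name": "tas",
    "attrs": {"standard_name": "air_temperature", "units": "K"},
}
obs_var = {
    "name": "temp_2m",
    "attrs": {"standard_name": "air_temperature", "units": "K"},
}
precip_var = {
    "name": "pr",
    "attrs": {"standard_name": "precipitation_flux", "units": "kg m-2 s-1"},
}


def directly_comparable(a, b):
    """Return True when both variables carry the same standard_name and units.

    Simplification: units are compared as exact strings. A real tool would
    treat equivalent spellings (for example "K" and "kelvin") as matching.
    """
    keys = ("standard_name", "units")
    return all(
        a["attrs"].get(key) is not None
        and a["attrs"].get(key) == b["attrs"].get(key)
        for key in keys
    )


print(directly_comparable(model_var, obs_var))
print(directly_comparable(model_var, precip_var))
```

The two temperature variables have different local names, yet their shared metadata identifies them as the same quantity in the same units. The precipitation variable is recognized as a different quantity.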

Intellectual Merit

The need to minimize the effort scientists expend to discover, understand, and use data is a major theme of EarthCube discussions. The EarthCube “End-User Principal Investigator” workshop held in 2013 identified the development and adoption of community conventions and standards for metadata, data, and software that facilitate data management, documentation, exchange, and analysis as a mechanism for at least partly reducing the significant effort required to analyze each dataset for content and determine how to integrate it with other data. This proposal builds on an existing, widely used geoscience community data and metadata standard and extends it to support a wider range of data types and geoscience domains. By enhancing data interoperability, this work gives researchers the keys to a wider world of geoscience data, allowing them to unlock new insights and make new connections.

Broader Impact

The netCDF-CF community is international in scope and continues to grow across scientific domains. While netCDF-CF currently promotes data interoperability within a limited set of geoscience domains, the work proposed here would extend the standard to better represent datasets from a wider set of geoscience domains. The envisioned improvements to cross-domain data interoperability would help diverse groups faithfully represent and capture the meaning in their particular types of data, which has the potential to give rise to more effective, integrated decision-making tools for a wide range of communities.

Project Activities

Community collaboration and engagement

Adapting existing techniques and technologies

Numerous community-supported and commercial software tools exist to work with CF-compliant data. Examples include the netCDF-Java libraries, Ferret, the Live Access Server (LAS), the netCDF Operators (NCO), ESRI ArcGIS, and NASA Panoply. Because several of these software packages are developed and maintained by project team members, this project will focus on prototyping new netCDF-CF extensions using existing technologies rather than creating new tools.

Active engagement with the netCDF-CF governing structures

The project team includes members who have long been engaged in the netCDF-CF standards development process (Caron, Davis, Dixon, O’Brien, and Zender). Guided by the experience of these team members, the project team will use the netCDF-CF community’s established electronic forums (e-mail lists and GitHub discussions) to introduce potential extensions to the community, answer questions from other participants in the forums, and work to establish consensus in support of the proposed extensions.

Technical categories and activities

•    Improvements to netCDF-CF that support underrepresented geoscience domains
•    Improvements to netCDF-CF that support complex data footprints and spatial extents
•    Incorporation of the existing CFRadial conventions into netCDF-CF
•    Incorporation of features to support use with satellite data into netCDF-CF
•    Improvements to netCDF-CF to support Discrete Sampling Geometries
•    Incorporation of support for hierarchical data structures into netCDF-CF
•    Incorporation of support for hierarchical metadata structures into netCDF-CF
•    Additional enhancements to netCDF-CF based on recent advances in netCDF

For more details, contact Ethan Davis, lead PI. 

Quick Links

Workshop 1 (external site): May 24-26, 2016, Boulder CO (see Breakout Notes for results)

Project Files (google drive)

netCDF-CF draft extensions

Communications (telecons, forum, and mail list)

Community Standards (external sites)

CF Conventions

OGC netCDF-CF

 

Upcoming Events

Fall 2017 Workshop, September 6-8, 2017, Boulder CO (in planning)

 

Core Team

Ethan Davis (PI), UCAR/Unidata

Mike Dixon, NCAR/EOL

David Arctur (co-PI), University of Texas at Austin

Tim Whiteaker, University of Texas at Austin

Charles Zender (co-PI), University of California, Irvine

Nicholas Bond (co-PI), University of Washington/JISAO

Kevin O'Brien, University of Washington/JISAO

Aleksandar Jelenko (co-PI), The HDF Group

Dave Santek, University of Wisconsin / Space Sciences Engineering Center