EarthCube-funded Pangeo Scientists Publish Research in Frontiers in Climate Journal
Analysis-ready, cloud-optimized (ARCO) datasets unlock the power of cloud computing at scale for scientists tackling society's most urgent questions. The EarthCube-funded Pangeo Forge team published its most recent ARCO research in the February issue of Frontiers in Climate. The paper, “Pangeo Forge: Crowdsourcing Analysis-Ready, Cloud Optimized Data Production,” explains that producing ARCO data is notoriously difficult because it demands both specialized scientific and computational expertise, which has historically limited how much ARCO data is produced.
Pangeo Forge is an open-source framework for Extraction, Transformation, and Loading (ETL) of scientific data. This illustration shows an example recipe in relation to the Pangeo Forge architecture.
“Our paper introduces the technical design and implementation of Pangeo Forge,” said Lamont-Doherty Earth Observatory Data Infrastructure Engineer Charles Stern. “We also outline our future outlook for the platform, which offers data users, including scientists, analysts, and students, the opportunity to curate a collection of ARCO data stores that are useful for their own work, and share both the provenance of those datasets, as well as the datasets themselves, with the community.”
Pangeo Forge, a new open-source platform, accelerates science by providing easy-to-use templates that open the door for a broader community of scientists to participate in ARCO data production. By crowdsourcing ARCO dataset creation, Pangeo Forge lays a foundation for reproducible, cloud-native, big-data ocean, weather, and climate science that does not rely on proprietary or cloud-vendor-specific tooling.
EarthCube is a community-driven activity sponsored by the National Science Foundation to transform research in the academic geosciences community. EarthCube aims to create a well-connected environment to share data and knowledge in an open, transparent, and inclusive manner, thus accelerating our ability to understand and predict the Earth’s systems. EarthCube membership is free and open to anyone in the geosciences, as well as those building platforms to serve the Earth sciences. The EarthCube Office is led by the San Diego Supercomputer Center (SDSC) on the UC San Diego campus.
Kimberly Mann Bruch, San Diego Supercomputer Center Communications, firstname.lastname@example.org
Lynne Schreiber, San Diego Supercomputer Center EarthCube Office, email@example.com
Pangeo Forge: https://pangeo-forge.readthedocs.io/en/latest/index.html
San Diego Supercomputer Center: https://www.sdsc.edu/
UC San Diego: https://ucsd.edu/
National Science Foundation: https://www.nsf.gov/