EarthCube Tools Designed for Scientists as End Users

These are finished tools (or nearly finished tools) that were designed for scientists as end users rather than as internal components of an EarthCube architecture.  Links: (YouTube playlist of EarthCube Tools.)

Table Key

Readiness: (on a 1-5 scale)

  • 1/2/3 – “in progress”
  • 4 – “almost ready”
  • 5 – “ready to use”

Advancing netCDF-CF

Short Tool Description:

Increase the types of data that can be represented as netCDF-CF data to better support a larger segment of the earth system science community.

Tool category: 1) Standard data format; 2) Data access, analysis, and visualization

Readiness: (5) Gridded data; Timeseries, soundings, aircraft tracks; Unstructured grids (e.g., triangular mesh); CF-Radial: radial data for radar and lidar

(4): Timeseries for a polyline or polygon

(1-3): Satellite swath data; Data quality and uncertainty

Scientists Sought: Scientists with data they would like to make more accessible in a variety of tools. Scientists interested in tools that handle standard compliant data.

Contact: Ethan Davis

Links: Slides   Video  GitHub
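
As a rough illustration of what "standard compliant data" means here, the sketch below checks a few metadata attributes that the CF conventions rely on. It uses plain dictionaries rather than a netCDF library, and `check_cf_metadata` and the sample dataset are hypothetical names for this example only. The checks are deliberately simplified: real CF compliance covers far more, and not every variable needs `units`.

```python
# featureType values from the CF discrete sampling geometries chapter.
CF_FEATURE_TYPES = {
    "point", "timeSeries", "trajectory",
    "profile", "timeSeriesProfile", "trajectoryProfile",
}


def check_cf_metadata(global_attrs, variables):
    """Return a list of problems found; an empty list means none were found.

    Covers only a handful of CF expectations, not the full conventions.
    """
    problems = []
    if not str(global_attrs.get("Conventions", "")).startswith("CF-"):
        problems.append("global attribute 'Conventions' should name a CF version")
    feature_type = global_attrs.get("featureType")
    # Exact-match comparison is a simplification made for this sketch.
    if feature_type is not None and feature_type not in CF_FEATURE_TYPES:
        problems.append(f"unknown featureType: {feature_type!r}")
    for name, attrs in variables.items():
        if "units" not in attrs:
            problems.append(f"variable {name!r} has no 'units'")
        if "standard_name" not in attrs and "long_name" not in attrs:
            problems.append(
                f"variable {name!r} has neither 'standard_name' nor 'long_name'"
            )
    return problems


# Hypothetical station timeseries described with plain dictionaries.
global_attrs = {"Conventions": "CF-1.8", "featureType": "timeSeries"}
variables = {
    "time": {"units": "seconds since 1970-01-01 00:00:00", "standard_name": "time"},
    "air_temperature": {"units": "K", "standard_name": "air_temperature"},
    "humidity": {"long_name": "relative humidity"},  # 'units' deliberately omitted
}

problems = check_cf_metadata(global_attrs, variables)
print(problems)
```

In this sample only the `humidity` variable is flagged, because its `units` attribute was left out on purpose.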

CHORDS

Short Tool Description:

Cloud-Hosted Real-time Data Services for the Geosciences (CHORDS) is a real-time data services infrastructure that will provide an easy-to-use system to acquire, navigate and distribute real-time data streams via cloud services and the Internet. It will lower the barrier to these services for small instrument teams, employ data and metadata formats that adhere to community accepted standards, and broaden access to real-time data for the geosciences community.

Tool category: Real-time data

Readiness: (4) In use by “friendly” users.

Scientists Sought: Scientists who would like to manage their real-time data online and provide them in standard formats; Scientists who would like to use real-time data in their experiments.

Contact: Mike Daniels

Links: Slides   Video

Data Discovery Studio

Short Tool Description:

Data discovery and exploration for geosciences. Now features a geoportal interface with over 1,000,000 searchable records. Any user can contribute links to favorite resources so those repositories and datasets become searchable. Hosts a large inventory of high quality geoscience information resources, with standard metadata and traceable provenance. Improves metadata descriptions via a scalable metadata augmentation pipeline. Enables standards-based data discovery across the geosciences.

Tool category: Data discovery (with filtering, spatial selection, metadata enhancement) & data exploration workbench; built on CINERGI search engine

Readiness: (5)  Active improvement through the EarthCube Data Discovery Hub project; inventory and functions continuously extended

Scientists Sought: researchers with new discovery use cases; repository data curators; communities

Contact: Ilya Zaslavsky

Links: Slides  Video

Digital Crust: Macrostrat Component

Short Tool Description:

The Macrostrat component of Digital Crust offers a comprehensive, general geological description of the upper crust. Geological maps, geological columns that include the subsurface, and a wide range of data linked to rock units are available. A mobile application (Rockd) built on this infrastructure allows users to make field observations and link them to existing geological data. A 3D gridded permeability model has also been produced and is available.

Tool category: Geological data aggregation, relation, distribution and analysis.

Readiness: (5)

Scientists Sought: Geoscientists whose research intersects the upper crust and application developers who need API-based access to geological data and gridded models of rocks and their properties.

Contact: Shanan Peters

Links: Slides  Video

Drilsdown iPython_IDV

Short Tool Description:

3D visualizations in the IDV can be logged in a Jupyter notebook and published in a RAMADDA repository as a "Case Study" object. "Teleport" functionality for IDV allows Case Studies to be batch-created (with data fetched) from a list of lat-lon-time coordinates, ready for quick, nimble human inspection.

Tool category: Visualization and case study documentation

Readiness: (5) Features will be enhanced, but the core works already

Scientists Sought: Atmosphere and ocean 3D dynamics

Contact: Brian Mapes

Links: Video

Earth System Bridge

Short Tool Description:

The primary goal of Earth System Bridge is to create interoperability of models, despite the fact that they may have been created with a wide variety of frameworks, conventions, programming languages, and variable names.

Tool category: Model interoperability

Readiness:

Scientists Sought:

Contact: Scott Peckham

Links: Slides  Video

EarthCollab

Short Tool Description:

New systems to find research resources (data, projects, publications), and people with particular expertise

Tool category: Resource discovery (e.g. data, information, projects); and information sharing

Readiness: (5) Connect UNAVCO & Arctic Data Connects; (3) VIVO Cross-linking software development

Scientists Sought: Geoscientists and other data users (e.g. educators or other public stakeholders), Geoinformatics experts

Contact: Matt Mayernik

Links: Slides   Video

EarthLife Consortium API

Short Tool Description:
The Earth-Life Consortium (ELC) seeks to make all paleobiological data easily discoverable, accessible, and analyzable, with the larger goal of understanding the interactions between the Earth’s biological and geophysical systems across all timescales of the Earth’s history. Initial efforts are focusing on building a common search interface for paleobiological and paleoecological data stored in the Paleobiology Database and Neotoma Paleoecology Database. Other researchers and organizations interested in joining the Earth-Life Consortium are encouraged to contact the PIs listed on this page.

Tool category: paleontological data resource
Readiness: (5) A few bells and whistles will be added, but it is ready to use now.
Contact: Mark D. Uhen

ECITE - EarthCube Integration and Test Environment

Short Tool Description:

ECITE provides access to cloud-based computational resources and facilitates assessment and evaluation of technologies, ensuring compatibility with EarthCube interoperability and integration criteria. Its EarthCube Assessment Framework organizes science use cases for technology assessment toward use case solutions and identification of remaining gaps.

Tool category: Test bed, prototype

Readiness: (4) As a finished prototype, functionality by definition is "almost ready"

Scientists Sought: technology developers and evaluators

Contact: Sara Graves
Links: Slides  Video

ECOGEO Virtual Machine

Short Tool Description:

Provides introduction to using the command line to run bioinformatic tools. Contains a virtual machine with all necessary data sets and tools, alongside presentations and workflows.

Tool category: Temporary workbench

Readiness: (5)

Scientists Sought: Anyone looking to use ‘omics tools to answer research questions.

Contact: Elisha Wood-Charlson

Links: Slides  Video

Ensemble Toolkit

Short Tool Description:
Ensemble Toolkit (EnTK) is a Python framework for developing and executing applications composed of multiple sets of tasks, also known as ensembles. EnTK has the following unique features: (i) abstractions that enable the expression of various task graphs, (ii) abstraction of resource management and task execution, (iii) fault tolerance as a first-order concern, and (iv) well-established runtime capabilities to enable efficient and dynamic usage of grid resources and supercomputers.
 
Tool category: ensemble execution system                  
Readiness: (5) Currently used by domain scientists in molecular science, climate science, seismology, polar sciences; tested on several HPC systems.
Scientists Sought: EnTK is invariant to the application workload and the target resource. EnTK can be used by any scientist whose application consists of multiple ensembles of tasks (MPI, multi-threaded, serial, or GPU).
Contact: Vivek Balasubramanian, Matteo Turilli, Shantenu Jha
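
The ensemble idea described above can be illustrated with a small standard-library sketch. This is not the EnTK API: `run_with_retries`, `run_ensemble`, `run_stages`, and the sample tasks are hypothetical names invented for this example. It uses local threads as a stand-in for the HPC resources that EnTK targets, and it only shows the general pattern of running sets of tasks in stages and retrying tasks that fail.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(task, arg, max_retries=2):
    """Run task(arg), retrying up to max_retries times if it raises."""
    for attempt in range(max_retries + 1):
        try:
            return task(arg)
        except Exception:
            if attempt == max_retries:
                raise  # give up after the final attempt


def run_ensemble(task, args, max_workers=4, max_retries=2):
    """Run one ensemble: the same task over many inputs, concurrently."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_with_retries, task, a, max_retries) for a in args]
        # Collect results in submission order, not completion order.
        return [f.result() for f in futures]


def run_stages(stages, initial_inputs):
    """Run ensembles one after another; each stage's outputs feed the next."""
    data = list(initial_inputs)
    for task in stages:
        data = run_ensemble(task, data)
    return data


# Hypothetical tasks: the first fails once per input to exercise the retry path.
_seen = set()
_lock = threading.Lock()


def flaky_square(x):
    with _lock:
        first_call = x not in _seen
        _seen.add(x)
    if first_call:
        raise RuntimeError(f"transient failure for {x}")
    return x * x


def add_one(x):
    return x + 1


result = run_stages([flaky_square, add_one], [1, 2, 3])
print(result)  # [2, 5, 10]
```

Each input fails once in the first stage and succeeds on retry, so the pipeline still completes. A production framework would also handle resource acquisition, scheduling, and persistent task state, none of which this sketch attempts.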

 

 

ePANDDA - enhancing Paleontological and Neontological Data Discovery API

Short Tool Description:
The ePANDDA API provides synthetic information about organisms in space and time. This includes their geographic occurrence, mention in publications, location in specimen repositories such as museums, and links to media (images and 3D scans) for both modern and fossil taxa.
Tool category: Application Programming Interface (API)
Readiness: (4) A prototype version is available and functional, but not completely finished.
Scientists Sought: biogeographers, systematists, functional morphologists, evolutionary biologists, ecologists, climatologists, conservation biologists, oceanographers, and petroleum geologists
Contact: Jocelyn Sessa
Links: Slides

Flyover Country

Short Tool Description:

An offline mobile app for geoscience outreach and data discovery. Offline geologic maps and interactive points of interest reveal the locations of fossils and georeferenced Wikipedia articles visible from your airplane window seat, vehicle, or hiking trail vista. Download through the Apple App Store or Google Play.

Tool category: Digital resources for both field work and public outreach

Readiness: (5)

Scientists Sought: Available to all geoscientists as well as members of the public.

Contact: Amy Myrbo

Links: Slides  Video

GeoDataspace/GeoTrust

Short Tool Description:

Assists scientists and communities in creating and maintaining collections of data and model runs for specific research projects. Example: a GeoDataspace for a collaborative model would provide a single handle to various model-related data items and source codes, offering benefits of shareability, reusability, and reproducibility during model development, testing, and validation. 

Tool category: Reproducibility, collaboration

Readiness: (5) container, (3) reproducibility

Scientists Sought: All scientists interested in reproducible science.

Contact: Tanu Malik

Links: Slides  Video

 

GeoDeepDive

Short Tool Description:

A digital library backed by publisher agreements and computing infrastructure. Documents have been pre-processed with OCR/NLP tools, and the full text is indexed against domain dictionaries. Example software and a basic discovery-focused API are available.

Tool category: Published scientific literature

Readiness: (5)

Scientists Sought: Anyone who needs to programmatically read published literature and extract or summarize information from it.

Contact: Shanan Peters

Links: Slides  Video

     

GeoSemantics

Short Tool Description:

Enable interoperability of heterogeneous model and data resources developed/produced by scientists and data professionals.

Tool category: Integrating Long-tail Models and Data

Readiness: (4)

Scientists Sought: Data and model providers and users looking for tools to integrate models and data in a cloud platform, enrich the semantic information of their data, and conduct semantic search among the annotated data.

Contact: Praveen Kumar

Links: Slides   Video

 

 

 

ICEBERG

Short Tool Description:
This tool makes it easier to apply workflows on high-resolution satellite imagery at very large spatial extents. Our use cases span a number of disparate applications including biological feature detection, land cover classification, finding hydrological features, and terrain modelling. The common element of all ICEBERG's applications is the use of very large image databases that require the use of high performance and/or distributed computing for completion, and the development of tools to enable image processing using open source tools that can be parallelized across a computing cluster.

Tool category: computing tools for imagery analysis
Readiness: (3) We expect to release an initial version of the seal detection use case by the end of July 2018.
Scientists Sought: Anyone using high-resolution imagery for classification or analysis
Contact: Heather Lynch
Links: Slides  Video

iMicrobe

Short Tool Description:

iMicrobe provides users with a freely available web-based platform to: (1) maintain and share project sequence data, relevant contextual metadata, and analysis products, (2) search for related public data sets, and (3) run analysis tools on highly-scalable computing resources.

Tool category: Discovery and analysis platform for microbial sequence data

Readiness: (5) iMicrobe is fully functional and contains many data sets and tools for users to discover public data, combine it with private data to create unique data sets, and run tools on HPC for analysis and visualization.

Scientists Sought: Anyone curious about microbial processes in Earth systems.

Contact: Bonnie Hurwitz

iSamples

Short Tool Description:

The iSamples RCN aims to improve the discovery, access, and sharing of physical samples by promoting best practices, including the use of the IGSN (International Geo Sample Number). iSamples has developed customizable Sample Management Training Modules and has facilitated development of MARS (Middleware for Assisting Registration of Samples).

Tool category: Unique Identifiers, Community Activities

Readiness: (5)

Scientists Sought: Anyone who works with physical samples or data generated from them

Contact: Kerstin Lehnert, Megan Carter

Links: Slides   Video

LinkedEarth

Short Tool Description:

LinkedEarth is an EarthCube-funded project aiming to better organize and share Earth Science data, especially paleoclimate data. LinkedEarth facilitates the work of paleoclimatologists by empowering them to curate their own data and to use cutting-edge data-analytical methods tailored to them.

Tool category: Community Activity (standard development in paleoclimatology)

Readiness: (4) completing development of a data standard will get this tool to (5)

Scientists Sought: paleoclimatologists and other paleogeoscientists, climate modelers, climate dynamicists

Contact: Julien Emile-Geay

Links: Slides   Video

METATRYP

Short Tool Description:

A web interface to examine the presence of peptide sequences within marine microbial genomes and metagenomes to infer least common ancestor taxonomic information. This program also provides the taxonomic attribution capability running behind the Ocean Protein Portal through an API.

Tool category: Data search and discovery

Readiness: (5)

Scientists Sought:  Proteomics domain scientists involved in interpreting ocean protein data and designing mass spectrometry assays for protein quantitation.

Contact: Mak Saito, Danie Kinkade, metatryp@whoi.edu

Ocean Protein Portal

Short Tool Description:

A web portal to search for the occurrence of proteins within ocean metaproteomic datasets, examine their distributions and their taxonomic attribution

Tool category: Data resource for discovery and access, analysis, and visualization

Readiness: (4)

Scientists Sought: Scientists interested in proteins in environmental settings, including oceanographers, biochemists, biogeochemists, and microbial oceanographers. Also of educational use for chemistry and oceanography classes.

Contact: Mak Saito, Danie Kinkade, oceanproteinportal@whoi.edu

OntoSoft

Short Tool Description:

OntoSoft is a software metadata registry that contains semantic descriptions for hundreds of geosciences models and other useful software.  The descriptions are geared to scientists.

Tool category: Software Registry; Training

Readiness: (5)

Scientists Sought: Anyone can add descriptions of their own software to the repository.  Several communities have set up sites that are federated with OntoSoft.

Contact: Yolanda Gil

Links: Slides  Video

 

 

Pangeo
  
Short Tool Description:
Pangeo is a general-purpose Python computational environment for working with Big Geoscience Data. It allows you to leverage a high-performance computing system or cloud computing cluster to scale your Python analysis to extremely large datasets.
Tool category: Big Data, Python, netCDF
Readiness: (5) Hundreds of scientists are already using Pangeo
Scientists Sought: Our users already include climate / ocean / atmosphere scientists working with large netCDF-style datasets. We are interested in exploring the application of Pangeo to solid-earth geophysics and are actively seeking collaborators in that field.
Contact: We prefer users to interact with our team via GitHub rather than email.
Links: Slides  Video 1  Video 2

SeaView

Short Tool Description:

SeaView creates deeply integrated data collections, drawing oceanographic data from multiple repositories around scientific themes, and providing them in ODV and netCDF formats.

Tool category: Data Resource

Readiness: (5) four data collections ready to use

Scientists Sought: oceanographers interested in integrated water column data collections.

Contact: Karen Stocks, seaviewdata@gmail.com

Links: Slides   Video   Data collections

 

Sediment Experimentalist Network-Knowledge Base (SEN-KB)

Short Tool Description:
The Sediment Experimentalist Network Knowledge Base (SEN-KB) is a resource for researchers in Earth-surface and sedimentary research communities to exchange information about datasets, facilities, methods, equipment, and workflows for laboratory experiments. The website is a collaborative wiki that is easy to search and access. Though SEN-KB does not itself host datasets, the Sediment Experimentalist Network Research Coordination Network (SEN RCN) has partnered with the Sustainable Environmental Actionable Data (SEAD: http://sead-data.net/) for support in storing and publishing datasets associated with entries in SEN-KB.

Tool category: Resource discovery (data, workflows)
Readiness: (5) But we are always looking for feedback to improve usability.
Scientists Sought: Scientists doing laboratory experiments on Earth-surface and sedimentary processes. Scientists seeking to obtain data or workflow information about existing sediment experiments.
Contacts: Wonsuck Kim, Leslie Hsu

StraboSpot
 
Short Tool Description:
StraboSpot is a digital data system that allows researchers to collect and share geologic field and laboratory data, provide a context for samples, and create maps.  The system enables the user to link map-, meso- and microscale data and document space-time relations.
Tool category: Field and laboratory geological data aggregation and discovery
Readiness: Structural Geology mobile app (5); Sedimentary Geology and Petrology mobile app (4); Desktop app for experimental and microstructural data (2), with links to geochemical data (2)
Scientists Sought:  Field and laboratory-based geologists
Contact:  Doug Walker

 

SuAVE

Short Tool Description:

SuAVE (Survey Analysis via Visual Explorations) lets you publish and explore image collections and surveys online: slicing and dicing data on multiple dimensions, navigating data using faceted browsing, collaboratively analyzing datasets, and sharing findings via annotations over distribution patterns or individual collection items.

Tool category: Data Visualization and Analysis

Readiness: (5) but lots of additional functionality requests

Scientists Sought: Researchers in any geoscience domain looking to analyze and share their surveys and image collections, such as physical samples, specimen collections, or soil samples.

Contact: Ilya Zaslavsky

Links: Slides  Video

X-DOMES Ontology Registry

 

Short Tool Description:

Enables the creation of resolvable links to term definitions so that your terms can be mapped to others across domains and agencies.

Tool category: Vocabulary Creation and Registry

Readiness: (4) Issues are small, with the biggest being persistence!

Scientists Sought: Data providers and consumers seeking to develop cross-domain ontologies.

Contact: Janet Fredericks

Links: Slides  Video

X-DOMES SensorML Registry

Short Tool Description:

Enables the creation of SensorML documents (machine harvestable descriptions of how an observation came to be) with links to terms (X-DOMES Ontology Registry)

Tool category: Sensor Descriptions and Registry

Readiness: (4) xdomes.org/orr – register sensor related terms. (4) cor.esipfed.org – register any terms. (3) xdomes.org/srr – SensorML Registry. (3) SensorML-V/E – SensorML viewer/editor

Scientists Sought: Data providers and sensor manufacturers

Contact: Janet Fredericks

Links: Slides  Video