Representing Gridded and Space-time Data in HydroShare

Tian Gan, David G. Tarboton, Jeffery S. Horsburgh
Civil and Environmental Engineering Department, Utah State University

Introduction
CUAHSI HydroShare is being built as a web system for sharing hydrologic data and models. It will expand the data sharing capability of the CUAHSI Hydrologic Information System (HIS) by broadening the classes of data supported and by taking advantage of social media functionality to enhance collaboration around hydrologic research. One aspect of HydroShare development is the design of an information model that captures the science metadata for each data class. The science metadata will be used for cataloging, for facilitating data reuse, and for supporting some of the system's front-end functionality. We present the design of the information model for the gridded and space-time data classes, which include time series, geographic raster, and multidimensional space-time arrays. We also storyboard the proposed system functionality for creating, visualizing, sharing, and publishing these three data classes, to offer a suggested vision for how gridded and space-time data may be represented in HydroShare.
OCI-1148453 OCI-1148090

Create Data
The Resource Table contains all the data uploaded by or shared with the user in the HydroShare system. The “Create” button starts the creation of a resource. The second step is to upload data files: the user can drag files or folders directly onto this interface or browse to select files. When data files are uploaded, the system automatically populates some of the science metadata attributes by referencing system information or by extracting information from the data files. The system creates a resource once a data file has been uploaded, and the user can complete the science metadata and add or delete data files at any time afterwards.
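The automatic metadata extraction described for the upload step could look roughly like the following sketch. It assumes the uploaded file is an ESRI ASCII grid, which is only one possible raster format; the function name `extract_raster_metadata` and the returned keys are hypothetical and are not part of HydroShare.

```python
def extract_raster_metadata(ascii_grid_text):
    """Read the header of an ESRI ASCII grid and return raster metadata.

    Hypothetical helper for illustration; not the HydroShare implementation.
    """
    header = {}
    for line in ascii_grid_text.splitlines():
        parts = line.split()
        # Header lines have the form "keyword value". The first line that
        # does not match this form marks the start of the cell values.
        if len(parts) != 2 or not parts[0][0].isalpha():
            break
        header[parts[0].lower()] = parts[1]
    no_data = header.get("nodata_value")
    return {
        "row_number": int(header["nrows"]),
        "column_number": int(header["ncols"]),
        "cell_size_value": float(header["cellsize"]),
        "no_data_value": float(no_data) if no_data is not None else None,
    }


# A tiny made-up grid standing in for an uploaded file.
sample = """ncols 3
nrows 2
xllcorner 0.0
yllcorner 0.0
cellsize 30.0
NODATA_value -9999
1 2 3
4 5 6
"""

metadata = extract_raster_metadata(sample)
print(metadata)
```

Attributes that cannot be read from the file, such as the cell size units or the method, would still have to be completed by the user.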

Conclusions
We present aspects of the HydroShare information model and system functionality for data creation, visualization, sharing, and publication, in order to suggest a way of representing gridded and space-time data in the HydroShare system. The information model is important because it serves as the foundation for the representation of content in HydroShare and can help support both reuse of the uploaded data and some of the system's front-end functionality. For the information model design, we propose to separate the science metadata attributes into two types: generic metadata that apply to all data classes, and class specific metadata that capture the features specific to each class. The proposed system functionality aims to provide users with useful information about the data and the ability to interact with other data available in the system. In the future, we will refine the information model and work to implement some of the suggested system functionality. In addition, we will design the social metadata and system metadata attributes to better manage the uploaded data and the associated social information in the HydroShare system.

Information Model
In the HydroShare information model, we separate the science metadata attributes into two types: generic metadata and class specific metadata. The generic metadata attributes apply to all data classes; for these we propose to adopt the Dublin Core Metadata Elements, with additional sub-elements to elaborate them (Table 1). The class specific metadata attributes capture the features specific to each class. We examined the features of each data class and referred to existing metadata designs for these data classes to select the attributes of the information model (Table 2 to Table 4). The science metadata attributes were selected to support the following purposes:
• Providing information to facilitate data reuse
• Cataloguing to support data discovery using the system search functionality
• Visualization functionality
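The split between generic and class specific metadata can be sketched with a few Python dataclasses. This is only an illustration under simplifying assumptions: the class names are hypothetical, only a handful of attributes from the tables are included, and the example values are made up.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class Creator:
    name: str
    email: str = ""
    organization: str = ""


@dataclass
class GenericMetadata:
    """Attributes shared by every data class (subset of the generic set)."""
    title: str
    creators: list
    subjects: list = field(default_factory=list)
    description: str = ""
    language: str = "eng"  # ISO 639-2 three-letter code for English


@dataclass
class TimeSeriesMetadata:
    """Attributes that only make sense for time series (subset)."""
    variable_name: str
    variable_units: str
    site_name: str
    site_code: str
    time_support: float = 0.0  # 0 for instantaneous data


@dataclass
class Resource:
    generic: GenericMetadata
    class_specific: object  # e.g. TimeSeriesMetadata for a time series


resource = Resource(
    generic=GenericMetadata(
        title="Example streamflow record",
        creators=[Creator(name="A. Hydrologist")],
        subjects=["streamflow"],
    ),
    class_specific=TimeSeriesMetadata(
        "Discharge", "cubic meters per second", "Example River", "EX-001"
    ),
)

# A nested dictionary like this could feed a catalog or a search index.
record = asdict(resource)
print(record["generic"]["title"], record["class_specific"]["variable_name"])
```

Keeping the two parts separate means a catalog can index the generic part the same way for every resource, while each data class adds its own attributes.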
Table 1 Generic Metadata for All Data Types
Attributes | Description
Creator | Information about the person who uploaded the data and created the HydroShare resource.
  name | Name of creator.
  email | Email of creator.
  organization | Organization of creator.
  mail address | Mail address of the creator.
  phone | Phone of creator.
Contributor | Information about a person who contributed to the data.
  name | Name of contributor.
  contribution | Description of the contribution the person has made.
  email | Email of contributor.
  organization | Organization of contributor.
  mail address | Mail address of the contributor.
  phone | Phone of contributor.
Title | A name given to the resource.
Subject | The keywords for the resource.
Date | A point of time associated with an event in the lifecycle of the resource.
  date created | Date when the resource is created.
  date last modified | Date when the resource is last modified before it is published.
  date published | Date when this resource is published.
Description | Abstract of the resource.
Identifier | The system assigned identifier for the corresponding resource.
  internal ID | Internal identifier of the resource. Assigned when the resource is first created.
  external ID | DOI of the resource. Assigned only when the resource is published.
Type | HydroShare system supported resource types.
Format | HydroShare system supported file format for a certain resource type.
Language | Major language used in the resource. HydroShare will use the three-letter language codes defined by ISO 639-2.
Publisher | The URL of the homepage for HydroShare system.
Source | Other resources in or outside of the HydroShare system from which this resource is derived.
  name | Name of the source.
  identifier | Identifier of the source.
Relation | Other resources in or outside of the HydroShare system that are referenced by this resource.
  name | Name of the referenced resource.
  identifier | Identifier of the referenced resource.
Rights | Information about rights held in and over the resource. HydroShare will encode a text statement or URL that points to a rights management or usage statement for the resource in the rights element.
Coverage | Temporal and spatial information related to the resource.
  spatial | Spatial information related to the resource. HydroShare system will specify it using DCMI Point or Box.
  temporal | Temporal information related to the resource. HydroShare system will specify it using DCMI Period.
Citation | Citation information of the resource. (Not a Dublin Core element)

Table 2 Key Class Specific Metadata for Time Series
Attributes | Definition
Variable Name | Name of the variable of the time series data.
Variable Units | Name of the variable units of the time series data.
Value Type | Value type of the time series data, e.g. “Field Observation”.
Data Type | Data type of the time series data, e.g. “Instantaneous”.
No Data Value | The numeric data used for representing no data condition.
Method | Method used to get the time series data.
Value Count | Number of values from the time series data.
Data Quality | Data quality of the time series data, e.g. “Raw Data”.
Sample Medium | Medium for collecting the time series data, e.g. “Surface Water”.
Site Name | Name of the site for collecting the time series data.
Site Code | Code of the site for collecting the time series data.
Time Support | Numerical value and corresponding units that indicate the time support (or temporal footprint) of the data values. If the time series is instantaneous data, it will be 0.
**Full metadata details are under development drawing upon content of ODM/WaterML

Table 3 Key Class Specific Metadata for Geographic Raster
Attributes | Definition
Row Number | Total row number of the raster datasets.
Column Number | Total column number of the raster datasets.
Cell Size Value | Numerical value for the size of each pixel in the raster datasets.
Cell Size Units | Units for the size of each pixel in the raster datasets.
Cell Value Type | The data type for each pixel in the raster datasets, e.g. “Floating Point”.
Band | The band information in the raster datasets.
  band name | The name of the band.
  associated variable | The variable name of the value in the pixel of the band.
  variable units | The variable units of the value in the pixel of the band.
  method | The method used to get the value in the band.
  comments | Free text comments for the band. Any information useful for data reuse.
**Full metadata details are under development drawing upon information from systems like GDAL and GeoTIFF, etc.

Visualize Data

On the “Resource Detail / General Info” page, click “Visualize” to see graphics of the data. This page also shows some generic metadata, the comments on the data, and the editing history.

Time series visualization will show some class specific metadata and the site location on a map. The visualization tool can also add other time series data listed in the Resource Table.

The raster visualization tool will provide view functions similar to those of ArcGIS. It can also add feature classes and time series that are listed in the Resource Table; a time series will be represented as a point.

NetCDF data will be visualized in the same tool as raster data. The tool can generate an animation of the data changing with time, and the user can use the “Identify” tool to visualize the time series for a selected cell.
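The “Identify” operation amounts to reading one cell from the grid at every time step. A minimal sketch, assuming the space-time array has already been loaded as a list of 2-D grids ordered by time; the function name and the sample values are made up for illustration.

```python
def cell_time_series(grid_stack, row, col, missing_value=None):
    """Return the values of one cell across all time steps.

    grid_stack is assumed to be a list of 2-D grids, one per time step.
    Values equal to missing_value are returned as None.
    """
    series = []
    for grid in grid_stack:
        value = grid[row][col]
        series.append(None if value == missing_value else value)
    return series


# Three time steps of a made-up 2 x 2 grid.
stack = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.5, 2.5], [3.5, -9999.0]],
    [[2.0, 3.0], [4.0, 5.0]],
]

selected = cell_time_series(stack, row=1, col=1, missing_value=-9999.0)
print(selected)
```

The resulting list, paired with the time coordinate, is what a time series plot for the selected cell would display.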

Share and Publish Data

Further Information
We have set up the HydroShare beta website. If you are interested, please visit: http://beta.hydroshare.org/
If you have suggestions for the design of the information model or the system functionality, please email:
David G. Tarboton: dtarb@usu.edu
Jeffery S. Horsburgh: jeff.horsburgh@usu.edu
Tian Gan: gantian127@gmail.com

Table 1 attributes (continued): Date (date created, date last modified, date published); Coverage (spatial, temporal); Citation

Table 4 Key Class Specific Metadata for Multidimensional Space-Time Arrays
Attributes | Definition
Convention | The conventions used for the NetCDF file.
Variable | Variable stored in the NetCDF file which doesn’t represent temporal and spatial coordinates.
  defined name | Variable name defined in the NetCDF file. This name may not explicitly explain what variable it actually represents.
  full name | Full name of the variable. Explicitly explains the meaning of the variable.
  units | Units of the variable.
  data type | Name of the data type. This has a predefined vocabulary taken from the data types of the NetCDF enhanced data model.
  dimensions | The defined variable shape in the NetCDF file.
  method | Description of the method by which the value of the variable is obtained.
  missing value | The numeric data used for representing no data condition.
  associated group | The name of the group which the variable belongs to.
**Full metadata details are under development drawing upon information from NetCDF with CF conventions.
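One way to pick out the variables that do not represent coordinates is the NetCDF coordinate variable convention, in which a coordinate variable is a one-dimensional variable that has the same name as its dimension. The sketch below applies that rule to a plain dictionary standing in for a file's structure; it is a simplification that ignores auxiliary coordinate variables, and the function and variable names are hypothetical.

```python
def data_variables(dimensions, variables):
    """Return the names of variables that are not coordinate variables.

    dimensions: names of the dimensions defined in the file.
    variables: mapping of variable name to its tuple of dimension names.
    """
    names = []
    for name, dims in variables.items():
        is_coordinate = len(dims) == 1 and dims[0] == name and name in dimensions
        if not is_coordinate:
            names.append(name)
    return names


# Made-up structure of a small space-time array file.
dims = ["time", "lat", "lon"]
file_variables = {
    "time": ("time",),
    "lat": ("lat",),
    "lon": ("lon",),
    "precipitation": ("time", "lat", "lon"),
}

selected_variables = data_variables(dims, file_variables)
print(selected_variables)
```

The tuple of dimension names for each selected variable corresponds to the dimensions attribute in Table 4.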

On the “Resource Detail” page, click “Setting” to share or publish the uploaded data.

The user can set the access policy for data and metadata separately.

The user can set a “Share List” to designate the other users with whom the data are shared.

Resources can be submitted for formal publication. Upon acceptance, a DOI is assigned and the resource becomes immutable, with its metadata open to the public.
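The sharing and publication rules above can be summarized in a small sketch. It assumes a single in-memory object with separate access flags for data and metadata; the class name, method names, and the DOI string are hypothetical placeholders, and the review step before acceptance is not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SharedResource:
    files: list = field(default_factory=list)
    data_public: bool = False      # access policy for the data
    metadata_public: bool = False  # access policy for the metadata
    share_list: set = field(default_factory=set)
    doi: Optional[str] = None      # assigned only on publication

    def add_file(self, name):
        # After publication the resource content can no longer change.
        if self.doi is not None:
            raise PermissionError("a published resource is immutable")
        self.files.append(name)

    def publish(self, doi):
        # Publication assigns the DOI and opens the metadata to the public.
        self.doi = doi
        self.metadata_public = True


resource = SharedResource(files=["flow.csv"])
resource.share_list.add("colleague")   # share with one other user
resource.publish("10.0000/example")    # placeholder DOI, not a real one
print(resource.metadata_public, resource.data_public)
```

In this sketch the data can stay private after publication while the metadata becomes public, which mirrors the separate access policies for data and metadata.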
