Using metadata for efficient loading and querying of heterogeneous scientific data
Benjamin River Leinfelder, Jing Tao, Duane Costa, Matthew Jones, Mark Servilla, Margaret O'Brien, Chad Burt
Last modified: 2008-08-21
The Ecological Metadata Language (EML) is an effective specification for describing data for long-term storage and interpretation. When used in conjunction with a metadata repository such as Metacat and a metadata editing tool such as Morpho, EML allows a large community of researchers to access and share their data. Although the EML/Morpho/Metacat toolkit provides a rich and seamless data documentation mechanism, current methods for retrieving metadata-described data can be laborious and time-consuming. Moreover, the structural and semantic heterogeneity of ecological data sets makes the development of custom solutions for querying them prohibitively costly. The Data Manager library leverages EML to provide automated data processing features that allow efficient data access, querying, and manipulation without custom development. The library can be used for many data management tasks and was designed to be both extensible and easy to incorporate into existing data management applications. In this paper we describe the motivation for developing the Data Manager library, provide an overview of its implementation, illustrate potential uses by describing several planned and existing deployments, and outline future work to extend the library.