Enabling ad hoc queries over low-level scientific data sets

David Chiu, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Scopus citations


Technological advances have produced massive amounts of data for scientific analysis. To let all classes of users make effective use of these data sets, intuitive interfaces for data access and manipulation are crucial. This paper describes an autonomous scientific workflow system that enables high-level, natural-language-based queries over low-level data sets. Our technique combines natural language processing, metadata indexing, and a semantically aware workflow composition engine that dynamically constructs workflows for answering queries based on service and data availability. A specific contribution of this work is a metadata registration scheme that allows for a unified index of heterogeneous metadata formats and service annotations. Our approach thus avoids the need for a standardized format for storing all data sets and for the implementation of a federated, mediator-based querying framework. We have evaluated our system using a case study from the geospatial domain to show functional results. Our evaluation supports the potential benefits that our approach can offer to scientific workflow systems and other domain-specific, data-intensive applications.
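The idea of composing a workflow from whatever services and data happen to be available can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Service` record, the `compose` function, the concept names (`shoreline`, `elevation`, `water_level`), and the file names are all hypothetical. The unified metadata index is reduced here to a plain dictionary that maps a domain concept to a registered data set, and a query is assumed to have already been resolved to a target concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical service annotation: concepts consumed and concept produced."""
    name: str
    inputs: tuple
    output: str


def compose(target, services, data_index, _seen=frozenset()):
    """Return a nested plan that derives `target`, or None if it cannot be derived.

    A concept is satisfied either directly by an indexed data set or by a
    service whose inputs can all be satisfied recursively.
    """
    if target in data_index:
        return ("data", data_index[target])
    if target in _seen:  # this concept is already being resolved: avoid cycles
        return None
    for svc in services:
        if svc.output != target:
            continue
        sub_plans = [
            compose(concept, services, data_index, _seen | {target})
            for concept in svc.inputs
        ]
        if all(plan is not None for plan in sub_plans):
            return ("service", svc.name, sub_plans)
    return None


# Hypothetical geospatial registry: two data sets and one derivation service.
data_index = {
    "elevation": "dem_2009.tif",
    "water_level": "gauge_readings.csv",
}
services = [
    Service("extract_shoreline", ("elevation", "water_level"), "shoreline"),
]

plan = compose("shoreline", services, data_index)
missing = compose("temperature", services, data_index)
```

Under these assumptions, `plan` describes a call to `extract_shoreline` fed by the two registered files, while `missing` is `None` because no data set or service yields that concept. A real engine would also need to rank alternative plans and handle the semantic matching that this sketch replaces with exact string equality.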

Original language: English (US)
Title of host publication: Scientific and Statistical Database Management - 21st International Conference, SSDBM 2009, Proceedings
Number of pages: 19
State: Published - 2009
Externally published: Yes
Event: 21st International Conference on Scientific and Statistical Database Management, SSDBM 2009 - New Orleans, LA, United States
Duration: Jun 2 2009 to Jun 4 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5566 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 21st International Conference on Scientific and Statistical Database Management, SSDBM 2009
Country/Territory: United States
City: New Orleans, LA

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

