Technological success has ushered in massive amounts of data for scientific analysis. To enable effective utilization of these data sets for all classes of users, supporting intuitive data access and manipulation interfaces is crucial. This paper describes an autonomous scientific workflow system that enables high-level, natural language based, queries over low-level data sets. Our technique involves a combination of natural language processing, metadata indexing, and a semantically-aware workflow composition engine which dynamically constructs workflows for answering queries based on service and data availability. A specific contribution of this work is a metadata registration scheme that allows for a unified index of heterogeneous metadata formats and service annotations. Our approach thus avoids a standardized format for storing all data sets or the implementation of a federated, mediator-based, querying framework. We have evaluated our system using a case study from the geospatial domain to show functional results. Our evaluation supports the potential benefits which our approach can offer to scientific workflow systems and other domain-specific, data intensive applications.