Instance discovery and schema matching with applications to biological deep web data integration

Tantan Liu, Fan Wang, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

We present data mining-based techniques for enabling data integration across deep web data sources. We target query processing across inter-dependent data sources. Thus, besides input-input and output-output matching of attributes, we also need to consider input-output matching. We develop data mining techniques for discovering the instances for querying deep web data sources. These instances come from the information provided by the query interfaces themselves, as well as from the output pages of related data sources, obtained by query probing with dynamically identified input instances. Then, using a hierarchical representation of schemas and by applying clustering techniques, we are able to generate schema matches. We show the effectiveness of our technique while integrating 24 query interfaces.
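The abstract does not specify the similarity measure or the clustering algorithm, so the following is only a minimal sketch of the general idea of instance-based schema matching, not the authors' method. It assumes each attribute is described by the set of instance values discovered for it, uses Jaccard overlap as the similarity, and groups attributes with single-link clustering at a fixed threshold. The attribute names, instance values, and the 0.5 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of instance values."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_attributes(instances, threshold=0.5):
    """Group attributes whose instance sets overlap (single-link clustering).

    `instances` maps an attribute name to its set of discovered instances.
    Two attributes are linked when their Jaccard similarity reaches
    `threshold`; clusters are the connected components of those links.
    """
    parent = {name: name for name in instances}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(instances, 2):
        if jaccard(instances[a], instances[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in instances:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical attributes: "in" marks an input field of a query interface,
# "out" marks a column found on an output page.
instances = {
    "srcA.in.gene": {"BRCA1", "TP53", "EGFR"},
    "srcB.out.gene_symbol": {"BRCA1", "TP53", "EGFR", "MYC"},
    "srcB.in.organism": {"human", "mouse"},
    "srcC.in.species": {"human", "mouse", "rat"},
}

clusters = cluster_attributes(instances)
for group in clusters:
    print(group)
```

In this toy data the first cluster pairs an input attribute of one source with an output attribute of another, which is the input-output kind of match the abstract says is needed for inter-dependent sources.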

Original language: English (US)
Title of host publication: 10th IEEE International Conference on Bioinformatics and Bioengineering 2010, BIBE 2010
Pages: 304-305
Number of pages: 2
Volume: 6254
DOIs
State: Published - 2010
Externally published: Yes
Event: 10th IEEE International Conference on Bioinformatics and Bioengineering, BIBE-2010 - Philadelphia, PA, United States
Duration: May 31 2010 to Jun 3 2010

Publication series

Name: 10th IEEE International Conference on Bioinformatics and Bioengineering 2010, BIBE 2010

Other

Other: 10th IEEE International Conference on Bioinformatics and Bioengineering, BIBE-2010
Country/Territory: United States
City: Philadelphia, PA
Period: 5/31/10 to 6/3/10

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
