TY - GEN
T1 - Answering cross-source keyword queries over deep web data sources
AU - Wang, Fan
AU - Agrawal, Gagan
PY - 2011
Y1 - 2011
N2 - A popular trend in data dissemination involves online data sources that are hidden behind query forms, which are part of the deep web. Extracting information across multiple deep web sources in a domain is challenging, but increasingly crucial in many areas. Keyword search, a popular information discovery method, has been studied extensively on the surface web and relational databases. Keyword-based queries can provide a powerful yet intuitive means for accessing data from the deep web as well. However, this involves many challenges. For example, deep web data is hidden behind query interfaces, deep web data sources often contain redundant and/or incomplete data, and there is often inter-dependence among data sources. Thus, it is very hard to automatically execute cross-source queries. This paper focuses on answering cross-source queries over deep web data sources. In our approach, we model a list of deep web data sources using a graph to capture the dependencies among them, and we consider the problem of answering cross-source queries over these deep web data sources as a graph search problem. We have developed a bidirectional query planning algorithm to generate query plans for two types of cross-source queries, which are entity-attributes queries and entity-entity relationship queries.
UR - http://www.scopus.com/inward/record.url?scp=80052007178&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80052007178&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-22606-9_47
DO - 10.1007/978-3-642-22606-9_47
M3 - Conference contribution
AN - SCOPUS:80052007178
SN - 9783642226052
T3 - Communications in Computer and Information Science
SP - 475
EP - 490
BT - Contemporary Computing - 4th International Conference, IC3 2011, Proceedings
T2 - 4th International Conference on Contemporary Computing, IC3 2011
Y2 - 8 August 2011 through 10 August 2011
ER -