ETTA-IM: A deep web query interface matching approach based on evidence theory and task assignment

Authors:
Yongquan Dong;Qingzhong Li;Yanhui Ding;Zhaohui Peng
Affiliations:
School of Computer Science and Technology, Shandong University, Jinan 250101, China and School of Computer Science and Technology, Xuzhou Normal University, Xuzhou 221000, China;School of Computer Science and Technology, Shandong University, Jinan 250101, China;School of Computer Science and Technology, Shandong University, Jinan 250101, China;School of Computer Science and Technology, Shandong University, Jinan 250101, China
Venue:
Expert Systems with Applications: An International Journal
Year:
2011

Abstract

Integrating Deep Web data sources requires highly accurate matches between the attributes of their query interfaces. Although interface matching has received more attention recently, current approaches still have three limitations: (a) they all assume that every interface attribute type has been predefined; (b) most of them combine multiple matchers that consider different aspects of schema information, but the weights of the individual matchers are usually set manually, and the matchers may be highly inconsistent with one another; and (c) most of them consider only one-to-one matches between interface attributes and lack effective mathematical modeling. To address these limitations, a novel Deep Web query interface matching approach called ETTA-IM is proposed, based on evidence theory and task assignment. Various type recognizers are defined to identify the types of interface attributes, and these types are used to divide the schema space into several schema subspaces. A modified D-S evidence theory is used to combine multiple matchers automatically and to resolve high conflicts among them. The one-to-one match decision is converted into an extended task assignment problem, and tree-structure heuristic rules are used to make the one-to-many match decision. Experiments show that the ETTA-IM approach yields high precision and recall.
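
The two building blocks named in the abstract can be illustrated with a minimal Python sketch. It is not the ETTA-IM implementation: the abstract does not specify the modified D-S rule or the extended task assignment formulation, so the sketch uses the classical Dempster rule of combination and a plain brute-force one-to-one assignment instead. The function names, the two-hypothesis frame (`MATCH`, `NONMATCH`), and all numeric values are assumptions chosen for illustration only.

```python
from itertools import permutations

# Hypothetical frame of discernment for one candidate attribute pair.
MATCH = frozenset({"match"})
NONMATCH = frozenset({"nonmatch"})
EITHER = MATCH | NONMATCH  # mass left uncommitted by a matcher


def dempster_combine(m1, m2):
    """Fuse two mass functions with the classical Dempster rule.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1].
    Returns the fused mass function and the conflict mass K.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            product = mass_a * mass_b
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + product
            else:
                conflict += product  # the two matchers contradict each other
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("total conflict: the classical rule is undefined")
    return {h: v / norm for h, v in combined.items()}, conflict


def best_one_to_one(sim):
    """Brute-force the one-to-one assignment with maximum total similarity.

    sim[i][j] is the similarity of attribute i (interface A) and attribute j
    (interface B). Assumes len(sim) <= len(sim[0]) and a small matrix.
    """
    n_rows, n_cols = len(sim), len(sim[0])
    best_score, best_perm = float("-inf"), None
    for perm in permutations(range(n_cols), n_rows):
        score = sum(sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm
    return list(enumerate(best_perm)), best_score


# Two illustrative matchers judging the same attribute pair.
label_matcher = {MATCH: 0.6, NONMATCH: 0.1, EITHER: 0.3}
domain_matcher = {MATCH: 0.5, NONMATCH: 0.2, EITHER: 0.3}
fused, conflict = dempster_combine(label_matcher, domain_matcher)

# Illustrative similarity matrix where picking the largest cell first
# (0.9) would force the weak pair 0.1 and lose to the global optimum.
similarity = [[0.9, 0.8],
              [0.85, 0.1]]
pairs, total = best_one_to_one(similarity)
```

The brute-force search enumerates every permutation, so it only suits very small interfaces; a polynomial-time assignment solver such as the Hungarian algorithm would be the usual choice at realistic sizes. The classical rule also becomes unreliable when the conflict mass approaches 1, which is the situation the abstract says the modified rule is meant to handle.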