Schema matching, the discovery of semantic correspondences between attributes across heterogeneous sources, is a critical step in information integration. Complex matchings are common, but because their search space is far larger, most existing techniques focus on simple 1:1 matchings. To tackle this challenge, this paper takes a conceptually novel approach: it views schema matching as correlation mining, applied to our task of matching Web query interfaces in order to integrate the myriad databases on the Internet. On this "deep Web," query interfaces generally form complex matchings between attribute groups (e.g., [author] corresponds to [first name, last name] in the Books domain). We observe that co-occurrence patterns across query interfaces often reveal such complex semantic relationships: grouping attributes (e.g., [first name, last name]) tend to be co-present in query interfaces and are thus positively correlated. In contrast, synonym attributes are negatively correlated because they rarely co-occur. This insight enables us to discover complex matchings through correlation mining. In particular, we develop the DCM framework, which consists of data preparation, dual mining of positive and negative correlations, and finally matching selection. Unlike previous correlation mining algorithms, which mainly focus on finding strong positive correlations, our algorithm considers both positive and negative correlations, paying particular attention to the subtlety of negative correlations because of their special importance in schema matching. This leads to the introduction of a new correlation measure, the $H$-measure, distinct from those proposed in previous work. We evaluate our approach extensively, and the results show good accuracy in discovering complex matchings.
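The co-occurrence intuition can be illustrated with a small sketch. It is not the DCM algorithm and it does not implement the $H$-measure, which the abstract names but does not define. As a stand-in it uses the plain Jaccard coefficient, and the toy interfaces, the thresholds, and the function names (`contingency`, `jaccard`, `mine_pairs`) are all assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the attributes on one Books query interface.
interfaces = [
    {"author", "title", "isbn"},
    {"first name", "last name", "title"},
    {"author", "title"},
    {"first name", "last name", "isbn"},
    {"author", "isbn", "title"},
    {"first name", "last name", "title", "isbn"},
]


def contingency(a, b, schemas):
    """Count interfaces with both attributes (f11), only a (f10), only b (f01)."""
    f11 = sum(1 for s in schemas if a in s and b in s)
    f10 = sum(1 for s in schemas if a in s and b not in s)
    f01 = sum(1 for s in schemas if a not in s and b in s)
    return f11, f10, f01


def jaccard(a, b, schemas):
    """Stand-in correlation measure (NOT the paper's H-measure)."""
    f11, f10, f01 = contingency(a, b, schemas)
    total = f11 + f10 + f01
    return f11 / total if total else 0.0


def mine_pairs(schemas, pos_threshold=0.8, neg_threshold=0.2, min_support=2):
    """Split frequent attribute pairs into grouping and synonym candidates.

    The thresholds are arbitrary illustrative values, not taken from the paper.
    """
    attrs = sorted(set().union(*schemas))
    support = {a: sum(1 for s in schemas if a in s) for a in attrs}
    frequent = [a for a in attrs if support[a] >= min_support]

    grouping, synonyms = [], []
    for a, b in combinations(frequent, 2):
        score = jaccard(a, b, schemas)
        if score >= pos_threshold:
            grouping.append((a, b))   # usually co-present: positively correlated
        elif score <= neg_threshold:
            synonyms.append((a, b))   # rarely co-occur: negatively correlated
    return grouping, synonyms


grouping, synonyms = mine_pairs(interfaces)
print("grouping candidates:", grouping)
print("synonym candidates:", synonyms)
```

On this toy input, `first name` and `last name` surface as a grouping candidate, while `author` is negatively correlated with each of them, which is consistent with the abstract's example of [author] matching [first name, last name]. A simple measure such as Jaccard would probably behave poorly on sparse, real-world interface collections, which is presumably part of the motivation for a dedicated measure.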