To answer user queries, a data integration system employs a set of semantic mappings between the mediated schema and the schemas of the data sources. In dynamic environments, sources often undergo changes that invalidate these mappings. Hence, once the system is deployed, the administrator must monitor it over time to detect and repair broken mappings. Today such continuous monitoring is extremely labor-intensive and poses a key bottleneck to the widespread deployment of data integration systems in practice.

We describe MAVERIC, an automatic solution to detecting broken mappings. At the heart of MAVERIC is a set of computationally inexpensive modules called sensors, which capture salient characteristics of data sources (e.g., value distributions, HTML layout properties). We describe how MAVERIC trains and deploys the sensors to detect broken mappings. Next, we develop three novel improvements: perturbation (i.e., injecting artificial changes into the sources) and multi-source training, both of which improve detection accuracy, and filtering, which further reduces the number of false alarms. Experiments over 114 real-world sources in six domains demonstrate the effectiveness of our sensor-based approach over existing solutions, as well as the utility of our improvements.
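To make the sensor idea concrete, the following is a minimal sketch of one value-distribution sensor. It is not MAVERIC's implementation: the abstract does not specify sensor internals, so the two features (`digit_fraction`, `mean_length`), the class `ValueDistributionSensor`, the `min_std` floor, and the threshold of 3.0 are all assumptions chosen for illustration. The sensor profiles an attribute's values over snapshots taken while the mapping was known to be valid, then flags a new snapshot whose features deviate strongly from that profile.

```python
from statistics import mean, pstdev


def digit_fraction(values):
    """Hypothetical feature: share of characters that are digits."""
    chars = "".join(values)
    if not chars:
        return 0.0
    return sum(c.isdigit() for c in chars) / len(chars)


def mean_length(values):
    """Hypothetical feature: average string length of the values."""
    return mean(len(v) for v in values) if values else 0.0


FEATURES = (digit_fraction, mean_length)


class ValueDistributionSensor:
    """Illustrative sensor; not the sensor design used by MAVERIC."""

    def __init__(self, min_std=0.05):
        # Floor on the standard deviation so that very uniform training
        # data does not make every small deviation look extreme.
        self.min_std = min_std
        self.profile = []

    def train(self, snapshots):
        """Record mean and spread of each feature over valid snapshots."""
        self.profile = []
        for feature in FEATURES:
            xs = [feature(s) for s in snapshots]
            self.profile.append((mean(xs), max(pstdev(xs), self.min_std)))

    def score(self, values):
        """Largest deviation of any feature, in standard deviations."""
        return max(
            abs(feature(values) - mu) / sd
            for feature, (mu, sd) in zip(FEATURES, self.profile)
        )


def is_broken(sensor, values, threshold=3.0):
    """Flag the mapping as suspect when the score exceeds the threshold."""
    return sensor.score(values) > threshold


# Snapshots of a "price" attribute while the mapping was valid.
sensor = ValueDistributionSensor()
sensor.train([
    ["12.99", "5.49", "101.00"],
    ["7.25", "19.99", "3.10"],
    ["45.00", "8.75", "12.30"],
])

good = ["9.99", "14.50", "120.00"]
# After a source change, the same mapping now returns book titles.
broken = ["Introduction to Databases", "Data Integration Basics", "Web Wrappers"]

print(is_broken(sensor, good))
print(is_broken(sensor, broken))
```

In this sketch the threshold is fixed by hand. Perturbation, as described above, would instead supply synthetic examples of broken mappings (for instance, values drawn from a different attribute), which could be used to calibrate such a threshold rather than guess it.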