In this paper, we propose a set of similarity metrics for manipulating collections of values occurring in XML documents. Following the data model presented in the TAX algebra, we treat an XML element as a labeled ordered rooted tree. Considering that XML nodes can be either atomic, i.e., they contain single values such as short character strings, dates, etc., or complex, i.e., nested structures that contain other nodes, we propose two types of similarity metrics: MAVs, for atomic nodes, and MCVs, for complex nodes. In the first case, we suggest the use of several metrics that depend on the application domain. In the second case, we define metrics for complex values that are structure dependent and can be applied separately to a single complex value and to collections of values. We also present experiments showing the effectiveness of our method.
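To make the atomic/complex distinction concrete, here is a minimal Python sketch. It is an illustration only, not the MAV or MCV metrics defined in the paper, which the abstract does not specify. The names `Node`, `atomic_sim`, `complex_sim`, and `sim` are hypothetical. The standard-library `difflib.SequenceMatcher` ratio stands in for a domain-dependent atomic metric, and averaging the best child matches is an assumed stand-in for a structure-dependent complex metric.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Node:
    """An XML element viewed as a labeled ordered rooted tree."""
    label: str
    value: Optional[str] = None                            # used by atomic nodes
    children: List["Node"] = field(default_factory=list)   # used by complex nodes

    @property
    def is_atomic(self) -> bool:
        return not self.children


def atomic_sim(a: Node, b: Node) -> float:
    # Assumption: a generic string ratio replaces a domain-specific metric.
    return SequenceMatcher(None, a.value or "", b.value or "").ratio()


def complex_sim(a: Node, b: Node) -> float:
    # Assumption: score each child of `a` by its best match among the
    # children of `b`, then normalize by the larger child count so that
    # unmatched children lower the score.
    total = sum(
        max((sim(x, y) for y in b.children), default=0.0)
        for x in a.children
    )
    return total / max(len(a.children), len(b.children))


def sim(a: Node, b: Node) -> float:
    """Dispatch on node kind; nodes with different labels score 0."""
    if a.label != b.label or a.is_atomic != b.is_atomic:
        return 0.0
    return atomic_sim(a, b) if a.is_atomic else complex_sim(a, b)


full = Node("author", children=[Node("name", "J. Smith"), Node("year", "2003")])
partial = Node("author", children=[Node("name", "J. Smith")])
print(sim(full, full), sim(full, partial))
```

In this sketch the score always falls in [0, 1]. The best-match step is greedy, so two children of one node may match the same child of the other; a one-to-one assignment would be a stricter alternative.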