As electronic records (e.g., medical records and technical defect records) accumulate, retrieving a past record that describes the same or similar circumstances has become extremely valuable, because the past record may contain the correct diagnosis or solution for the current case. We refer to two records of the same or similar circumstances as master and duplicate records. Current record retrieval techniques fall short when applied to this special master defect record retrieval problem. In this study, we propose a new paradigm for master defect record retrieval using network-based feature association (NBFA). We train the master record retrieval process by constructing feature associations to limit the search space. The retrieval paradigm was tested on a real-world, large-scale defect record database from a telecommunications company. The empirical results suggest that NBFA significantly improves the performance of master record retrieval and should be implemented in practice. This paper presents an overview of the technical aspects of the master defect record retrieval problem, describes general methodologies for retrieving master defect records, proposes a new feature association paradigm, provides performance assessments on real data from a telecommunications company, and highlights difficulties and challenges in this line of research that should be addressed in the future.