A tolerance rough set approach to clustering web search results
PKDD '04 Proceedings of the 8th European Conference on Principles and Practice of Knowledge Discovery in Databases
Outlier-robust clustering using independent components
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
An ontology-driven approach for semantic information retrieval on the Web
ACM Transactions on Internet Technology (TOIT)
Computing semantic relatedness using Wikipedia-based explicit semantic analysis
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Using evolution programs to learn local similarity measures
ICCBR'03 Proceedings of the 5th international conference on Case-based reasoning: Research and Development
Applications of approximate reducts to the feature selection problem
RSKT'11 Proceedings of the 6th international conference on Rough sets and knowledge technology
Clustering of rough set related documents with use of knowledge from DBpedia
RSKT'11 Proceedings of the 6th international conference on Rough sets and knowledge technology
Formal Concept Analysis: foundations and applications
Formal Concept Analysis: foundations and applications
Ensembles of bireducts: towards robust classification and simple representation
FGIT'11 Proceedings of the Third international conference on Future Generation Information Technology
Dynamic rule-based similarity model for DNA microarray data
Transactions on Rough Sets XV
Calculi of Approximation Spaces
Fundamenta Informaticae - Special Issue on Concurrency Specification and Programming (CS&P 2005), Ruciane-Nida, Poland, 28-30 September 2005
Fundamenta Informaticae
Rough Sets, Rough Relations And Rough Functions
Fundamenta Informaticae
Tolerance Approximation Spaces
Fundamenta Informaticae
Semantic clustering of scientific articles using explicit semantic analysis
Transactions on Rough Sets XVI
This paper presents research on the construction of a new unsupervised model for learning a semantic similarity measure from text corpora. The model has two main components: a semantic interpreter of texts and a similarity function whose properties are derived from data. The interpreter associates documents with concepts, defined in a knowledge base, that correspond to the topics covered by the corpus. It shifts the representation of a text's meaning from words, which can be ambiguous, to concepts with predefined semantics. With this new representation, the similarity function is derived from data using a modification of the dynamic rule-based similarity model, adjusted to the unsupervised case. The adjustment is based on a novel notion of an information bireduct, which has its origin in the theory of rough sets. This extension of classical information reducts is used to find diverse sets of reference documents, described by diverse sets of reference concepts, that determine different aspects of the similarity. The paper explains the general idea of the approach and gives some implementation guidelines. Additionally, results of preliminary experiments are presented to demonstrate the usefulness of the proposed model.
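The two-stage pipeline described in the abstract can be illustrated with a minimal sketch: an ESA-style "semantic interpreter" maps each document to a weighted vector over knowledge-base concepts, and documents are then compared in concept space rather than word space. The toy concept base, the word-overlap weighting, and the example documents below are illustrative assumptions, not the paper's actual implementation (which uses a full explicit semantic analysis interpreter and a bireduct-based similarity function).

```python
import math
from collections import Counter

# Toy "knowledge base": each concept is described by a short text.
# (In the paper, concepts come from a knowledge base covering the corpus topics.)
CONCEPTS = {
    "rough sets": "rough set approximation reduct attribute discernibility",
    "clustering": "cluster grouping similarity partition documents",
    "semantics": "concept meaning knowledge base semantic representation",
}

def tokenize(text):
    return text.lower().split()

def concept_vector(doc, concepts=CONCEPTS):
    """Weight each concept by word overlap with the document -- a crude
    stand-in for the TF-IDF-based association used in explicit semantic
    analysis. Returns a sparse {concept: weight} vector."""
    words = Counter(tokenize(doc))
    vec = {}
    for name, desc in concepts.items():
        weight = sum(words[w] for w in set(tokenize(desc)))
        if weight > 0:
            vec[name] = float(weight)
    return vec

def cosine(u, v):
    """Cosine similarity of two sparse vectors."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in keys)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

d1 = "A reduct based approach to attribute selection in rough set theory"
d2 = "Discernibility and approximation with rough set reducts"
d3 = "Grouping documents into clusters by similarity"

# d1 and d2 share rough-set concepts, so their concept-space similarity
# should exceed that of the thematically unrelated pair (d1, d3).
print(cosine(concept_vector(d1), concept_vector(d2)))
print(cosine(concept_vector(d1), concept_vector(d3)))
```

Comparing documents through concept vectors is what lets two texts with no words in common still come out similar, provided their words activate the same concepts; the bireduct machinery in the paper then selects diverse subsets of reference documents and concepts to capture different aspects of that similarity.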