As cross-lingual information retrieval attracts increasing attention, tools that measure cross-lingual semantic similarity between documents are becoming desirable. This paper investigates two aspects of cross-lingual semantic document similarity measures: document representation and the formulation of the similarity measures themselves. Fuzzy set and rough set theories are applied to capture the inherently fuzzy relationships among concepts expressed in natural languages. Our approach first develops a language-independent, sense-level document representation based on the fuzzy set model to reduce the barrier between languages. It then explores a fuzzy-rough hybrid approach that partitions the integrated sense association network of the document collection into macrosenses, yielding a more robust macrosense-level document representation. Tversky's notion of similarity and the F1 measure from information retrieval are then each adopted to formulate a document similarity measure, giving two measures that apply fuzzy set operations to the two proposed document representations. The effectiveness of our approach is demonstrated by its success rate in identifying the English translations of Chinese documents in a collection of Chinese-English parallel documents. Moreover, the proposed approach can be easily extended to documents in other languages. We believe that the proposed representations, together with the similarity measures, will enable more effective text mining.
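To make the idea of similarity measures built from fuzzy set operations concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the formulation used in the paper. Each document is assumed to be a fuzzy set mapping hypothetical sense identifiers to membership degrees in [0, 1]. Intersection is taken as the minimum, set difference as the bounded difference, and set size as the sum of memberships (sigma-count). The names `tversky_similarity` and `f1_similarity` and the weights `alpha` and `beta` are chosen for illustration only.

```python
from typing import Dict

# A document as a fuzzy set: sense identifier -> membership degree in [0, 1].
# The sense identifiers and degrees used below are hypothetical.
FuzzySet = Dict[str, float]


def sigma_count(s: FuzzySet) -> float:
    """Fuzzy cardinality: the sum of all membership degrees."""
    return sum(s.values())


def fuzzy_intersection(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Standard (min) intersection; senses missing from a set have degree 0."""
    return {k: min(a[k], b[k]) for k in a.keys() & b.keys()}


def fuzzy_difference(a: FuzzySet, b: FuzzySet) -> FuzzySet:
    """Bounded difference a - b (an assumed choice of difference operator)."""
    return {k: max(v - b.get(k, 0.0), 0.0) for k, v in a.items()}


def tversky_similarity(a: FuzzySet, b: FuzzySet,
                       alpha: float = 0.5, beta: float = 0.5) -> float:
    """Tversky's ratio model, with sigma-count as the salience function."""
    common = sigma_count(fuzzy_intersection(a, b))
    denom = (common
             + alpha * sigma_count(fuzzy_difference(a, b))
             + beta * sigma_count(fuzzy_difference(b, a)))
    return common / denom if denom > 0 else 0.0


def f1_similarity(a: FuzzySet, b: FuzzySet) -> float:
    """Harmonic mean of fuzzy 'precision' |A∩B|/|B| and 'recall' |A∩B|/|A|."""
    total = sigma_count(a) + sigma_count(b)
    if total == 0:
        return 0.0
    return 2.0 * sigma_count(fuzzy_intersection(a, b)) / total


if __name__ == "__main__":
    doc_zh = {"sense_1": 0.8, "sense_2": 0.5, "sense_3": 0.2}
    doc_en = {"sense_1": 0.6, "sense_2": 0.5, "sense_4": 0.4}
    print(tversky_similarity(doc_zh, doc_en))
    print(f1_similarity(doc_zh, doc_en))
```

Under these particular operator choices, the Tversky measure with `alpha = beta = 0.5` coincides with the F1-style measure; the two differ once the weights are asymmetric, which lets the measure treat the two documents' unshared senses unequally.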