Text document clustering plays an important role in document retrieval, document browsing, and text mining. Traditional clustering techniques do not consider the semantic relationships between words, such as synonymy and hypernymy. To exploit these relationships, ontologies such as WordNet have been used to improve clustering results. However, WordNet-based clustering methods mostly rely on single-term analysis of text; they do not perform any phrase-based analysis. In addition, these methods use synonymy to identify concepts and explore only hypernymy to calculate concept frequencies, without considering other semantic relationships such as hyponymy. To address these issues, we combine detection of noun phrases with the use of WordNet as background knowledge to explore better ways of representing documents semantically for clustering. First, using both noun phrases and single terms, we apply different document representation methods to analyze the effectiveness of hypernymy, hyponymy, holonymy, and meronymy. Second, we choose the most effective method and compare it with the WordNet-based clustering method proposed by others. The experimental results rank the semantic relationships by their effectiveness for clustering, from highest to lowest: hypernymy, hyponymy, meronymy, and holonymy. Moreover, we found that noun phrase analysis improves the WordNet-based clustering method.
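The idea of using hypernymy to calculate concept frequencies can be illustrated with a minimal sketch. It is not the method evaluated in the paper. It assumes a tiny hand-written hypernym map (`TOY_HYPERNYMS`) in place of WordNet, and the helper names `hypernym_chain`, `concept_vector`, and `cosine` are hypothetical. The sketch adds each term's count to its hypernyms and then compares two documents with cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy hypernym map (child -> parent); a stand-in for WordNet.
TOY_HYPERNYMS = {
    "beef": "meat",
    "pork": "meat",
    "meat": "food",
    "apple": "fruit",
    "fruit": "food",
}


def hypernym_chain(term, depth):
    """Return up to `depth` ancestors of `term` in the toy map."""
    chain = []
    current = term
    for _ in range(depth):
        parent = TOY_HYPERNYMS.get(current)
        if parent is None:
            break
        chain.append(parent)
        current = parent
    return chain


def concept_vector(tokens, depth=1):
    """Count terms, then add each term's count to its hypernyms."""
    term_counts = Counter(tokens)
    vector = Counter(term_counts)
    for term, count in term_counts.items():
        for ancestor in hypernym_chain(term, depth):
            vector[ancestor] += count
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Two toy documents that share no content word except "price".
doc_a = ["beef", "price"]
doc_b = ["pork", "price"]

plain = cosine(Counter(doc_a), Counter(doc_b))
enriched = cosine(concept_vector(doc_a), concept_vector(doc_b))
print(f"plain={plain:.3f} enriched={enriched:.3f}")
```

In this toy case the shared hypernym "meat" raises the similarity of the two documents, which is the effect a hypernymy-based representation aims for. A noun phrase detected by a chunker could be passed in as a single token in the same way, provided the background ontology has an entry for it.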