Finding and labelling semantic feature patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult, and the rapidly increasing volume of online documents creates a bottleneck in finding meaningful textual patterns. To address these issues, we propose an unsupervised document labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on a study of the ontological structure. The proposed approach was evaluated by comparison with typical machine learning methods, including SVMs, Rocchio, and kNN, with promising results.