Web news classification is an unsupervised learning task that is often accomplished with clustering methods. In traditional approaches, documents are first represented using the vector space model, where each vector generally consists of the keywords or phrases important to the document. The vectors are then clustered according to some (dis)similarity measure. Such methods often take little or no semantic information into account. In this paper, we present a semantics-based, event-driven approach. An event is represented by a 3-tuple, and each document is associated with a set of candidate events. These event sets are classified according to their semantic dissimilarity. A preliminary experiment on Chinese web news classification shows that the proposed approach is promising.
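The traditional pipeline described above (vector space representation followed by clustering on a dissimilarity measure) can be illustrated with a minimal sketch. This is the keyword-based baseline, not the event-based method proposed in the paper. The function names, the whitespace tokenizer, the choice of cosine dissimilarity, the single-link merge rule, and the threshold of 0.7 are all assumptions made for illustration.

```python
import math
from collections import Counter


def to_vector(text):
    """Map a document to a term-frequency vector (a bag of lowercase words)."""
    return Counter(text.lower().split())


def cosine_dissimilarity(u, v):
    """Return 1 - cosine similarity of two sparse term-frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: an empty document is maximally dissimilar
    return 1.0 - dot / (norm_u * norm_v)


def cluster(docs, threshold=0.7):
    """Single-link agglomerative clustering (an illustrative choice):
    repeatedly merge the closest pair of clusters until the smallest
    dissimilarity between any two clusters exceeds `threshold`."""
    vectors = [to_vector(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]

    def link(a, b):
        # Single link: distance between the closest members of a and b.
        return min(
            cosine_dissimilarity(vectors[i], vectors[j]) for i in a for j in b
        )

    while len(clusters) > 1:
        dist, x, y = min(
            (
                (link(a, b), x, y)
                for x, a in enumerate(clusters)
                for y, b in enumerate(clusters)
                if x < y
            ),
            key=lambda t: t[0],
        )
        if dist > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters
```

Calling `cluster` on a list of short news snippets returns groups of document indices. Because the vectors hold only surface keywords, two reports of the same event that share no vocabulary end up with dissimilarity 1.0, which is the lack of semantic information the abstract points to.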