TopCat: Data Mining for Topic Identification in a Text Corpus
PKDD '99: Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery
Recent years have witnessed an explosion in the availability of news articles on the World Wide Web. Although search engines have made these documents easier to locate, finding relevant articles still requires considerable effort from the user, since most search engine algorithms match keywords and do not take the content of the entire article into account. We propose a system that clusters articles based on their topics. More specifically, we have focused on applying text mining methods to help solve the problems faced by a media organization or public relations department.