Assigning Web News to Clusters

Authors:
Christos Bouras;Vassilis Tsogkas
Venue:
ICIW '10 Proceedings of the 2010 Fifth International Conference on Internet and Web Applications and Services
Year:
2010

Cited by (1):
i-JEN: visual interactive Malaysia crime news retrieval system
IVIC'11 Proceedings of the Second international conference on Visual informatics: sustaining research and innovations - Volume Part II

Abstract

The Web is overcrowded with news articles, an information source that is overwhelming in both volume and diversity. Assigning news articles to groups of similar items provides a powerful data mining and manipulation technique for topic discovery from text documents. In this paper, we investigate the application of a wide range of clustering algorithms and similarity measures to news articles originating from the Web, and compare their efficiency for use in an online Web news service. We also examine the effect of preprocessing on clustering. Our experiments showed that k-means, despite its simplicity, gives better aggregate results in terms of efficiency when preceded by data cleaning and normalization steps.
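As a rough illustration of the pipeline the abstract describes (cleaning, normalization, then k-means), the following minimal Python sketch clusters four toy headlines. It is not the authors' implementation: the stop-word list, the sample articles, the deterministic seeding rule, and all function names are assumptions made for this example, and cosine similarity stands in for whichever similarity measures the paper evaluates.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "after"}

def preprocess(text):
    """Data cleaning: lowercase, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def normalize(weights):
    """Scale a sparse term-weight mapping to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}

def cosine(u, v):
    """Dot product of two sparse vectors; equals cosine for unit-length inputs."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())

def centroid(members):
    """Unit-length centroid: sum the member vectors, then renormalize."""
    total = Counter()
    for vec in members:
        for term, weight in vec.items():
            total[term] += weight
    return normalize(total)

def kmeans(vectors, k, max_iter=20):
    """k-means with cosine similarity and deterministic seeding (assumed rule)."""
    # Seed with the first vector, then add the vector least similar
    # to the centroids chosen so far.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each article goes to its most similar centroid.
        new_labels = [
            max(range(k), key=lambda j: cosine(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: recompute each centroid from its current members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = centroid(members)
    return labels

# Hypothetical sample articles, invented for this sketch.
articles = [
    "The striker scored a late goal in the football match",
    "Football fans celebrate after the team won the match",
    "The central bank raised interest rates to fight inflation",
    "Inflation fears push the bank to raise interest rates again",
]
vectors = [normalize(Counter(preprocess(a))) for a in articles]
labels = kmeans(vectors, k=2)
print(labels)
```

On this toy input the two football headlines share one label and the two banking headlines share the other. A real news service would likely replace raw term frequencies with a weighting such as TF-IDF and add stemming to the cleaning step.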