Breaking news and events are often posted in the blogosphere before any media agency publishes them, which makes the blogosphere a valuable resource for news-related analysis. However, it is crucial to first filter out news-unrelated content such as personal diaries or advertising blogs. In addition, blogs differ in their level of emotionality or involvement, which biases the news information to a certain extent. In our work, we evaluate topic-independent stylometric features to classify blogs into news versus rest and to assess the emotionality of these blogs. We apply several text classifiers to determine the best-performing combination of features and algorithms. Our experiments reveal that, with simple style features, blogs can be classified into news versus rest and their emotionality can be assessed with accuracy values of almost 80%.
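To make the idea of topic-independent style features concrete, the following is a minimal sketch of how such features might be computed for a single blog post. The abstract does not list the actual feature set, so the function name `style_features`, the chosen features, and the `FIRST_PERSON` word list are all illustrative assumptions rather than the method evaluated in this work.

```python
import re
import string

# Hypothetical word list; the abstract does not name the features actually used.
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}


def style_features(text: str) -> dict[str, float]:
    """Compute a few illustrative topic-independent style features for one post."""
    # Words are runs of letters and apostrophes; sentences are split on . ! ?
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    # Guard against division by zero on empty input.
    n_words = max(len(words), 1)
    n_chars = max(len(text), 1)

    return {
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "avg_sentence_length": len(words) / max(len(sentences), 1),
        "exclamation_ratio": text.count("!") / n_chars,
        "punctuation_ratio": sum(c in string.punctuation for c in text) / n_chars,
        "first_person_ratio": sum(w.lower() in FIRST_PERSON for w in words) / n_words,
        "uppercase_ratio": sum(c.isupper() for c in text) / n_chars,
    }


if __name__ == "__main__":
    news = "The ministry announced new regulations on Monday. Officials expect implementation next year."
    diary = "OMG I love my new puppy!!! We had so much fun today!"
    for label, post in (("news", news), ("diary", diary)):
        print(label, style_features(post))
```

None of these features depends on the topic vocabulary of a post, which is the property the abstract emphasizes. In such a setup, the resulting feature vectors would be passed to whichever text classifiers are being compared.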