Carrot2 and language properties in web search results clustering
AWIC'03 Proceedings of the 1st international Atlantic web intelligence conference on Advances in web intelligence
Machine classification of Polish-language emails into user-specific folders is considered. We experimentally evaluate how different approaches to constructing data representations of emails affect classifier accuracy. Our results show that language processing techniques have a smaller influence than an appropriate selection of features, in particular those drawn from the email header or its attachments.