TnT: a statistical part-of-speech tagger
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Hierarchical Clustering Algorithms for Document Datasets
Data Mining and Knowledge Discovery
Application of stacked methods to part-of-speech tagging of polish
PPAM'09 Proceedings of the 8th international conference on Parallel processing and applied mathematics: Part I
Hi-index | 0.00 |
The document clustering is an important technique of Natural Language Processing (NLP). The paper presents performance of partitional and agglomerative algorithms applied to clustering large number of Polish newspaper articles. We investigate different representations of the documents. The focus of the paper is on the applicability of the Latent Semantic Analysis to such clustering for Polish.