When a very large volume of unstructured text documents expresses different opinions and the category of each document is unknown, clustering can help reveal the classes. The presented research dealt with almost two million opinions concerning customers' (dis)satisfaction with hotel services all over the world. The experiments investigated the automatic building of clusters representing positive and negative opinions. For the given high-dimensional sparse data, the aim was to find a suitable clustering algorithm together with its best settings: the similarity measure, the clustering-criterion function, the word representation, and the role of stemming. Because each opinion in the data was already labeled as positive or negative, it was possible to verify the effectiveness of the various algorithms and parameters. From the entropy viewpoint, the best results were obtained with k-means using the binary representation with the cosine similarity, idf, and the H2 criterion function, while stemming played no role.
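The entropy-based verification mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes one common definition: the class entropy inside each cluster, normalized by the logarithm of the number of classes and weighted by cluster size. The function name clustering_entropy and the toy labels are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


def clustering_entropy(cluster_ids, class_labels):
    """Size-weighted, normalized class entropy of a clustering.

    Assumed definition (not taken from the paper): for each cluster,
    compute the Shannon entropy of its class distribution, divide by
    log(number of classes) so the value lies in [0, 1], and average
    over clusters weighted by cluster size. Lower is better.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    n = len(cluster_ids)
    if n == 0:
        raise ValueError("at least one document is required")

    num_classes = len(set(class_labels))
    if num_classes < 2:
        # With a single class every cluster is trivially pure.
        return 0.0

    # Count how many documents of each class fall into each cluster.
    per_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        per_cluster[cluster][label] += 1

    total = 0.0
    for counts in per_cluster.values():
        size = sum(counts.values())
        # Shannon entropy of the class distribution in this cluster.
        h = -sum((k / size) * math.log(k / size) for k in counts.values())
        # Normalize and weight by the cluster's share of all documents.
        total += (size / n) * (h / math.log(num_classes))
    return total


if __name__ == "__main__":
    # Hypothetical toy data: six opinions, two clusters.
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["pos", "pos", "neg", "neg", "neg", "neg"]
    print(clustering_entropy(clusters, labels))
```

Under this definition a value of 0 means that every cluster contains opinions of only one class, while a value of 1 means that positive and negative opinions are evenly mixed in every cluster, so a lower entropy indicates a better match between the clusters and the known labels.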