Related terms clustering for enhancing the comprehensibility of web search results
DEXA'07 Proceedings of the 18th international conference on Database and Expert Systems Applications
Huge document collections can be organized according to textual similarities by the Self-Organizing Map (SOM) algorithm, when statistical representations of the textual contents are used as the feature vectors of the documents. In a practical experiment, we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. For the feature vectors, we selected 500-dimensional random projections of the weighted word histograms.
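
The pipeline described in the abstract (weighted word histogram, random projection to a lower dimension, then mapping onto a SOM) can be illustrated with a minimal sketch. It is not the authors' implementation. The Gaussian projection matrix, the scaling factor, the Gaussian neighbourhood function, and all function names and parameters (`make_projection`, `project`, `best_matching_unit`, `som_step`, `alpha`, `radius`) are assumptions for illustration, since the abstract states only that 500-dimensional random projections of weighted word histograms were used.

```python
import math
import random


def make_projection(vocab_size, out_dim, seed=0):
    """Build a vocab_size x out_dim random matrix, one row per vocabulary word.

    Gaussian entries are an assumption; the source does not say how the
    projection matrix was constructed.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(out_dim)]
            for _ in range(vocab_size)]


def project(histogram, projection):
    """Project a sparse weighted histogram {word_index: weight} to out_dim values."""
    out_dim = len(projection[0])
    vec = [0.0] * out_dim
    for word_index, weight in histogram.items():
        row = projection[word_index]
        for j in range(out_dim):
            vec[j] += weight * row[j]
    return vec


def best_matching_unit(vec, nodes):
    """Return the index of the node whose model vector is closest to vec."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(nodes)), key=lambda i: sqdist(vec, nodes[i]))


def som_step(vec, nodes, grid_width, alpha, radius):
    """One online SOM update on a rectangular grid stored in row-major order.

    The best-matching node and its grid neighbours move toward vec, weighted
    by a Gaussian neighbourhood function (an assumed, commonly used choice).
    """
    bmu = best_matching_unit(vec, nodes)
    bx, by = bmu % grid_width, bmu // grid_width
    for i, node in enumerate(nodes):
        x, y = i % grid_width, i // grid_width
        grid_d2 = (x - bx) ** 2 + (y - by) ** 2
        h = math.exp(-grid_d2 / (2.0 * radius ** 2))
        for j in range(len(node)):
            node[j] += alpha * h * (vec[j] - node[j])
    return bmu


if __name__ == "__main__":
    # Toy sizes; the experiment in the abstract used 500 dimensions
    # and a 1,002,240-node map.
    projection = make_projection(vocab_size=1000, out_dim=50, seed=42)
    document = {3: 1.5, 17: 0.7, 256: 2.0}  # hypothetical weighted word counts
    features = project(document, projection)
    nodes = [[0.0] * 50 for _ in range(12)]  # 4 x 3 grid of model vectors
    winner = som_step(features, nodes, grid_width=4, alpha=0.5, radius=1.0)
    print("document mapped to node", winner)
```

Storing the histogram as a sparse dictionary keeps the projection cost proportional to the number of distinct words in a document, not to the vocabulary size, which matters when the collection runs to millions of abstracts.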