Document clustering algorithms usually use the vector space model (VSM) as their underlying model for document representation. The VSM assumes that terms are independent and therefore ignores any semantic relations between them. As a result, documents are mapped to a space in which the proximity between document vectors does not reflect their true semantic similarity. In this paper, we propose the use of semantic kernels based on term-term correlations to improve the effectiveness of document clustering algorithms. These kernels measure the proximity between documents based on how their terms are statistically correlated. We analyze semantic kernels that capture different aspects of the correlations between terms, and evaluate them experimentally on several benchmark data sets. The results show that the proposed method achieves a significant improvement in document clustering over the VSM.
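To make the contrast between the VSM and a semantic kernel concrete, the following is a minimal sketch in Python. It is not the specific kernels analyzed in the paper: it assumes one simple, well-known choice of term-term correlation matrix, the co-occurrence matrix S = XᵀX (as in the generalized vector space model), and the toy corpus, the function names, and the cosine normalization are all illustrative assumptions.

```python
import numpy as np


def cosine_normalize(K):
    """Rescale a kernel matrix so that every document has unit self-similarity."""
    d = np.sqrt(np.diag(K))
    d[d == 0] = 1.0  # avoid division by zero for empty documents
    return K / np.outer(d, d)


def vsm_kernel(X):
    """Plain VSM kernel: inner products of document-term vectors."""
    return X @ X.T


def semantic_kernel(X):
    """Semantic kernel K = X S X^T with an assumed term-term matrix S.

    Here S = X^T X counts how often two terms co-occur in the same
    document. This is one illustrative choice, not the paper's method.
    """
    S = X.T @ X
    return X @ S @ X.T


# Hypothetical toy corpus: rows are documents, columns are terms
# (car, automobile, banana).
X = np.array([
    [1, 0, 0],  # doc 0: "car"
    [0, 1, 0],  # doc 1: "automobile"
    [1, 1, 0],  # doc 2: "car automobile"
    [0, 0, 1],  # doc 3: "banana"
], dtype=float)

K_vsm = cosine_normalize(vsm_kernel(X))
K_sem = cosine_normalize(semantic_kernel(X))

print("VSM similarity (doc 0, doc 1):     ", K_vsm[0, 1])
print("Semantic similarity (doc 0, doc 1):", K_sem[0, 1])
print("Semantic similarity (doc 0, doc 3):", K_sem[0, 3])
```

In this toy example, documents 0 and 1 share no terms, so the VSM treats them as unrelated, while the semantic kernel assigns them a positive similarity because "car" and "automobile" co-occur in document 2. The resulting kernel matrix could then be passed to any kernel-based clustering algorithm.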