Discovering word senses from text using random indexing

Authors:
Niladri Chatterjee;Shiwali Mohan
Affiliations:
Department of Mathematics, Indian Institute of Technology Delhi, New Delhi, India;Yahoo! Research and Development India, Bangalore, India
Venue:
CICLing'08 Proceedings of the 9th international conference on Computational linguistics and intelligent text processing
Year:
2008

Abstract

Random Indexing is a novel technique for dimensionality reduction when building a Word Space model from a given text. This paper explores the application of Random Indexing to discovering word senses from text. The words appearing in the text are mapped into a multi-dimensional Word Space using Random Indexing, and the geometric distance between words is used as an indicator of their semantic similarity. The soft Clustering by Committee (CBC) algorithm is used to group similar words. The present work shows that the Word Space model can be used effectively to determine the similarity index required for clustering. The approach does not require parsers, lexicons, or any other resources traditionally used in word sense disambiguation. The proposed approach has been applied to the TASA corpus, and encouraging results have been obtained.
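As an illustration of the general idea behind Random Indexing, the following is a minimal sketch and not the authors' implementation. It assumes the commonly described formulation in which each word receives a sparse random index vector and a word's context vector is the sum of the index vectors of its neighbours; cosine similarity stands in for the geometric measure mentioned in the abstract. The names `make_index_vector`, `build_word_space`, and `cosine`, the toy sentences, and the values of `DIM`, `NONZERO`, and `WINDOW` are all hypothetical choices made for this example. The clustering step (CBC) is not shown.

```python
import math
import random
from collections import defaultdict

DIM = 64      # dimensionality of the reduced word space (assumed value)
NONZERO = 4   # number of +1/-1 entries per index vector (assumed value)
WINDOW = 1    # context words considered on each side (assumed value)


def make_index_vector(rng, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary vector: a few random +1/-1 entries, the rest 0."""
    vec = [0] * dim
    positions = rng.sample(range(dim), nonzero)
    for i, pos in enumerate(positions):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def build_word_space(sentences, dim=DIM, window=WINDOW, seed=0):
    """Build context vectors by summing the index vectors of neighbouring words."""
    rng = random.Random(seed)
    index = {}
    # First pass: assign every distinct word a fixed random index vector.
    for sentence in sentences:
        for word in sentence:
            if word not in index:
                index[word] = make_index_vector(rng, dim)
    # Second pass: accumulate neighbours' index vectors into context vectors.
    context = defaultdict(lambda: [0] * dim)
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                neighbour = index[sentence[j]]
                ctx = context[word]
                for d in range(dim):
                    ctx[d] += neighbour[d]
    return index, dict(context)


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy corpus (made up for illustration): "cat" and "dog" share their contexts.
sentences = [
    "the cat drinks milk".split(),
    "the dog drinks milk".split(),
]
index, space = build_word_space(sentences)
print(cosine(space["cat"], space["dog"]))
print(cosine(space["cat"], space["milk"]))
```

In this toy corpus "cat" and "dog" occur between the same neighbours, so their context vectors coincide, which is the kind of distributional evidence a clustering step could then use. A realistic setup would use a much larger dimensionality and corpus.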