Finding aliases on the web using latent semantic analysis

Authors:
Vinay Bhat;Tim Oates;Vishal Shanbhag;Charles Nicholas
Affiliations:
Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD;Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD;Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD;Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD
Venue:
Data & Knowledge Engineering - Special issue: WIDM 2002
Year:
2004

Citing 7
Cited 5

Automating the assignment of submitted manuscripts to reviewers. SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval

Personalized information delivery: an analysis of information filtering methods. Communications of the ACM - Special issue on information filtering

Class-based n-gram models of natural language. Computational Linguistics

Improving text retrieval for the routing problem using latent semantic indexing. SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval

Computational Methods for Intelligent Information Access. Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing

Introduction to Modern Information Retrieval

Recommending from content: preliminary results from an e-commerce experiment. CHI '00 Extended Abstracts on Human Factors in Computing Systems

Using recursive ART network to construction domain ontology based on term frequency and inverse document frequency. Expert Systems with Applications: An International Journal

The CONCUR framework for community maintenance of curated resources. Proceedings of the eighth ACM symposium on Document engineering

Automatic discovery of synonyms and lexicalizations from the Web. Proceedings of the 2005 conference on Artificial Intelligence Research and Development

Development of new techniques to improve web search. IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

The vector space models for finding co-occurrence names as aliases in Thai sports news. ACIIDS'10 Proceedings of the Second international conference on Intelligent information and database systems: Part I

Abstract

A common problem faced when gathering information from the web is the use of different names to refer to the same entity. For example, the city in India referred to as Bombay in some documents may be referred to as Mumbai in others because its name officially changed from the former to the latter in 1995. Multiplicity of names can cause relevant documents to be missed by search engines. Our goal is to develop an automated system that discovers additional names for an entity given just one of its names. Latent semantic analysis (LSA) is generally thought to be well-suited for this task [Numerical linear algebra with applications 3(4) (1996) 301]. We demonstrate empirically that under a broad range of circumstances LSA performs poorly, and describe a two-stage algorithm based on LSA that performs significantly better.