Mining context specific similarity relationships using the world wide web

Authors:
Dmitri Roussinov;Leon J. Zhao;Weiguo Fan
Affiliations:
Arizona State University, Tempe, AZ;University of Arizona, Tucson, AZ;Virginia Tech, Blacksburg, VA
Venue:
HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Year:
2005

Abstract

We have studied how a context-specific web corpus can be automatically created and mined to discover semantic similarity relationships between terms (words or phrases) from a given collection of documents (the target collection). These relationships between terms can be used to adjust the standard vector space representation so as to improve the accuracy of similarity computation between text documents in the target collection. Our experiments with a standard test collection (Reuters) show a reduction in similarity errors of up to 50%, twice the improvement obtained with other known techniques.
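
The abstract does not say how the vector space representation is adjusted, so the following is only a minimal sketch of one plausible reading: each document vector is expanded with weight for terms related to the ones it contains, and cosine similarity is then computed on the expanded vectors. The function names (`cosine`, `expand`), the additive expansion rule, and the similarity value 0.8 are illustrative assumptions, not the authors' method or values mined from the web.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def expand(vec, term_sim):
    """Add weight for related terms (assumed rule: similarity * original weight)."""
    out = dict(vec)
    for term, weight in vec.items():
        for related, s in term_sim.get(term, {}).items():
            out[related] = out.get(related, 0.0) + s * weight
    return out

# Hypothetical term-term similarities; in the paper's setting these would be
# mined from a context-specific web corpus.
term_sim = {"car": {"automobile": 0.8}, "automobile": {"car": 0.8}}

doc_a = {"car": 1.0, "price": 1.0}
doc_b = {"automobile": 1.0, "price": 1.0}

plain = cosine(doc_a, doc_b)
adjusted = cosine(expand(doc_a, term_sim), expand(doc_b, term_sim))
print(f"plain={plain:.3f} adjusted={adjusted:.3f}")
```

In this toy case the two documents share only the term "price" in the plain representation, while the expanded vectors also overlap on "car" and "automobile", so the adjusted similarity is higher.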