Concept-Based Information Retrieval Using Explicit Semantic Analysis

Authors:
Ofer Egozi; Shaul Markovitch; Evgeniy Gabrilovich
Affiliations:
Technion - Israel Institute of Technology; Technion - Israel Institute of Technology; Technion - Israel Institute of Technology
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2011

Abstract

Information retrieval systems traditionally rely on textual keywords to index and retrieve documents. Keyword-based retrieval may return inaccurate and incomplete results when different keywords are used to describe the same concept in the documents and in the queries. Furthermore, the relationship between these related keywords may be semantic rather than syntactic, so capturing it requires access to comprehensive human world knowledge. Concept-based retrieval methods have attempted to tackle these difficulties by using manually built thesauri, by relying on term co-occurrence data, or by extracting latent word relationships and concepts from a corpus. In this article we introduce a new concept-based retrieval approach based on Explicit Semantic Analysis (ESA), a recently proposed method that augments keyword-based text representation with concept-based features automatically extracted from massive human knowledge repositories such as Wikipedia. Our approach generates new text features automatically, and we have found that high-quality feature selection becomes crucial in this setting to make the retrieval more focused. However, because labeled data is lacking, traditional feature selection methods cannot be used; we therefore propose new methods that use self-generated labeled training data. The resulting system is evaluated on several TREC datasets and shows superior performance over previous state-of-the-art results.
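
The idea of representing text by concepts drawn from a knowledge repository can be illustrated with a minimal sketch. It is not the authors' system: the three-entry `CONCEPTS` dictionary is a hypothetical stand-in for Wikipedia articles, the TF-IDF weighting and all function names are assumptions made for illustration, and the feature selection step described in the abstract is not covered. The sketch only shows how a query and a document that share no keywords can still match once both are mapped to concept vectors.

```python
import math
from collections import Counter

# Hypothetical toy "knowledge repository": concept name -> article text.
# A real ESA index would be built from a large corpus such as Wikipedia.
CONCEPTS = {
    "Automobile": "car engine wheel road driver vehicle car",
    "Banking": "bank money loan interest account deposit",
    "River": "river water bank flow stream fish",
}


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; no stemming or stop words)."""
    return text.lower().split()


def build_inverted_index(concepts):
    """Map each word to {concept: TF-IDF weight} over the concept articles."""
    n = len(concepts)
    tf = {name: Counter(tokenize(body)) for name, body in concepts.items()}
    df = Counter(word for counts in tf.values() for word in counts)
    index = {}
    for name, counts in tf.items():
        for word, count in counts.items():
            idf = math.log(n / df[word])
            if idf > 0:  # skip words that occur in every concept article
                index.setdefault(word, {})[name] = count * idf
    return index


def concept_vector(text, index):
    """Sum the concept weights of every word in the text."""
    vec = Counter()
    for word, count in Counter(tokenize(text)).items():
        for concept, weight in index.get(word, {}).items():
            vec[concept] += count * weight
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    index = build_inverted_index(CONCEPTS)
    query = concept_vector("vehicle driver", index)
    doc = concept_vector("car on the road", index)
    # The query and document share no keywords but map to the same concept.
    print(cosine(query, doc))
```

In this toy setting an ambiguous word such as "bank" activates both the "Banking" and "River" concepts, which hints at why the abstract stresses feature selection: without it, weakly related concepts can dilute the representation.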