Building effective queries in natural language information retrieval

Authors:
Tomek Strzalkowski;Fang Lin;Jose Perez-Carballo;Jin Wang
Affiliations:
GE Corporate Research & Development, Niskayuna, NY;GE Corporate Research & Development, Niskayuna, NY;Rutgers University, New Brunswick, NJ;GE Corporate Research & Development, Niskayuna, NY
Venue:
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Year:
1997

Abstract

In this paper we report on our natural language information retrieval (NLIR) project as related to the recently concluded 5th Text Retrieval Conference (TREC-5). The main thrust of this project is to use natural language processing techniques to enhance the effectiveness of full-text document retrieval. One of our goals was to demonstrate that robust, if relatively shallow, NLP can help to derive a better representation of text documents for statistical search. Recently, we have turned our attention away from text representation issues and toward query development problems. While our NLIR system still performs extensive natural language processing to extract phrasal and other indexing terms, our focus has shifted to the problems of building effective search queries. Specifically, we are interested in query construction that uses words, sentences, and entire passages to expand initial topic specifications in an attempt to cover their various angles, aspects, and contexts. Based on our earlier results indicating that NLP is more effective with long, descriptive queries, we allowed long passages from related documents to be liberally imported into the queries. This method appears to have produced a dramatic improvement in the performance of the two statistical search engines that we tested (Cornell's SMART and NIST's Prise), boosting the average precision by at least 40%. In this paper we discuss both manual and automatic procedures for query expansion within a new stream-based information retrieval model.
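
The idea of importing whole passages from related documents into a query can be illustrated with a small sketch. This is not the authors' system: it leaves out the NLP-derived phrasal terms and the stream-based model, and it substitutes a plain tf-idf cosine ranker for SMART or Prise. The function names (`tokenize`, `expand_query`, `rank`) and the toy documents are assumptions made only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def expand_query(topic, passages):
    """Append whole passages to the topic text (hypothetical helper)."""
    return " ".join([topic] + list(passages))


def rank(query, docs):
    """Return document indices ordered by tf-idf cosine similarity to the query."""
    doc_tokens = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in doc_tokens for t in set(toks))

    def vector(tokens):
        # Terms unseen in the collection are dropped; idf is always positive.
        tf = Counter(t for t in tokens if t in df)
        return {t: c * math.log((n + 1) / df[t]) for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = vector(tokenize(query))
    scores = [cosine(q, vector(toks)) for toks in doc_tokens]
    return sorted(range(n), key=lambda i: scores[i], reverse=True)


# Toy collection (invented for this sketch).
docs = [
    "the reactor used nuclear fuel and produced radioactive waste",
    "the bakery sold fresh bread and cakes",
    "nuclear waste disposal requires storage of radioactive material underground",
]
topic = "waste disposal"
passages = ["storage of radioactive material underground"]
expanded = expand_query(topic, passages)
ranking = rank(expanded, docs)
```

In this sketch the expanded query is simply the topic followed by the imported passage text, so any term the passage shares with a document contributes to that document's score even when the original topic never mentioned it.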