Robust text processing in automated information retrieval

Authors:
Tomek Strzalkowski
Affiliations:
New York University, New York, NY
Venue:
ANLC '94 Proceedings of the fourth conference on Applied natural language processing
Year:
1994

Citing 5
Cited 6

Natural Language Information Processing: A Computer Grammar of English and Its Applications

Natural Language Information Processing: A Computer Grammar of English and Its Applications
Robust text processing in automated information retrieval

ANLC '94 Proceedings of the fourth conference on Applied natural language processing
TTP: a fast and robust parser for natural language

COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 1
The importance of proper weighting methods

HLT '93 Proceedings of the workshop on Human Language Technology
Query processing for retrieval from large text bases

HLT '93 Proceedings of the workshop on Human Language Technology

Retrieval from captioned image databases using natural language processing

Proceedings of the ninth international conference on Information and knowledge management
Automatic text categorization in terms of genre and author

Computational Linguistics
A natural language system for retrieval of captioned images

Natural Language Engineering
Robust text processing in automated information retrieval

ANLC '94 Proceedings of the fourth conference on Applied natural language processing
A two-stage decision model for information filtering

Decision Support Systems
A social network-empowered research analytics framework for project selection

Decision Support Systems

Quantified Score

Hi-index	0.00

Abstract

We report on the results of a series of experiments with a prototype text retrieval system that uses relatively advanced natural language processing techniques to enhance the effectiveness of statistical document retrieval. In this paper we show that large-scale natural language processing (hundreds of millions of words and more) is not only required for better retrieval but is also feasible, given appropriate resources. In particular, we demonstrate that the use of syntactic compounds in the representation of database documents as well as in user queries, coupled with an appropriate term weighting strategy, can considerably improve the effectiveness of retrospective search. The experiments reported here were conducted on the TIPSTER database in connection with the Text REtrieval Conference (TREC) series.