Fast statistical parsing of noun phrases for document indexing

Authors:
Chengxiang Zhai
Affiliations:
Carnegie Mellon University, Pittsburgh, PA
Venue:
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
Year:
1997

Abstract

Information Retrieval (IR) is an important application area of Natural Language Processing (NLP) where one encounters the genuine challenge of processing large quantities of unrestricted natural language text. While much effort has been made to apply NLP techniques to IR, very few NLP techniques have been evaluated on a document collection larger than several megabytes. Many NLP techniques are simply not efficient enough, and not robust enough, to handle a large amount of text. This paper proposes a new probabilistic model for noun phrase parsing, and reports on the application of such a parsing technique to enhance document indexing. The effectiveness of using syntactic phrases provided by the parser to supplement single words for indexing is evaluated with a 250-megabyte document collection. The experimental results show that supplementing single words with syntactic phrases for indexing consistently and significantly improves retrieval performance.