Corpus-based learning of compound noun indexing

Authors:
Byung-Kwan Kwak;Jee-Hyub Kim;Geunbae Lee;Jung Yun Seo
Affiliations:
Pohang University of Science & Technology (POSTECH);Pohang University of Science & Technology (POSTECH);Pohang University of Science & Technology (POSTECH);Sogang University
Venue:
RANLPIR '00 Proceedings of the ACL-2000 workshop on Recent advances in natural language processing and information retrieval: held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics - Volume 11
Year:
2000


Abstract

In this paper, we present a corpus-based learning method that indexes diverse types of compound nouns using rules automatically extracted from a large tagged corpus. We develop an efficient way to extract the compound noun indexing rules automatically and run extensive experiments to evaluate them. The automatic learning method performs about as well as the manual linguistic approach, but it is more portable and requires no human effort. We also evaluate seven different filtering methods for both effectiveness and efficiency, and present a new method that addresses compound noun over-generation and data sparseness in statistical compound noun processing.
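
The abstract does not give the rule format or the learning procedure, so the following is only a rough illustration of what "rules extracted from a tagged corpus" could look like, not the authors' method. It assumes a rule is a part-of-speech tag pattern, that a list of known compound nouns is available as training evidence, and that a pattern is kept once it is seen often enough. The names `learn_rules`, `index_compounds`, `known_compounds`, and `min_count`, and the tag set, are hypothetical.

```python
from collections import Counter

def learn_rules(tagged_sentences, known_compounds, max_len=3, min_count=2):
    """Collect POS-tag patterns that cover known compound nouns.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    known_compounds: set of word tuples treated as training evidence (assumption).
    Returns the set of tag patterns seen at least min_count times.
    """
    counts = Counter()
    for sent in tagged_sentences:
        for n in range(2, max_len + 1):
            for i in range(len(sent) - n + 1):
                window = sent[i:i + n]
                words = tuple(w for w, _ in window)
                if words in known_compounds:
                    counts[tuple(t for _, t in window)] += 1
    return {pattern for pattern, c in counts.items() if c >= min_count}

def index_compounds(tagged_sentence, rules, max_len=3):
    """Return word tuples in the sentence whose tag sequence matches a rule."""
    found = []
    for n in range(2, max_len + 1):
        for i in range(len(tagged_sentence) - n + 1):
            window = tagged_sentence[i:i + n]
            if tuple(t for _, t in window) in rules:
                found.append(tuple(w for w, _ in window))
    return found

# Toy tagged corpus (hypothetical data and tag names).
corpus = [
    [("information", "NN"), ("retrieval", "NN"), ("works", "VB")],
    [("noun", "NN"), ("indexing", "NN"), ("helps", "VB")],
    [("big", "JJ"), ("data", "NN")],
]
known = {("information", "retrieval"), ("noun", "indexing")}

rules = learn_rules(corpus, known)
new_sentence = [("compound", "NN"), ("noun", "NN"), ("is", "VB")]
terms = index_compounds(new_sentence, rules)
print(rules)
print(terms)
```

In this toy setup the learned patterns are applied to unseen text to propose compound index terms. A pattern-only matcher like this proposes every matching sequence, which is the over-generation problem the abstract says the filtering methods are meant to address.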