Intuitively, words that form phrases describe content more precisely than the same words treated as a sequence of independent keywords. Yet the evidence that phrases make information retrieval more effective is inconclusive. This paper isolates a neglected class of phrases that is abundant in communication, has an established theoretical foundation, and shows promise as an effective expression of the user's information need: the noun-noun compound (NNC). In an experiment, a variety of meaningful NNCs were used to isolate relevant passages in a large and varied corpus. In a first pass, passages were retrieved based on the textual proximity of the words or their semantic peers. A second pass retained only passages containing a syntactically coherent structure equivalent to the original NNC. This second pass showed a dramatic increase in precision. These preliminary results support our intuition about phrases in the special but very productive case of NNCs.
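The two-pass procedure can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `peers` dictionary standing in for semantic peers, the proximity window of five tokens, and the word-order patterns used as a crude substitute for syntactic analysis are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def variants(word, peers):
    """Return the word together with its (hypothetical) semantic peers."""
    return {word} | set(peers.get(word, ()))


def first_pass(passages, modifier, head, peers, window=5):
    """Keep passages where modifier and head (or peers) occur close together."""
    mod_set, head_set = variants(modifier, peers), variants(head, peers)
    hits = []
    for passage in passages:
        toks = tokenize(passage)
        m_pos = [i for i, t in enumerate(toks) if t in mod_set]
        h_pos = [i for i, t in enumerate(toks) if t in head_set]
        if any(abs(m - h) <= window for m in m_pos for h in h_pos):
            hits.append(passage)
    return hits


def second_pass(passages, modifier, head, peers, linkers=("of", "for")):
    """Keep passages with a structure resembling the NNC.

    Assumption: "modifier head" adjacency or "head of/for modifier" is used
    here as a rough proxy for a syntactically coherent equivalent structure.
    """
    mod_set, head_set = variants(modifier, peers), variants(head, peers)
    kept = []
    for passage in passages:
        toks = tokenize(passage)
        for i in range(len(toks) - 1):
            compound = toks[i] in mod_set and toks[i + 1] in head_set
            paraphrase = (
                i + 2 < len(toks)
                and toks[i] in head_set
                and toks[i + 1] in linkers
                and toks[i + 2] in mod_set
            )
            if compound or paraphrase:
                kept.append(passage)
                break
    return kept


# Illustrative data only; not taken from the paper's corpus.
passages = [
    "The water pump failed during the test.",
    "We had to pump water out of the basement.",
    "A pump for water was installed.",
    "The engine was replaced.",
]
peers = {"pump": ["pumps"]}  # hypothetical semantic peers

candidates = first_pass(passages, "water", "pump", peers)
kept = second_pass(candidates, "water", "pump", peers)
```

In this toy data the first pass admits the passage about pumping water out of a basement because the two words are adjacent, while the second pass drops it because the words do not stand in the modifier-head relation of the compound "water pump".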