The vocabulary problem in human-system communication
Communications of the ACM
Needs for research in indexing
Journal of the American Society for Information Science
Communications of the ACM
Translating collocations for bilingual lexicons: a statistical approach
Computational Linguistics
Exploiting clustering and phrases for context-based information retrieval
Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Browsing in digital libraries: a phrase-based approach
DL '97 Proceedings of the second ACM international conference on Digital libraries
Comparing noun phrasing techniques for use with medical digital library tools
Journal of the American Society for Information Science - Special topic issue on digital libraries: part 2
Improving browsing in digital libraries with keyphrase indexes
Decision Support Systems - From information retrieval to knowledge management: enabling technologies and best practices
A usability assessment of online indexing structures in the networked environment
Journal of the American Society for Information Science
Automatic abstracting and indexing—survey and recommendations
Communications of the ACM
Indexing Books
Evaluation of automatically identified index terms for browsing electronic documents
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Building effective queries in natural language information retrieval
ANLC '97 Proceedings of the fifth conference on Applied natural language processing
An automated system that assists in the generation of document indexes
Natural Language Engineering
Noun-phrase analysis in unrestricted text for information retrieval
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
HLT '93 Proceedings of the workshop on Human Language Technology
Compound descriptors in context: a matching function for classifications and thesauri
Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
A prototype multilingual document browser for ancient Greek texts
The New Review of Hypermedia and Multimedia
DOM-based content extraction of HTML documents
WWW '03 Proceedings of the 12th international conference on World Wide Web
Methods for precise named entity matching in digital collections
Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries
Automating Content Extraction of HTML Documents
World Wide Web
The influence of indexing practices and weighting algorithms on document spaces
Journal of the American Society for Information Science and Technology
Improving XML search by generating and utilizing informative result snippets
ACM Transactions on Database Systems (TODS)
Using natural language processing to assist the visually handicapped in writing compositions
AI'06 Proceedings of the 19th international conference on Advances in Artificial Intelligence: Canadian Society for Computational Studies of Intelligence
The potential of automatically generated indexes for information access has been recognized for several decades (e.g., Bush 1945 [2], Edmundson and Wyllys 1961 [4]), but the quantity of text and the ambiguity of natural language have made progress on this task more difficult than was originally foreseen. Recently, a body of work on the development of interactive systems to support phrase browsing has begun to emerge (e.g., Anick and Vaithyanathan 1997 [1], Gutwin et al. [10], Nevill-Manning et al. 1997 [17], Godby and Reighart 1998 [9]). In this paper, we consider two issues related to the use of automatically identified phrases as index terms in a dynamic text browser (DTB), a user-centered system for navigating and browsing index terms: 1) What criteria are appropriate for assessing the usefulness of automatically identified index terms? and 2) Is the quality of the terms identified by automatic indexing high enough to provide useful access to document content? The terms that we focus on have been identified by LinkIT, a software tool for identifying significant topics in text [7]. Over 90% of the terms identified by LinkIT are coherent and therefore merit inclusion in the dynamic text browser. Terms identified by LinkIT are input to Intell-Index, a prototype DTB that supports interactive navigation of index terms. The distinction between phrasal heads (the most important words in a coherent term) and modifiers serves as the basis for a hierarchical organization of terms. This linguistically motivated structure helps users browse and disambiguate terms efficiently. We conclude that the approach to information access discussed in this paper is very promising, and also that there is much room for further research. In the meantime, this research contributes to establishing a solid foundation for assessing the usability of terms in phrase browsing applications.
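The head/modifier organization described above can be illustrated with a small sketch. This is not LinkIT's or Intell-Index's actual method, which the abstract does not specify. It assumes, as a rough heuristic for simple English noun phrases, that the head is the final word of each term. The function name `group_by_head` and the sample terms are hypothetical.

```python
from collections import defaultdict


def group_by_head(terms):
    """Group noun-phrase terms under their head word.

    Assumption (not from the paper): the head is the last word of the
    phrase, and everything before it is treated as the modifier string.
    """
    index = defaultdict(list)
    for term in terms:
        words = term.lower().split()
        if not words:
            continue  # skip empty strings
        head = words[-1]
        modifiers = " ".join(words[:-1])  # "" when the term is a bare head
        index[head].append(modifiers)
    # Sort heads and de-duplicate modifiers for a stable browsing order.
    return {head: sorted(set(mods)) for head, mods in sorted(index.items())}


# Hypothetical sample terms, chosen only for illustration.
terms = ["digital library", "medical digital library", "index term", "library"]
hierarchy = group_by_head(terms)

for head, mods in hierarchy.items():
    print(head)
    for mod in mods:
        if mod:
            print(f"    {mod} {head}")
```

Grouping by head lets a user who looks up "library" see the phrases that narrow it, such as "digital library" and "medical digital library", together under one entry. This is the kind of disambiguation the hierarchical structure is meant to support.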