Construction is one of the most information-intensive industries; professionals typically access industry information resources on a daily basis. The major constraints on the future development of a formally encoded knowledge base are fragmented information sources and the lack of comprehensive classification schemes. In agreement with earlier research, and drawing on over twenty years of practical experience, we have found that manually categorising a large collection of documents is error-prone, time-consuming and expensive, and produces inconsistent results. Attempts in recent years to automate this task using state-of-the-art categorisation techniques have also proven wanting because of the shallow internal representation in the document set. In this paper we describe an approach that overcomes this problem by combining the benefits of automated categorisation with efficient and effective use of human judgement. We present a tool based on this philosophy that uses machine learning, information retrieval and information visualisation techniques to help bibliographers analyse the document collection. By analysing the content of the unstructured documents, the tool suggests to the bibliographer keywords, subject headings and candidate documents to include under those subject headings. This greatly increases the speed at which bibliographers can process the documents, improves the accuracy of their work and results in a categorisation system that reflects the terminology and relationships held in the actual knowledge base. This work is now being applied to enhance one of the market-leading retrieval products in the construction industry.
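The abstract does not say which algorithms the tool uses, so the following is only a minimal sketch of what a suggestion step of this kind could look like, assuming plain TF-IDF term weighting and cosine similarity. The function names (`suggest_keywords`, `candidate_documents`) and the sample documents are hypothetical and chosen purely for illustration; the suggestions are meant to be accepted or rejected by a human, in keeping with the approach described above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenised = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for toks in tokenised for term in set(toks))
    vectors = []
    for toks in tokenised:
        tf = Counter(toks)
        vectors.append(
            {t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def suggest_keywords(vector, k=3):
    """Top-k weighted terms of one document, offered as keyword candidates."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, weight in ranked[:k] if weight > 0]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_documents(seed_index, vectors, k=1):
    """Indices of the k documents most similar to a document that a
    bibliographer has already filed under a subject heading."""
    scored = [
        (cosine(vectors[seed_index], v), i)
        for i, v in enumerate(vectors)
        if i != seed_index
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:k]]


# Hypothetical sample collection (not taken from the paper).
docs = [
    "concrete slab reinforcement and concrete curing",
    "curing times for reinforced concrete slab",
    "fire safety regulations for timber stairs",
    "timber frame fire resistance",
]
vectors = tfidf_vectors(docs)
print(suggest_keywords(vectors[2]))      # keyword candidates for document 2
print(candidate_documents(0, vectors))   # documents to file alongside document 0
```

The sketch deliberately omits stop-word removal, stemming and any learned classifier; terms that occur in every document simply receive a weight of zero and are never suggested.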