I3R: a new approach to the design of document retrieval systems
Journal of the American Society for Information Science
Recent trends in hierarchic document clustering: a critical review
Information Processing and Management: an International Journal
A look back and a look forward
SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval
Word association norms, mutual information, and lexicography
Computational Linguistics
Semantic Road Maps for Literature Searchers
Journal of the ACM (JACM)
Information Retrieval
Theory of Indexing
Utility of automatic classification systems for information storage and retrieval
Utility of automatic classification systems for information storage and retrieval
The SMART Retrieval System—Experiments in Automatic Document Processing
The SMART Retrieval System—Experiments in Automatic Document Processing
Use of syntactic context to produce term association lists for text retrieval
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval
Adjusting the performance of an information retrieval system
CIKM '93 Proceedings of the second international conference on Information and knowledge management
Adapting a full-text information retrieval system to the computer troubleshooting domain
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Automatic concept classification of text from electronic meetings
Communications of the ACM
Collaborative Learning of Term-Based Concepts for Automatic Query Expansion
ECML '02 Proceedings of the 13th European Conference on Machine Learning
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V
Automatic word sense discrimination
Computational Linguistics - Special issue on word sense disambiguation
Liveclassifier: creating hierarchical text classifiers through web corpora
Proceedings of the 13th international conference on World Wide Web
Variable precision concepts and its applications for query expansion
ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications
This informal note was prompted by discussions and questions at the 1990 AAAI Spring Symposium on Text-Based Intelligent Systems (cf. Jacobs 1990). There is a growing interest in access to, and the use of, large-scale full-text databases for a variety of purposes, and in the application of classification methods to organise the mass of data involved (see e.g. Church and Hanks 1990). A good deal of work has been done in this field in the past, but it is little known, and some of the early research literature is not very accessible. Classification is an area in which it is easy to make plausible but mistaken assumptions, and as this certainly holds for classification in retrieval, there is a good deal that can be usefully learnt from past experience, most of which was hard won from careful thought and grinding experiment. This paper is intended as an introduction to this initial work on automatic classification, to help those now becoming interested in classification to avoid unnecessarily repeating heavy effort or, more especially, reinventing square wheels. It should also be noted that automatic classification and related (e.g. seriation) methods have been extensively developed for biological applications in particular, but have been more variously applied, and that much of this work may be relevant in the broad area of machine learning.
It must be emphasised that as this paper is focussed on early work on automatic classification, particularly for information retrieval, and is designed primarily to lead into this research and its literature, it does not attempt a critical evaluation of the overall results established by now, or of the current state of the art. However, it should be pointed out that in the retrieval context in general, as opposed to the wider one of classification as a whole, there has been comparatively little work since the seventies, largely for the reasons indicated in the paper.
More recent work in any case refers heavily to earlier research, so this note can be taken as an entry point to the research of the last decade, for which some references are given at the end of the note.