Notes and references on early automatic classification work

Authors:
Karen Sparck Jones
Affiliations:
-
Venue:
ACM SIGIR Forum
Year:
1991

Citing 9

I3R: a new approach to the design of document retrieval systems
Journal of the American Society for Information Science

Recent trends in hierarchic document clustering: a critical review
Information Processing and Management: an International Journal

A look back and a look forward
SIGIR '88 Proceedings of the 11th annual international ACM SIGIR conference on Research and development in information retrieval

Word association norms, mutual information, and lexicography
Computational Linguistics

Semantic Road Maps for Literature Searchers
Journal of the ACM (JACM)

Information Retrieval

Theory of Indexing

Utility of automatic classification systems for information storage and retrieval

The SMART Retrieval System—Experiments in Automatic Document Processing

Cited 10

Use of syntactic context to produce term association lists for text retrieval
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval

Concept based query expansion
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval

Adjusting the performance of an information retrieval system
CIKM '93 Proceedings of the second international conference on Information and knowledge management

Adapting a full-text information retrieval system to the computer troubleshooting domain
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval

Automatic concept classification of text from electronic meetings
Communications of the ACM

Collaborative Learning of Term-Based Concepts for Automatic Query Expansion
ECML '02 Proceedings of the 13th European Conference on Machine Learning

Improving Document Retrieval by Automatic Query Expansion Using Collaborative Learning of Term-Based Concepts
DAS '02 Proceedings of the 5th International Workshop on Document Analysis Systems V

Automatic word sense discrimination
Computational Linguistics - Special issue on word sense disambiguation

Liveclassifier: creating hierarchical text classifiers through web corpora
Proceedings of the 13th international conference on World Wide Web

Variable precision concepts and its applications for query expansion
ICIC'09 Proceedings of the Intelligent computing 5th international conference on Emerging intelligent computing technology and applications

Abstract

This informal note was prompted by discussions and questions at the 1990 AAAI Spring Symposium on Text-Based Intelligent Systems (cf Jacobs 1990). There is a growing interest in access to, and the use of, large scale full-text databases for a variety of purposes, and in the application of classification methods to organise the mass of data involved (see e.g. Church and Hanks 1990). A good deal of work has been done in this field in the past, but it is little known, and some of the early research literature is not very accessible. Classification is an area in which it is easy to make plausible but mistaken assumptions, and as this certainly holds for classification in retrieval, there is a good deal that can be usefully learnt from past experience, most of which was hard won from careful thought and grinding experiment. This paper is intended as an introduction to this initial work on automatic classification, to help those now becoming interested in classification to avoid unnecessarily repeating heavy effort or, more especially, reinventing square wheels. It should also be noted that automatic classification and related (e.g. seriation) methods have been extensively developed for biological applications in particular, but have been more variously applied, and that much of this work may be relevant in the broad area of machine learning.

It must be emphasised that as this paper is focussed on early work on automatic classification, particularly for information retrieval, and is designed primarily to lead into this research and its literature, it does not attempt a critical evaluation of the overall results established by now, or of the current state of the art. However it should be pointed out that in the retrieval context in general, as opposed to the wider one of classification as a whole, there has been comparatively little work since the seventies, largely for the reasons indicated in the paper. More recent work in any case refers heavily to earlier research, so this note can be taken as an entry point to the research of the last decade for which some references are given at the end of the note.