Text mining

Authors: Ronen Feldman
Affiliations: Director, Data Mining Laboratory, Department of Mathematics and Computer Science, Bar-Ilan University, Ramat-Gan, Israel
Venue: Handbook of data mining and knowledge discovery
Year: 2002

Abstract

The information age is characterized by rapid growth in the amount of information available in electronic media. Traditional data handling methods are not adequate to cope with this flood of information. Knowledge discovery in databases (KDD) is a new paradigm that focuses on automatic or semiautomatic exploration of large amounts of data and on the discovery of relevant and interesting patterns within them. While most work on KDD is concerned with structured databases, it is clear that this paradigm is also required for handling the huge amount of information that is available only in unstructured textual form. To apply KDD to texts, it is necessary to impose on the data a structure rich enough to allow for interesting KDD operations. On the other hand, we must consider the severe limitations of current text processing technology and define rather simple structures that can be extracted from texts fairly automatically and at a reasonable cost. One option is to use a text categorization/term extraction paradigm to annotate text articles with meaningful concepts that are organized in a hierarchical structure. This relatively simple annotation is rich enough to provide the basis for a novel KDD framework, enabling data summarization, exploration of interesting patterns, and trend analysis.
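
The annotation scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the concept hierarchy, the tiny document collection, and the function names (`ancestors`, `concept_distribution`, `cooccurrence`, `trend`) are all hypothetical, chosen only to show how documents labeled with concepts from a hierarchy could support summarization, pattern exploration, and trend analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept hierarchy: child -> parent (None marks a root).
PARENT = {
    "countries": None,
    "topics": None,
    "israel": "countries",
    "japan": "countries",
    "trade": "topics",
    "crude_oil": "topics",
}

# Hypothetical annotated collection: (document id, year, concept labels).
DOCS = [
    ("d1", 1987, {"japan", "trade"}),
    ("d2", 1987, {"japan", "crude_oil"}),
    ("d3", 1988, {"israel", "trade"}),
    ("d4", 1988, {"japan", "trade"}),
]


def ancestors(concept):
    """Return the concept followed by all of its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def concept_distribution(docs):
    """Summarization: count documents per concept, rolled up the hierarchy."""
    counts = Counter()
    for _, _, labels in docs:
        covered = set()
        for label in labels:
            covered.update(ancestors(label))
        counts.update(covered)  # each document counts once per concept
    return counts


def cooccurrence(docs):
    """Pattern exploration: count documents per unordered pair of labels."""
    pairs = Counter()
    for _, _, labels in docs:
        pairs.update(combinations(sorted(labels), 2))
    return pairs


def trend(docs, concept):
    """Trend analysis: per-year share of documents under `concept`."""
    total, hits = Counter(), Counter()
    for _, year, labels in docs:
        total[year] += 1
        if any(concept in ancestors(label) for label in labels):
            hits[year] += 1
    return {year: hits[year] / total[year] for year in sorted(total)}
```

Under these assumptions, `concept_distribution(DOCS)` summarizes the collection at any level of the hierarchy, `cooccurrence(DOCS)` surfaces concept pairs that appear together, and `trend(DOCS, "trade")` compares a concept's share of documents across years. Real systems would obtain the labels from a text categorization or term extraction step, which this sketch takes as given.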