Data mining solutions: methods and tools for solving real-world problems
Document Warehousing and Text Mining: Techniques for Improving Business Operations, Marketing, and Sales
Data Mining: Technologies, Techniques, Tools, and Trends
Data Warehousing, Data Mining, and OLAP
Database Systems Design, Implementation and Management
Principles of Information Systems: A Managerial Approach
Extraction and representation of contextual information for knowledge discovery in texts
Information Sciences—Informatics and Computer Science: An International Journal
Text analysis and knowledge mining system
IBM Systems Journal
ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
Data Mining: Concepts and Techniques
Automated text summarization and the SUMMARIST system
TIPSTER '98 Proceedings of a workshop on held at Baltimore, Maryland: October 13-15, 1998
Toward total business intelligence incorporating structured and unstructured data
Proceedings of the 2nd International Workshop on Business intelligencE and the WEB
When a new discipline emerges, it usually takes some time and much academic discussion before its concepts and terms are standardised. Text mining is one such discipline. In a groundbreaking paper, "Untangling text data mining", Hearst [1999] tackled the problem of clarifying text-mining concepts and terminology. This essay, a conceptual study, aims to build on Hearst's ideas by pointing out some inconsistencies and suggesting an improved and extended categorisation of data- and text-mining techniques. It first gives a short overview of the problems regarding text-mining concepts, followed by a summary and critical discussion of Hearst's attempt to clarify the terminology. The essence of text mining is found to be the discovery or creation of new knowledge from a collection of documents. The parameters of non-novel, semi-novel and novel investigation are used to differentiate between full-text information retrieval, standard text mining and intelligent text mining. The same parameters are also used to differentiate between related processes for numerical data and text metadata. These distinctions may be used as a road map in the evolving fields of data/information retrieval, knowledge discovery and the creation of new knowledge.