Traditionally, texts have been analysed using various information retrieval methods, such as full-text analysis and natural language processing. However, only a few examples of data mining in text, particularly in full text, are available. In this paper we show that general data mining methods are applicable to text analysis tasks such as descriptive phrase extraction. Moreover, we present a general framework for text mining. The framework follows the general knowledge discovery process, thus containing steps from preprocessing to the utilization of the results. The data mining method that we apply is based on generalized episodes and episode rules. We give concrete examples of how to preprocess texts based on the intended use of the discovered results, and we introduce a weighting scheme that helps in pruning out redundant or non-descriptive phrases. We also present results from real-life data experiments.
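To make the idea of episodes and episode rules more concrete, the following is a minimal sketch and not the method of the paper. It assumes a heavily simplified setting: an "episode" is just an ordered pair of words in which the second word follows the first within a fixed window, and a "rule" is such a pair whose confidence (pair count divided by the count of the first word) passes a threshold. The function names `mine_pair_episodes` and `pair_rules`, the parameters `window`, `min_support`, and `min_confidence`, and the sample sentence are all hypothetical; generalized episodes as used in the paper are richer than this.

```python
from collections import Counter


def mine_pair_episodes(tokens, window=3, min_support=2):
    """Count ordered word pairs (a, b) where b follows a within `window` tokens.

    Simplifying assumption: an episode is only a two-word ordered pair.
    Each distinct follower is counted at most once per start position.
    """
    pair_counts = Counter()
    for i, first in enumerate(tokens):
        seen = set()
        # The window covers position i plus the next (window - 1) tokens.
        for second in tokens[i + 1 : i + window]:
            if second not in seen:
                pair_counts[(first, second)] += 1
                seen.add(second)
    # Keep only pairs that occur often enough.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def pair_rules(tokens, episodes, min_confidence=0.5):
    """Turn frequent pairs into rules 'first => second' with a confidence score."""
    word_counts = Counter(tokens)
    rules = []
    for (first, second), n in episodes.items():
        # Fraction of occurrences of `first` that are followed by `second`.
        confidence = n / word_counts[first]
        if confidence >= min_confidence:
            rules.append((first, second, n, confidence))
    # Highest confidence first, then highest support, then alphabetical.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


if __name__ == "__main__":
    # Hypothetical, already preprocessed token sequence.
    tokens = "data mining methods apply data mining to text data mining".split()
    episodes = mine_pair_episodes(tokens, window=2, min_support=2)
    for first, second, support, confidence in pair_rules(tokens, episodes):
        print(f"{first} => {second} (support={support}, confidence={confidence:.2f})")
```

A pruning step in the spirit of the weighting scheme mentioned above could then filter or re-rank these rules, but how the paper's weights are computed is not stated here, so the sketch stops at support and confidence.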