With the overwhelming increase in the amount of text on the web, it is almost impossible for people to keep abreast of up-to-date information. Text mining is a process by which interesting information is derived from text through the discovery of patterns and trends. Text mining algorithms are used to ensure the quality of the extracted knowledge. However, the patterns extracted by text or data mining algorithms are often noisy and inconsistent. Several challenges therefore arise: how to understand these patterns, whether the model used is suitable, and whether all the extracted patterns are relevant. A further question is how to assign a correct weight to the extracted knowledge. To address these issues, this paper presents a text post-processing method that uses a pattern co-occurrence matrix to find the relations between extracted patterns in order to reduce noisy patterns. The main objective of this paper is not only to reduce the number of closed sequential patterns, but also to improve the performance of pattern mining. Experimental results on the Reuters Corpus Volume 1 data collection and TREC filtering topics show that the proposed method is promising.
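The abstract does not spell out how the pattern co-occurrence matrix is built or how it is used to discard noisy patterns, so the following is only a minimal sketch under stated assumptions, not the paper's method. It assumes that a sequential pattern "occurs" in a document when its terms appear in order (not necessarily adjacent), that two patterns co-occur when both occur in the same document, and that a pattern counts as noisy when it co-occurs with fewer than `min_links` other patterns. The function names, the `min_links` threshold, and the toy documents are all hypothetical.

```python
from itertools import combinations


def occurs_in(pattern, tokens):
    """Return True if `pattern` is an ordered (possibly gapped) subsequence of `tokens`."""
    remaining = iter(tokens)
    # `term in remaining` advances the iterator, so term order is enforced.
    return all(term in remaining for term in pattern)


def cooccurrence_matrix(patterns, documents):
    """Count, for each pair of patterns, the documents in which both occur.

    The diagonal holds each pattern's own document frequency.
    """
    n = len(patterns)
    matrix = [[0] * n for _ in range(n)]
    for tokens in documents:
        present = [i for i, p in enumerate(patterns) if occurs_in(p, tokens)]
        for i in present:
            matrix[i][i] += 1
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix


def prune_isolated(patterns, matrix, min_links=1):
    """Keep patterns that co-occur with at least `min_links` other patterns.

    Assumption for illustration only: an isolated pattern is treated as noise.
    """
    kept = []
    for i, pattern in enumerate(patterns):
        links = sum(1 for j, count in enumerate(matrix[i]) if j != i and count > 0)
        if links >= min_links:
            kept.append(pattern)
    return kept


# Toy data, invented for this sketch.
documents = [
    ["oil", "price", "rise", "market"],
    ["oil", "market", "price"],
    ["football", "match", "score"],
]
patterns = [("oil", "price"), ("oil", "market"), ("match", "score"), ("price", "rise")]

matrix = cooccurrence_matrix(patterns, documents)
kept = prune_isolated(patterns, matrix)
```

In this toy run, the pattern `("match", "score")` never appears alongside any other pattern and is the only one dropped. A real post-processing step would likely use weighted or normalized co-occurrence rather than a raw link count.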