Pattern mining arose from the need to discover hidden knowledge in very large amounts of data, regardless of the form in which the data is presented. Natural Language Processing (NLP), in turn, arose from the human need to be understood by computers. In this paper we present an exploratory approach that aims to bring together the best of both worlds. Our goal is to discover patterns in linguistically processed texts by combining state-of-the-art NLP tools with traditional pattern mining algorithms. Articles from a Portuguese newspaper serve as input to the series of tests described in this paper. First, the articles are processed by an NLP chain, which performs a deep linguistic analysis of the text; afterwards, the pattern mining algorithms Apriori and GenPrefixSpan are applied. The results show that sequential pattern mining techniques are applicable to structured textual data, and they also provide evidence about the structure of the language.
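
To make the second stage of the pipeline concrete, the following is a minimal sketch of sequential pattern mining over linguistically annotated sentences. It is an illustration under stated assumptions, not the implementation used in the paper: it implements plain PrefixSpan for sequences of single items, not the GenPrefixSpan variant named above; the part-of-speech tag sequences are invented toy data standing in for the output of an NLP chain; and the names `prefixspan`, `sentences`, and `min_support` are hypothetical.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every subsequence (gaps allowed)
    that occurs in at least `min_support` of the input sequences."""
    results = {}

    def mine(prefix, projected):
        # `projected` holds (sequence index, start offset) pairs: the suffix
        # of each sequence that remains after matching `prefix`.
        supporters = defaultdict(set)
        for seq_id, start in projected:
            for item in set(sequences[seq_id][start:]):
                supporters[item].add(seq_id)

        for item, seq_ids in sorted(supporters.items()):
            if len(seq_ids) < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(seq_ids)

            # Project each suffix past the first occurrence of `item`.
            next_projected = []
            for seq_id, start in projected:
                try:
                    pos = sequences[seq_id].index(item, start)
                except ValueError:
                    continue  # this sequence does not extend the pattern
                next_projected.append((seq_id, pos + 1))
            mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


# Toy data (assumption): one part-of-speech tag sequence per sentence.
sentences = [
    ["DET", "NOUN", "VERB", "DET", "NOUN"],
    ["DET", "ADJ", "NOUN", "VERB"],
    ["NOUN", "VERB", "ADV"],
]

patterns = prefixspan(sentences, min_support=3)
for pattern, support in sorted(patterns.items()):
    print(" -> ".join(pattern), support)
```

Lowering `min_support` to 2 in this toy setting also surfaces longer patterns such as DET, NOUN, VERB, which is the kind of regularity in sentence structure that the abstract refers to.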