We present a framework that bridges the gap between natural language processing (NLP) and text mining. Central to the framework is a new approach to text parameterization that captures many attributes of text usually ignored by standard indices such as the term-document matrix. Because it stores NLP tags, the new index supports a higher degree of knowledge discovery and pattern finding from text. The index is relatively compact, which enables dynamic search for arbitrary relationships and events in large document collections. Search results can be exported in formats and data structures that are transparent to statistical analysis tools such as S-PLUS®. In a number of experiments, we demonstrate how this framework can turn mountains of unstructured information into informative statistical graphs.
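To make the idea of an index that stores NLP tags alongside terms more concrete, here is a minimal Python sketch. It is not the framework's actual data structure, which the abstract does not specify. The toy documents, the tag sets, and the names `build_index`, `same_sentence`, and `to_csv` are all assumptions introduced for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, hand-tagged input (an assumption, not data from the paper):
# doc_id -> list of sentences, each a list of (token, pos_tag, entity_tag).
DOCS = {
    "d1": [[("IBM", "NNP", "ORG"), ("acquired", "VBD", "O"), ("Lotus", "NNP", "ORG")]],
    "d2": [
        [("Lotus", "NNP", "ORG"), ("released", "VBD", "O"), ("Notes", "NNP", "PRODUCT")],
        [("IBM", "NNP", "ORG"), ("grew", "VBD", "O")],
    ],
}


def build_index(docs):
    """Map (attribute, value) keys to postings (doc_id, sentence_no, position).

    Unlike a plain term-document matrix, each token is indexed three times:
    by its lowercased term, by its part-of-speech tag, and by its entity tag.
    """
    index = defaultdict(list)
    for doc_id, sentences in docs.items():
        for s_no, sentence in enumerate(sentences):
            for pos, (token, pos_tag, entity) in enumerate(sentence):
                posting = (doc_id, s_no, pos)
                index[("term", token.lower())].append(posting)
                index[("pos", pos_tag)].append(posting)
                index[("entity", entity)].append(posting)
    return index


def same_sentence(index, key_a, key_b):
    """Return sorted (doc_id, sentence_no) pairs containing postings for both keys."""
    sents_a = {(d, s) for d, s, _ in index.get(key_a, [])}
    sents_b = {(d, s) for d, s, _ in index.get(key_b, [])}
    return sorted(sents_a & sents_b)


def to_csv(rows, header):
    """Serialize result rows as CSV text that a statistics package can read."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    idx = build_index(DOCS)
    # Sentences that mention "IBM" and also contain a past-tense verb.
    hits = same_sentence(idx, ("term", "ibm"), ("pos", "VBD"))
    print(to_csv(hits, ["doc_id", "sentence_no"]))
```

Keying postings by attribute as well as by term is what lets one query mix words and tags, for example "sentences mentioning IBM that contain a past-tense verb". A term-document matrix cannot answer that query because it keeps only term counts per document. The CSV export stands in for the abstract's point that results should be readable by statistical analysis tools.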