Discovering text patterns by a new graphic model

Authors:
Minhua Huang;Robert M. Haralick
Affiliations:
Computer Science, Graduate Center, City University of New York, New York, NY;Computer Science, Graduate Center, City University of New York, New York, NY
Venue:
MLDM'11 Proceedings of the 7th international conference on Machine learning and data mining in pattern recognition
Year:
2011

Abstract

We propose a probabilistic graphical model for recognizing three types of text patterns in a sentence: noun phrases, the meaning of an ambiguous word, and the semantic arguments of a verb. The model has a unique mathematical expression and graphical representation compared with existing graphical models such as CRFs, HMMs, and MEMMs. In our model, the sequence of optimal categories for a sequence of symbols is determined by finding the optimal category for each symbol independently. Two consequences follow. First, the model does not need dynamic programming, so its on-line time and memory complexity are reduced. Second, the misclassification rate decreases. Experiments on standard data sets show good results. For instance, our method achieves an average precision of 97.7% and an average recall of 98.8% for recognizing noun phrases on WSJ data from the Penn Treebank; an average accuracy of 81.12% for disambiguating the six-sense word 'line'; and an average precision of 92.96% and an average recall of 94.94% for classifying the semantic argument boundaries of a verb in a sentence on WSJ data from the Penn Treebank and PropBank. Performance on each task surpasses or approaches the state of the art.
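
The per-symbol decoding described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the function name `decode_independently`, the B/I/O tag set, and the toy scores are all assumptions made for illustration, and the scores stand in for whatever per-symbol quantity the model actually computes. The sketch only shows why no dynamic programming is needed once each position can be scored on its own.

```python
from typing import Dict, List, Sequence


def decode_independently(scores: Sequence[Dict[str, float]]) -> List[str]:
    """Pick the highest-scoring category for each position on its own.

    `scores[i]` maps each candidate category to a score for symbol i.
    No transition table is kept, so the work is one pass over the sequence.
    """
    return [max(position_scores, key=position_scores.get)
            for position_scores in scores]


# Hypothetical per-token scores for a four-token sentence,
# using B/I/O noun-phrase tags (assumed, not taken from the paper).
toy_scores = [
    {"B": 0.90, "I": 0.05, "O": 0.05},
    {"B": 0.10, "I": 0.80, "O": 0.10},
    {"B": 0.05, "I": 0.85, "O": 0.10},
    {"B": 0.05, "I": 0.05, "O": 0.90},
]

labels = decode_independently(toy_scores)
print(labels)  # ['B', 'I', 'I', 'O']
```

For n symbols and k categories this pass costs on the order of n·k score comparisons, whereas Viterbi decoding over a first-order chain costs on the order of n·k² and must store backpointers for every position.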