On the limited memory BFGS method for large scale optimization
Mathematical Programming: Series A and B
A maximum entropy approach to natural language processing
Computational Linguistics
Inducing Features of Random Fields
IEEE Transactions on Pattern Analysis and Machine Intelligence
A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual Europe an conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals
RECOMB '03 Proceedings of the seventh annual international conference on Research in computational molecular biology
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Maximum Entropy Markov Models for Information Extraction and Segmentation
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Table extraction using conditional random fields
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2
Training conditional random fields via gradient tree boosting
ICML '04 Proceedings of the twenty-first international conference on Machine learning
Chunking with support vector machines
NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Shallow parsing with conditional random fields
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A comparison of algorithms for maximum entropy parameter estimation
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Chinese segmentation and new word detection using conditional random fields
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Interactive information extraction with constrained conditional random fields
AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
Multiscale conditional random fields for image labeling
CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
Efficiently inducing features of conditional random fields
UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Improving discriminative sequential learning by discovering important association of statistics
ACM Transactions on Asian Language Information Processing (TALIP)
Twain: Two-end association miner with precise frequent exhibition periods
ACM Transactions on Knowledge Discovery from Data (TKDD)
Hi-index | 0.00 |
Discriminative sequential learning models like Conditional Random Fields (CRFs) have achieved significant success in several areas such as natural language processing or information extraction. Their key advantage is the ability to capture various non--independent and overlapping features of inputs. However, several unexpected pitfalls have a negative influence on the model's performance; these mainly come from an imbalance among classes/labels, irregular phenomena, and potential ambiguity in the training data. This paper presents a data--driven approach that can deal with such hard--to--predict data instances by discovering and emphasizing rare--but--important associations of statistics hidden in the training data. Mined associations are then incorporated into these models to deal with difficult examples. Experimental results of English phrase chunking and named entity recognition using CRFs show a significant improvement in accuracy. In addition to the technical perspective, our approach also highlights a potential connection between association mining and statistical learning by offering an alternative strategy to enhance learning performance with interesting and useful patterns discovered from large dataset.