Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data

Authors:
Charles Sutton; Andrew McCallum; Khashayar Rohanimanesh
Venue:
The Journal of Machine Learning Research
Year:
2007

Citing 38
Cited 42

A model for reasoning about persistence and causation

Computational Intelligence
Original Contribution: Stacked generalization

Neural Networks
Some advances in transformation-based part of speech tagging

AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)
A maximum entropy approach to natural language processing

Computational Linguistics
Factorial Hidden Markov Models

Machine Learning - Special issue on learning with probabilistic representations
Learning Information Extraction Rules for Semi-Structured and Free Text

Machine Learning - Special issue on natural language learning
Foundations of statistical natural language processing

Foundations of statistical natural language processing
Relational learning of pattern-match rules for information extraction

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
The Hierarchical Hidden Markov Model: Analysis and Applications

Machine Learning
Training products of experts by minimizing contrastive divergence

Neural Computation
Factorial Markov Random Fields

ECCV '02 Proceedings of the 7th European Conference on Computer Vision-Part III
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data

ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Maximum Entropy Markov Models for Information Extraction and Segmentation

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Table extraction using conditional random fields

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning for information extraction in informal domains

Machine learning for information extraction in informal domains
A family of algorithms for approximate bayesian inference

A family of algorithms for approximate bayesian inference
Stochastic processes on graphs with cycles: geometric and variational approaches

Stochastic processes on graphs with cycles: geometric and variational approaches
Dynamic bayesian networks: representation, inference and learning

Dynamic bayesian networks: representation, inference and learning
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Chunking with support vector machines

NAACL '01 Proceedings of the second meeting of the North American Chapter of the Association for Computational Linguistics on Language technologies
Shallow parsing with conditional random fields

NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Learning to extract information from semi-structured text using a discriminative context free grammar

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Introduction to the CoNLL-2000 shared task: chunking

CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A comparison of algorithms for maximum entropy parameter estimation

COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Accelerated training of conditional random fields with stochastic gradient methods

ICML '06 Proceedings of the 23rd international conference on Machine learning
Online large-margin training of dependency parsers

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Composition of conditional random fields for transfer learning

HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
Policy recognition in the abstract hidden Markov model

Journal of Artificial Intelligence Research
Bayesian information extraction network

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Hierarchical hidden Markov models for information extraction

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Adaptive information extraction from text by rule induction and generalisation

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Relational learning via propositional algorithms: an information extraction case study

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2
Multiscale conditional random fields for image labeling

CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
Loopy belief propagation for approximate inference: an empirical study

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Discriminative probabilistic models for relational data

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
Constructing free-energy approximations and generalized belief propagation algorithms

IEEE Transactions on Information Theory

A unified architecture for natural language processing: deep neural networks with multitask learning

Proceedings of the 25th international conference on Machine learning
Real world activity recognition with multiple goals

UbiComp '08 Proceedings of the 10th international conference on Ubiquitous computing
An algorithm for analyzing personalized online commercial intention

Proceedings of the 2nd International Workshop on Data Mining and Audience Intelligence for Advertising
Keyword query cleaning using hidden Markov models

Proceedings of the First International Workshop on Keyword Search on Structured Data
Learning-based named entity recognition for morphologically-rich, resource-scarce languages

EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
CIGAR: concurrent and interleaving goal and activity recognition

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Feature selection for activity recognition in multi-robot domains

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Chatting activity recognition in social occasions using factorial conditional random fields with iterative classification

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3
Gesture salience as a hidden variable for coreference resolution and keyframe extraction

Journal of Artificial Intelligence Research
Review: The use of pervasive sensing for behaviour profiling - a survey

Pervasive and Mobile Computing
Probabilistic models for concurrent chatting activity recognition

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
Selecting features of linear-chain conditional random fields via greedy stage-wise algorithms

Pattern Recognition Letters
Piecewise training for structured prediction

Machine Learning
Using Conditional Random Fields for Decision-Theoretic Planning

MDAI '09 Proceedings of the 6th International Conference on Modeling Decisions for Artificial Intelligence
A Study of Parts-Based Object Class Detection Using Complete Graphs

International Journal of Computer Vision
Object relevance weight pattern mining for activity recognition and segmentation

Pervasive and Mobile Computing
Real-time activity classification using ambient and wearable sensors

IEEE Transactions on Information Technology in Biomedicine - Special section on body sensor networks
CarpeDiem: Optimizing the Viterbi Algorithm and Applications to Supervised Sequential Learning

The Journal of Machine Learning Research
A discriminative model corresponding to hierarchical HMMs

IDEAL'07 Proceedings of the 8th international conference on Intelligent data engineering and automated learning
Sports video segmentation using a hierarchical hidden CRF

ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I
Advances in view-invariant human motion analysis: a review

IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
Decision detection using hierarchical graphical models

ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Turbo parsers: dependency parsing by approximate variational inference

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Better punctuation prediction with dynamic conditional random fields

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Natural language querying over databases using cascaded CRFs

ADBIS'10 Proceedings of the 14th east European conference on Advances in databases and information systems
Probabilistic models for concurrent chatting activity recognition

ACM Transactions on Intelligent Systems and Technology (TIST)
Learning the behavior model of a robot

Autonomous Robots
Jointly identifying entities and extracting relations in encyclopedia text via a graphical model approach

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Multi-dimensional classification with Bayesian networks

International Journal of Approximate Reasoning
Recognizing multi-user activities using wearable sensors in a smart home

Pervasive and Mobile Computing
Language models as representations for weakly-supervised NLP tasks

CoNLL '11 Proceedings of the Fifteenth Conference on Computational Natural Language Learning
Towards a top-down and bottom-up bidirectional approach to joint information extraction

Proceedings of the 20th ACM international conference on Information and knowledge management
Natural Language Processing (Almost) from Scratch

The Journal of Machine Learning Research
Self-supervised capturing of users' activities from weblogs

International Journal of Intelligent Information and Database Systems
Multilayer sequence labeling

EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Energy-Aware agents for detecting nonessential appliances

PRIMA'10 Proceedings of the 13th international conference on Principles and Practice of Multi-Agent Systems
Probabilistic models for focused web crawling

Computational Intelligence
Minimum-risk training of approximate CRF-based NLP systems

NAACL HLT '12 Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
A novel discriminative framework for sentence-level discourse analysis

EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Joint bilingual name tagging for parallel corpora

Proceedings of the 21st ACM international conference on Information and knowledge management
An inference-based model of word meaning in context as a paraphrase distribution

ACM Transactions on Intelligent Systems and Technology (TIST) - Special Sections on Paraphrasing; Intelligent Systems for Socially Aware Computing; Social Computing, Behavioral-Cultural Modeling, and Prediction
Maximum-entropy word alignment and posterior-based phrase extraction for machine translation

Machine Translation

Abstract

In sequence modeling, we often wish to represent complex interactions between labels, such as when performing multiple, cascaded labeling tasks on the same sequence, or when long-range dependencies exist. We present dynamic conditional random fields (DCRFs), a generalization of linear-chain conditional random fields (CRFs) in which each time slice contains a set of state variables and edges, a distributed state representation as in dynamic Bayesian networks (DBNs), and in which parameters are tied across slices. Since exact inference can be intractable in such models, we perform approximate inference using several schedules for belief propagation, including tree-based reparameterization (TRP). On a natural-language chunking task, we show that a DCRF performs better than a series of linear-chain CRFs, achieving comparable performance using only half the training data. In addition to maximum conditional likelihood, we present two alternative approaches for training DCRFs: marginal likelihood training, for when we are primarily interested in predicting only a subset of the variables, and cascaded training, for when we have a distinct data set for each state variable, as in transfer learning. We evaluate marginal training and cascaded training on both synthetic data and real-world text data, finding that marginal training can improve accuracy when uncertainty exists over the latent variables, and that for transfer learning, a DCRF trained in a cascaded fashion performs better than a linear-chain CRF that predicts the final task directly.
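
To make the model structure in the abstract concrete, the following is a minimal sketch of a two-chain (factorial) conditional model with parameters tied across time slices. It is not the authors' implementation: the factor names, the weight values, the scalar inputs, and the two-label alphabets are all hypothetical choices made for illustration. It also computes the conditional distribution by brute-force enumeration, which is only feasible for very short sequences; the paper instead relies on approximate inference such as belief propagation.

```python
import itertools
import math

# Hypothetical toy setup: two label chains (think of a tagging chain A and a
# chunking chain B), each with two labels, over a short scalar input sequence.
LABELS_A = [0, 1]
LABELS_B = [0, 1]


def log_score(a, b, x, w):
    """Unnormalized log-score of the joint labeling (a, b) given inputs x.

    The same weights w are reused at every time step, which is what
    "parameters are tied across slices" means in this sketch.
    """
    total = 0.0
    for t in range(len(x)):
        # Within-slice edge connecting the two chains at time t.
        total += w["cotemporal"][a[t]][b[t]]
        # Observation factors: each label interacts with the input at time t.
        total += w["obs_a"][a[t]] * x[t]
        total += w["obs_b"][b[t]] * x[t]
        if t > 0:
            # Between-slice edges inside each chain.
            total += w["trans_a"][a[t - 1]][a[t]]
            total += w["trans_b"][b[t - 1]][b[t]]
    return total


def conditional_distribution(x, w):
    """Exact p(a, b | x) by enumerating every joint labeling (tiny T only)."""
    T = len(x)
    scores = {}
    for a in itertools.product(LABELS_A, repeat=T):
        for b in itertools.product(LABELS_B, repeat=T):
            scores[(a, b)] = log_score(a, b, x, w)
    # Log-sum-exp for a numerically stable normalizer.
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return {y: math.exp(s - log_z) for y, s in scores.items()}


def marginal_b(dist, t):
    """Marginal of chain B's label at time t, summing out everything else."""
    out = {label: 0.0 for label in LABELS_B}
    for (_, b), p in dist.items():
        out[b[t]] += p
    return out


# Hypothetical weights, chosen by hand rather than learned.
weights = {
    "cotemporal": [[0.5, -0.5], [-0.5, 0.5]],
    "obs_a": [-1.0, 1.0],
    "obs_b": [-0.5, 0.5],
    "trans_a": [[0.2, -0.2], [-0.2, 0.2]],
    "trans_b": [[0.3, -0.3], [-0.3, 0.3]],
}
x = [1.0, -1.0, 1.0]
dist = conditional_distribution(x, weights)
print(marginal_b(dist, 0))
```

The number of joint labelings grows exponentially with sequence length and with the number of chains, which is why this enumeration does not scale and approximate inference becomes necessary. The function `marginal_b` sums out chain A, which loosely mirrors the situation the abstract describes for marginal likelihood training, where only a subset of the variables is of primary interest.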