Transducing Markov sequences

Authors:
Benny Kimelfeld;Christopher Ré
Affiliations:
IBM Research - Almaden, San Jose, CA, USA;University of Wisconsin-Madison, Madison, WI, USA
Venue:
Proceedings of the twenty-ninth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Year:
2010

Abstract

A Markov sequence is a basic statistical model representing uncertain sequential data, and it is used in many applications, including speech recognition, image processing, computational biology, radio-frequency identification (RFID), and information extraction. The problem of querying a Markov sequence is studied under the conventional semantics of querying a probabilistic database, where queries are formulated as finite-state transducers. Specifically, the complexity of two main problems is analyzed. The first problem is that of computing the confidence (probability) of an answer. The second is the enumeration of the answers in order of decreasing confidence (with the generation of the top-k answers as a special case), or in an approximation of that order. In particular, it is shown that enumeration in any sub-exponential-approximate order is generally intractable (even for some fixed transducers), and a matching upper bound is obtained through a proposed heuristic. Due to this hardness, special consideration is given to restricted (yet common) classes of transducers that extract matches of a regular expression (subject to prefix and suffix constraints), and it is shown that these classes are, indeed, significantly more tractable.
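To make the notion of confidence concrete, the following is a minimal sketch of a much simpler special case than the one studied here: the probability that a string drawn from a Markov sequence is accepted by a deterministic finite automaton (a query with a yes/no answer rather than a transducer with output). It is not the paper's algorithm. The function name, the dictionary-based encoding of the Markov sequence, and the example automaton are all assumptions made for illustration.

```python
from collections import defaultdict


def acceptance_probability(initial, steps, dfa_start, dfa_delta, dfa_accepting):
    """Probability that a random string of the Markov sequence is accepted.

    Hypothetical encoding (an assumption of this sketch, not the paper's):
      initial       -- dict: symbol -> probability of the first symbol
      steps         -- list of dicts; steps[i][s][t] is the probability that
                       the next symbol is t given that the current one is s
      dfa_delta     -- dict: (dfa_state, symbol) -> dfa_state
                       (a missing entry means the automaton rejects)
      dfa_accepting -- set of accepting DFA states
    """
    # weight[(s, q)]: total probability of prefixes that end in symbol s
    # and leave the automaton in state q.
    weight = defaultdict(float)
    for s, p in initial.items():
        q = dfa_delta.get((dfa_start, s))
        if q is not None:
            weight[(s, q)] += p

    # Advance one position at a time, pairing the Markov chain's current
    # symbol with the automaton's current state.
    for step in steps:
        nxt = defaultdict(float)
        for (s, q), w in weight.items():
            for t, p in step.get(s, {}).items():
                q2 = dfa_delta.get((q, t))
                if q2 is not None:
                    nxt[(t, q2)] += w * p
        weight = nxt

    return sum(w for (_, q), w in weight.items() if q in dfa_accepting)


# Example: a length-2 sequence over {a, b}; the automaton accepts strings
# that contain at least one 'b'.
example_initial = {"a": 0.5, "b": 0.5}
example_steps = [{"a": {"a": 0.5, "b": 0.5}, "b": {"a": 1.0}}]
example_delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 1, (1, "b"): 1}

example_result = acceptance_probability(
    example_initial, example_steps, 0, example_delta, {1}
)
print(example_result)
```

In this example the possible strings are aa, ab, and ba with probabilities 0.25, 0.25, and 0.5, so the accepted mass is 0.75. The sketch runs in time polynomial in the sequence length, the alphabet size, and the number of automaton states; the harder cases analyzed in the paper arise when answers are output strings of a transducer rather than a single accept/reject outcome.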