In this paper, we take a probabilistic approach to the problem of keyword query cleaning for structured databases. Keyword query cleaning consists of rewriting the user query, segmenting the keywords, matching each segment to database items, and finally tagging the segments with their metadata. We present an efficient and robust solution based on Hidden Markov Models (HMMs). We model user keyword queries with a generative HMM-based model and construct an HMM from the user-specified keyword query and the database instance. The optimal statistical cleaning of the query is then computed as the most likely path through the constructed HMM. Furthermore, we show how the optimal HMM-based cleaning algorithm generalizes to produce a stream of clean queries, ranked from most likely to least likely. Finally, we present an implementation of the proposed system and its preliminary performance.
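
To make the "most likely path" step concrete, the following is a minimal sketch of the standard Viterbi algorithm applied to a toy tagging problem. It is not the paper's model: the states (`title`, `author`), the three-token query, and every probability below are hypothetical values chosen for illustration, and the rewriting and segmentation steps are omitted entirely.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log_prob, path) for the most likely state sequence."""

    def log(p):
        # Treat zero probability as log(0) = -inf instead of raising.
        return math.log(p) if p > 0 else float("-inf")

    # best[s] holds (log-probability, path) of the best path ending in s.
    first = observations[0]
    best = {
        s: (log(start_p[s]) + log(emit_p[s].get(first, 0.0)), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the best predecessor state r for a transition into s.
            score, path = max(
                ((best[r][0] + log(trans_p[r][s]), best[r][1]) for r in states),
                key=lambda t: t[0],
            )
            new_best[s] = (score + log(emit_p[s].get(obs, 0.0)), path + [s])
        best = new_best
    return max(best.values(), key=lambda t: t[0])


# Hypothetical toy model: tag each query token as a title or author term.
states = ["title", "author"]
start_p = {"title": 0.6, "author": 0.4}
trans_p = {
    "title": {"title": 0.7, "author": 0.3},
    "author": {"title": 0.4, "author": 0.6},
}
emit_p = {
    "title": {"markov": 0.5, "models": 0.4, "rabiner": 0.1},
    "author": {"markov": 0.2, "models": 0.05, "rabiner": 0.75},
}
query = ["markov", "models", "rabiner"]

log_prob, path = viterbi(query, states, start_p, trans_p, emit_p)
print(path)  # ['title', 'title', 'author']
```

Working in log space avoids numerical underflow on longer queries. The ranked stream of clean queries described in the abstract would require keeping several candidate paths per state rather than only the best one, which this sketch does not attempt.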