Classical bag-of-words models for information retrieval (IR) fail to capture contextual associations between words. In this article, we investigate pure high-order dependence among words that form an inseparable semantic entity, that is, high-order dependence that cannot be reduced to the random coincidence of lower-order dependencies. We believe that identifying these pure high-order dependence patterns would lead to a better representation of documents and to novel retrieval models. Specifically, we give two formal definitions of pure dependence: unconditional pure dependence (UPD) and conditional pure dependence (CPD). Deciding UPD and CPD exactly, however, is NP-hard in general. We therefore derive and prove sufficient criteria that entail UPD and CPD within the well-principled information geometry (IG) framework, leading to a more feasible UPD/CPD identification procedure. We further develop novel methods for extracting word patterns with pure high-order dependence. Our methods are applied to and extensively evaluated on three typical IR tasks: text classification, text retrieval without query expansion, and text retrieval with query expansion.
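
To make the idea of dependence that "cannot be reduced to lower-order dependencies" concrete, the following minimal sketch computes the third-order interaction parameter of the standard log-linear expansion for three binary variables, the usual theta coordinate in information geometry. It is an illustration only: the abstract does not state the UPD/CPD criteria, so this is not the article's procedure. The function names and both toy distributions are assumptions made for this example.

```python
import math
from itertools import product


def third_order_interaction(p):
    """Third-order log-linear interaction for three binary variables.

    `p` maps each (x1, x2, x3) in {0, 1}^3 to a strictly positive probability.
    A value of 0 means the joint distribution has no third-order interaction
    beyond what the lower-order terms already explain.
    """
    num = p[(1, 1, 1)] * p[(1, 0, 0)] * p[(0, 1, 0)] * p[(0, 0, 1)]
    den = p[(1, 1, 0)] * p[(1, 0, 1)] * p[(0, 1, 1)] * p[(0, 0, 0)]
    return math.log(num / den)


def pair_marginal(p, i, j):
    """P(x_i = 1, x_j = 1), obtained by summing out the remaining variable."""
    return sum(prob for x, prob in p.items() if x[i] == 1 and x[j] == 1)


# Toy case 1 (hypothetical): three fully independent word-occurrence indicators.
marginals = (0.2, 0.5, 0.7)
independent = {
    x: math.prod(m if bit else 1 - m for bit, m in zip(x, marginals))
    for x in product((0, 1), repeat=3)
}

# Toy case 2 (hypothetical): noisy parity. Every pair of variables is
# independent (each pair marginal is 0.25 = 0.5 * 0.5), yet the triple is
# strongly dependent, so the dependence lives only at the third order.
noisy_parity = {
    x: 0.22 if x[2] == (x[0] ^ x[1]) else 0.03
    for x in product((0, 1), repeat=3)
}

theta_independent = third_order_interaction(independent)  # close to 0
theta_parity = third_order_interaction(noisy_parity)      # strongly negative

print(theta_independent, theta_parity)
print(pair_marginal(noisy_parity, 0, 1))
```

In the noisy-parity case no pairwise statistic reveals any association, while the third-order parameter is far from zero. This is the kind of pattern that a purely pairwise dependence model would miss.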