In probabilistic approaches to classification and information extraction, one typically builds a statistical model of words under the assumption that future data will exhibit the same regularities as the training data. In many data sets, however, there are scope-limited features whose predictive power applies only to a certain subset of the data. For example, in information extraction from web pages, word formatting may be indicative of extraction category in different ways on different web pages. The difficulty with using such features is capturing and exploiting the new regularities encountered in previously unseen data. In this paper, we propose a hierarchical probabilistic model that uses both local, scope-limited features, such as word formatting, and global features, such as word content. The local regularities are modeled as an unobserved random parameter that is drawn once for each local data set. This random parameter is estimated during inference and then used to perform classification with both the local and global features, a procedure akin to automatically retuning the classifier to the local regularities of each newly encountered web page. Exact inference is intractable, and we present approximations via point estimates and variational methods. Empirical results on large collections of web data demonstrate that this method significantly improves performance over traditional models that use global features alone.
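The per-page retuning described above can be sketched with a simple EM-style point estimate, one of the two approximations the abstract mentions. The sketch below is illustrative, not the paper's exact algorithm: the function name `retune_and_classify`, the use of a categorical formatting feature, and the Dirichlet smoothing constant are all assumptions. A global content model supplies per-token class log-probabilities; the page-local parameter `theta` (the unobserved random parameter drawn once per page, here reduced to a point estimate) captures how formatting correlates with class on this particular page, and is re-estimated from the page's own soft labels before the final classification.

```python
import numpy as np

def retune_and_classify(global_logp, fmt, n_classes, n_formats,
                        n_iter=10, alpha=0.5):
    """Retune to one page's local regularities via an EM point estimate.

    global_logp: (n_tokens, n_classes) log P(token content | class)
                 from a globally trained model (assumed given).
    fmt:         (n_tokens,) integer formatting feature per token
                 (e.g., 0 = plain, 1 = bold); its meaning may differ
                 from page to page, so it is modeled locally.
    """
    # Initial E-step: posteriors from the global content model alone.
    post = np.exp(global_logp - global_logp.max(axis=1, keepdims=True))
    post /= post.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # M-step: point estimate of the page-local parameter
        # theta[c, f] = P(format f | class c), Dirichlet-smoothed by alpha.
        theta = np.full((n_classes, n_formats), alpha)
        for i, f in enumerate(fmt):
            theta[:, f] += post[i]
        theta /= theta.sum(axis=1, keepdims=True)
        # E-step: combine global content evidence with local format evidence.
        logp = global_logp + np.log(theta[:, fmt]).T
        post = np.exp(logp - logp.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
    return post.argmax(axis=1), post

# Toy page: formatting perfectly separates the two classes on this page,
# while the content model is confident for six tokens but wrong on two.
global_logp = np.log(np.array([
    [0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.4, 0.6],
    [0.1, 0.9], [0.1, 0.9], [0.1, 0.9], [0.6, 0.4]]))
fmt = np.array([0, 0, 0, 0, 1, 1, 1, 1])
preds, _ = retune_and_classify(global_logp, fmt, n_classes=2, n_formats=2)
```

On this toy page the global model alone misclassifies tokens 3 and 7, but after the local format parameter is estimated from the page itself, both are pulled back to the class their formatting indicates. The full model instead treats `theta` as a random variable with a prior and also offers a variational approximation rather than only this point estimate.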