We study the problem of simultaneously estimating several densities where the datasets are organized into overlapping groups, such as a hierarchy. For this problem, we propose a maximum entropy formulation, which systematically incorporates the groups and allows us to share the strength of prediction across similar datasets. We derive general performance guarantees, and show how some previous approaches, such as hierarchical shrinkage and hierarchical priors, can be derived as special cases. We demonstrate the proposed technique on synthetic data and in a real-world application to modeling the geographic distributions of species hierarchically grouped in a taxonomy. Specifically, we model the geographic distributions of species in the Australian wet tropics and Northeast New South Wales. In these regions, small numbers of samples per species significantly hinder effective prediction. Substantial benefits are obtained by combining information across taxonomic groups.
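The abstract names hierarchical shrinkage as one special case of the proposed formulation. As a rough illustration of why pooling across a group helps when a dataset has very few samples, the following minimal sketch shrinks each dataset's empirical distribution toward the pooled distribution of its group. It is not the maximum entropy method of the paper: the function names, the pseudo-count `k`, the weight `n / (n + k)`, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def empirical(samples, cells):
    """Empirical distribution of `samples` over a fixed list of cells."""
    counts = Counter(samples)
    n = len(samples)
    return {c: counts[c] / n for c in cells}


def shrink_to_group(datasets, cells, k=5.0):
    """Shrink each dataset's empirical distribution toward the group's pooled one.

    `k` is a hypothetical pseudo-count: a dataset with n samples keeps weight
    n / (n + k) on its own estimate and puts the rest on the pooled estimate.
    """
    pooled = [s for samples in datasets.values() for s in samples]
    if not pooled:
        raise ValueError("at least one sample is required across the group")
    group = empirical(pooled, cells)

    estimates = {}
    for name, samples in datasets.items():
        n = len(samples)
        w = n / (n + k)
        # A dataset with no samples falls back entirely on the group estimate.
        own = empirical(samples, cells) if n else {c: 0.0 for c in cells}
        estimates[name] = {c: w * own[c] + (1 - w) * group[c] for c in cells}
    return estimates


# Toy data: two "species" in one taxonomic group, observed over three map cells.
cells = ["a", "b", "c"]
datasets = {
    "species_1": ["a", "a", "b", "a", "a", "b", "a", "a", "a", "b"],
    "species_2": ["c"],  # a single sample: the raw estimate puts all mass on "c"
}
est = shrink_to_group(datasets, cells, k=5.0)
```

With a single observation, the unshrunk estimate for `species_2` would assign zero probability to cells `a` and `b`. The shrunk estimate borrows mass from the better-sampled `species_1`, which is the kind of information sharing across groups that the abstract describes.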