When classification becomes a problem: using branch-and-bound to improve classification efficiency

Authors:
Armand Prieditis;Moontae Lee
Affiliations:
Neustar Labs, Mountain View, CA;Neustar Labs, Mountain View, CA
Venue:
MLDM'13 Proceedings of the 9th international conference on Machine Learning and Data Mining in Pattern Recognition
Year:
2013

Abstract

A typical machine learning classification task has two phases: training and prediction. This paper focuses on improving the efficiency of the prediction phase. When the number of classes is small, a linear search over the classes is an efficient way to find the most likely class. When the number of classes is large, however, linear search becomes inefficient. For example, applications such as geolocation or time-based classification might require millions of subclasses to fit the data. This paper describes a branch-and-bound method that searches for the most likely class when the training examples can be partitioned into thousands of subclasses. To gauge the performance of branch-and-bound classification, we generated a synthetic set of random trees comprising billions of classes and evaluated the method on them. Our results show that branch-and-bound classification is effective when the number of classes is large; specifically, it improves search efficiency logarithmically.
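
The abstract does not give the tree structure, scoring function, or bound used in the paper, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each class is a single 1-D centroid, the score of a class is the negative squared distance to the query, and each tree node stores the interval covering its centroids, so the score at the nearest point of that interval is an upper bound for every class beneath it. The names `Node`, `build`, `upper_bound`, and `classify` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    lo: float                      # smallest centroid under this node
    hi: float                      # largest centroid under this node
    label: Optional[int] = None    # set only on leaves (one class per leaf)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(centroids: List[Tuple[float, int]]) -> Node:
    """Build a balanced binary tree over (centroid, label) pairs sorted by centroid."""
    if len(centroids) == 1:
        c, label = centroids[0]
        return Node(lo=c, hi=c, label=label)
    mid = len(centroids) // 2
    left, right = build(centroids[:mid]), build(centroids[mid:])
    return Node(lo=left.lo, hi=right.hi, left=left, right=right)


def upper_bound(node: Node, x: float) -> float:
    """Assumed bound: minus the squared distance from x to the interval [lo, hi].

    No class under `node` can score higher; on a leaf it is the exact score.
    """
    gap = max(node.lo - x, 0.0, x - node.hi)
    return -gap * gap


def classify(root: Node, x: float) -> Tuple[int, int]:
    """Depth-first branch-and-bound; returns (best label, number of leaves scored)."""
    best_score, best_label, scored = float("-inf"), -1, 0
    stack = [root]
    while stack:
        node = stack.pop()
        if upper_bound(node, x) <= best_score:
            continue  # prune: nothing below this node can beat the current best
        if node.label is not None:
            scored += 1
            best_score, best_label = upper_bound(node, x), node.label
            continue
        # Push the less promising child first so the more promising one is popped next.
        stack.extend(sorted((node.left, node.right), key=lambda n: upper_bound(n, x)))
    return best_label, scored


def linear_classify(centroids: List[Tuple[float, int]], x: float) -> int:
    """Baseline: score every class and keep the best one."""
    return max(centroids, key=lambda cl: -(cl[0] - x) ** 2)[1]


if __name__ == "__main__":
    classes = [(float(i), i) for i in range(1024)]
    root = build(classes)
    label, scored = classify(root, 300.2)
    print(label, scored, linear_classify(classes, 300.2))
```

In this sketch the pruning test is what saves work: once a good class has been found, whole subtrees whose bound cannot beat it are skipped without scoring their leaves, whereas the linear baseline scores every class. How closely this matches the paper's own trees and bounds cannot be determined from the abstract alone.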