Hierarchical training of multiple SVMs for personalized web filtering

Authors:
Maike Erdmann;Duc Dung Nguyen;Tomoya Takeyoshi;Gen Hattori;Kazunori Matsumoto;Chihiro Ono
Affiliations:
KDDI R&D Laboratories, Saitama, Japan;Vietnam Academy of Science and Technology, Hanoi, Vietnam;KDDI R&D Laboratories, Saitama, Japan;KDDI R&D Laboratories, Saitama, Japan;KDDI R&D Laboratories, Saitama, Japan;KDDI R&D Laboratories, Saitama, Japan
Venue:
PRICAI'12 Proceedings of the 12th Pacific Rim international conference on Trends in Artificial Intelligence
Year:
2012

Abstract

The abundance of information published on the Internet makes filtering hazardous Web pages a difficult yet important task. Supervised learning methods such as Support Vector Machines (SVMs) can be used to identify hazardous Web content. However, scalability is a major challenge, especially when multiple classifiers must be trained because policies differ on what kind of information is hazardous. We therefore propose a transfer learning approach called Hierarchical Training for Multiple SVMs (HTMSVM). HTMSVM identifies the data that similar training sets have in common and trains on these common data sets first to obtain initial solutions. These initial solutions then reduce the time needed to train on the individual training sets without affecting classification accuracy. In an experiment in which we trained five Web content filters with 80% common and 20% inconsistently labeled training examples, HTMSVM needed only 26% to 41% of the training time of LibSVM while achieving the same classification accuracy (more than 91%).
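
The two-stage idea described in the abstract (train on the common data first, then reuse that solution to start each individual filter) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the solver, so the sketch substitutes a plain dual coordinate descent for a linear hinge-loss SVM, folds the bias into a constant feature, and uses the hypothetical helper names `split_common`, `train_dual_cd`, `hierarchical_train`, and `predict`. Only the training order and the warm start follow the description above.

```python
from typing import List, Sequence, Tuple


def split_common(labels_per_filter: Sequence[Sequence[int]]) -> Tuple[List[int], List[int]]:
    """Split example indices into those every filter labels identically and the rest."""
    common: List[int] = []
    differing: List[int] = []
    for i in range(len(labels_per_filter[0])):
        distinct = {labels[i] for labels in labels_per_filter}
        (common if len(distinct) == 1 else differing).append(i)
    return common, differing


def train_dual_cd(X, y, alpha, C=1.0, tol=1e-6, max_sweeps=1000):
    """Dual coordinate descent for a linear hinge-loss SVM (assumed solver, no separate bias).

    `alpha` is the starting dual solution and is updated in place, so passing
    non-zero values warm-starts the solver. Returns (w, sweeps_used).
    """
    dim = len(X[0])
    # Rebuild the primal weights implied by the starting alphas.
    w = [0.0] * dim
    for a, xi, yi in zip(alpha, X, y):
        for d in range(dim):
            w[d] += a * yi * xi[d]
    for sweep in range(1, max_sweeps + 1):
        largest_step = 0.0
        for i, (xi, yi) in enumerate(zip(X, y)):
            q = sum(v * v for v in xi)
            if q == 0.0:
                continue
            # Gradient of the dual objective with respect to alpha[i].
            grad = yi * sum(wd * xd for wd, xd in zip(w, xi)) - 1.0
            # Newton step on one coordinate, clipped to the box [0, C].
            new_alpha = min(max(alpha[i] - grad / q, 0.0), C)
            step = new_alpha - alpha[i]
            if step != 0.0:
                alpha[i] = new_alpha
                for d in range(dim):
                    w[d] += step * yi * xi[d]
                largest_step = max(largest_step, abs(step))
        if largest_step < tol:
            return w, sweep
    return w, max_sweeps


def hierarchical_train(X, labels_per_filter, C=1.0):
    """Train one linear model per filter, warm-started from the common data."""
    common, differing = split_common(labels_per_filter)
    order = common + differing
    X_ordered = [X[i] for i in order]

    # Stage 1: solve once on the examples all filters agree on.
    shared_alpha = [0.0] * len(common)
    if common:
        common_labels = [labels_per_filter[0][i] for i in common]
        train_dual_cd([X[i] for i in common], common_labels, shared_alpha, C)

    # Stage 2: each filter starts from the shared alphas; the inconsistently
    # labeled examples start at zero, which keeps the start point feasible.
    models = []
    for labels in labels_per_filter:
        alpha = shared_alpha + [0.0] * len(differing)
        w, _ = train_dual_cd(X_ordered, [labels[i] for i in order], alpha, C)
        models.append(w)
    return models


def predict(w, x):
    """Return +1 (hazardous) or -1 (harmless) for feature vector x."""
    return 1 if sum(wd * xd for wd, xd in zip(w, x)) >= 0 else -1
```

Calling `hierarchical_train(X, labels_per_filter)` with one label list per filtering policy returns one weight vector per policy. Because this sketch has no separate bias term, the dual problem has only box constraints, so any alphas carried over from the common stage remain a valid starting point for every filter.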