Integrated segmentation and recognition of hand-printed numerals
NIPS-3 Proceedings of the 1990 conference on Advances in neural information processing systems 3
C4.5: programs for machine learning
C4.5: programs for machine learning
The nature of statistical learning theory
The nature of statistical learning theory
Solving the multiple instance problem with axis-parallel rectangles
Artificial Intelligence
A framework for multiple-instance learning
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Content-Based Image Retrieval Using Multiple-Instance Learning
ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Multiple-Instance Learning for Natural Scene Classification
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Solving the Multiple-Instance Problem: A Lazy Learning Approach
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Eliminating noisy information in Web pages for data mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Web-page classification through summarization
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Perceptrons: An Introduction to Computational Geometry
Perceptrons: An Introduction to Computational Geometry
Thumbs up?: sentiment classification using machine learning techniques
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Sensitive webpage classification for content advertising
Proceedings of the 1st international workshop on Data mining and audience intelligence for advertising
Blogger-centric contextual advertising
Proceedings of the 18th ACM conference on Information and knowledge management
Blogger-Centric Contextual Advertising
Expert Systems with Applications: An International Journal
A site oriented method for segmenting web pages
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Indices of novelty for emerging topic detection
Information Processing and Management: an International Journal
Multimedia Tools and Applications
A static and dynamic recommendations system for best practice networks
HCI'13 Proceedings of the 15th international conference on Human-Computer Interaction: users and contexts of use - Volume Part III
Hi-index | 0.00 |
Contextual advertising on web pages has become very popular recently and it poses its own set of unique text mining challenges. Often advertisers wish to either target (or avoid) some specific content on web pages which may appear only in a small part of the page. Learning for these targeting tasks is difficult since most training pages are multi-topic and need expensive human labeling at the sub-document level for accurate training. In this paper we investigate ways to learn for sub-document classification when only page level labels are available - these labels only indicate if the relevant content exists in the given page or not. We propose the application of multiple-instance learning to this task to improve the effectiveness of traditional methods. We apply sub-document classification to two different problems in contextual advertising. One is "sensitive content detection" where the advertiser wants to avoid content relating to war, violence, pornography, etc. even if they occur only in a small part of a page. The second problem involves opinion mining from review sites - the advertiser wants to detect and avoid negative opinion about their product when positive, negative and neutral sentiments co-exist on a page. In both these scenarios we present experimental results to show that our proposed system is able to get good block level labeling for free and improve the performance of traditional learning methods.