SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval
Syntactic clustering of the Web
Selected papers from the sixth international conference on World Wide Web
Cumulated gain-based evaluation of IR techniques
ACM Transactions on Information Systems (TOIS)
Machine Learning
Impedance coupling in content-targeted advertising
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Finding advertising keywords on web pages
Proceedings of the 15th international conference on World Wide Web
Finding near-duplicate web pages: a large-scale evaluation of algorithms
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
A semantic approach to contextual advertising
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Just-in-time contextual advertising
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
A noisy-channel approach to contextual advertising
Proceedings of the 1st international workshop on Data mining and audience intelligence for advertising
De-duping URLs via rewrite rules
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Mining search engine query logs via suggestion sampling
Proceedings of the VLDB Endowment
Estimating the impressionrank of web pages
Proceedings of the 18th international conference on World wide web
Nearest-neighbor caching for content-match applications
Proceedings of the 18th international conference on World wide web
Learning URL patterns for webpage de-duplication
Proceedings of the third ACM international conference on Web search and data mining
Mining taxonomies from web menus: rule-based concepts and algorithms
ICWE'13 Proceedings of the 13th international conference on Web Engineering
Hi-index | 0.00 |
In Contextual advertising, textual ads relevant to the content in a webpage are embedded in the page. Content keywords are extracted offline by crawling webpages and then stored in an index for fast serving. Given a page, ad selection involves index lookup, computing similarity between the keywords of the page and those of candidate ads and returning the top-k scoring ads. In this approach, ad relevance can suffer in two scenarios. First, since page-ad similarity is computed using keywords extracted only from that particular page, a few non pertinent keywords can skew ad selection. Second, requesting page may not be present in the index but we still need to serve relevant ads. We propose a novel mechanism to mitigate these problems in the same framework. The basic idea is to enrich keywords of a particular page with keywords from other but "similar" pages. The scheme involves learning a website specific hierarchy from (page, URL) pairs of the website. Next, keywords are populated on the nodes via successive top-down and bottom-up iterations over the hierarchy. We evaluate our approach on three data sets, one small human labeled set and two large-scale sets from Yahoo's contextual advertising system. Empirical evaluation show that ads fetched by enriching keywords has 2-3% higher nDCG compared to ads fetched based on a recent semantic approach even though the index size of our approach is 7 times less than the index size of semantic approach. Evaluation over pages which are not present in the index shows that ads fetched by our method has 6-7% higher nDCG compared to ads fetched based on a recent approach which uses first N bytes of the page content. Scalability is demonstrated via map-reduce adoption of our method and training on a large data set of 220 million pages from 95,104 websites.