With the proliferation of online distribution methods for videos, content owners need easier and more effective ways to monetize their content through advertising. Matching advertisements with related content has a significant impact on the effectiveness of the ads, but current methods for selecting relevant advertising keywords for videos are limited by their reliance on manually supplied metadata. In this paper, we study the feasibility of using text available from video content to obtain high-quality keywords suitable for matching advertisements. In particular, we tap into three sources of text for ad keyword generation: production scripts, closed captioning tracks, and speech-to-text transcripts. We address several challenges associated with using such data. To overcome the high error rates prevalent in automatic speech recognition, and the lack of explicit structure to hint at which keywords are most relevant, we use statistical and generative methods to identify dominant terms in the source text. To overcome the sparsity of the data and the resulting vocabulary mismatches between the source text and the advertisers' chosen keywords, these terms are then expanded into a set of related keywords using related-term mining methods. Our evaluations present a comprehensive analysis of the relative performance of these methods across a range of videos, including professionally produced films and popular videos from YouTube.
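As a rough illustration of what a statistical approach to identifying dominant terms might look like, the sketch below ranks the words of one transcript by TF-IDF against a small background collection. TF-IDF is a standard term-weighting scheme chosen here for illustration; the abstract does not state which statistical method the paper uses. The function names `tokenize` and `dominant_terms`, the tokenization rule, and the sample texts are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphabetic tokens are enough for a sketch.
    return re.findall(r"[a-z']+", text.lower())


def dominant_terms(target, background, top_k=5):
    """Rank terms of `target` by TF-IDF against `background` documents.

    Illustrative only: not the method evaluated in the paper.
    """
    target_tokens = tokenize(target)
    all_docs = [tokenize(doc) for doc in background] + [target_tokens]
    n_docs = len(all_docs)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in all_docs:
        df.update(set(tokens))

    # Term frequency within the target, normalized by its length.
    tf = Counter(target_tokens)
    scores = {
        term: (count / len(target_tokens)) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


if __name__ == "__main__":
    transcript = "the car chase ends when the car crashes"
    background = ["the weather is nice", "the dog runs in the park"]
    print(dominant_terms(transcript, background, top_k=3))
```

Terms that appear in every document, such as "the" in the sample, receive a score of zero, so frequent but uninformative words fall to the bottom of the ranking. A real system would also need to cope with recognition errors in speech-to-text output, which this sketch ignores.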