The problem of automatically extracting the most interesting and relevant keyword phrases in a document has been studied extensively, as it is crucial for a number of applications. These applications include contextual advertising, automatic text summarization, and user-centric entity detection systems. All of them can benefit from a successful solution, since it enables computational efficiency (by decreasing the input size), noise reduction, or improved overall user satisfaction.

In this paper, we study this problem and focus on improving the overall quality of user-centric entity detection systems. First, we review our concept extraction technique, which relies on search engine query logs. We then define a new feature space to represent the interestingness of concepts, and describe a new approach to estimating their relevance in a given context. We use click-through data obtained from a large-scale user-centric entity detection system, Contextual Shortcuts, to train a model that ranks the extracted concepts, and we evaluate the resulting model extensively, again using click-through data from the same system. Our results show that the learned model outperforms the baseline model, which employs similar features but whose weights are tuned carefully by hand based on empirical observations, and reduces the error rate from 30.22% to 18.66%.
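To put the reported error rates in perspective, the following is a minimal Python sketch. The abstract does not define how the error rate is computed, so the pairwise definition used here (the fraction of concept pairs whose model scores disagree with their click counts) is an assumption made only for illustration, and the names `pairwise_error_rate` and `relative_reduction` are hypothetical. Only the two percentages come from the abstract.

```python
def pairwise_error_rate(scores, clicks):
    """Fraction of concept pairs ordered wrongly by the model scores.

    ASSUMPTION: a pair (i, j) is counted when concept i received more
    clicks than concept j; it is an error if the model does not score
    i strictly higher than j. The paper may define error differently.
    """
    wrong = 0
    total = 0
    for i, clicks_i in enumerate(clicks):
        for j, clicks_j in enumerate(clicks):
            if clicks_i > clicks_j:
                total += 1
                if scores[i] <= scores[j]:
                    wrong += 1
    return wrong / total if total else 0.0


def relative_reduction(baseline_error, learned_error):
    """Relative drop in error when moving from the baseline to the learned model."""
    return (baseline_error - learned_error) / baseline_error


# Error rates reported in the abstract, in percent.
reduction = relative_reduction(30.22, 18.66)
print(f"relative error reduction: {reduction:.1%}")
```

Under this reading, the drop from 30.22% to 18.66% is an absolute reduction of 11.56 percentage points, or roughly 38% relative to the baseline.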