Extracting keyphrase set with high diversity and coverage using structural SVM

Authors:
Weijian Ni;Tong Liu;Qingtian Zeng
Affiliations:
Shandong University of Science and Technology, Qingdao, Shandong, P.R. China;Shandong University of Science and Technology, Qingdao, Shandong, P.R. China;Shandong University of Science and Technology, Qingdao, Shandong, P.R. China
Venue:
APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
Year:
2012

Abstract

Keyphrase extraction plays an important role in automatic document understanding. To give concise and comprehensive information about the content of a document, the keyphrases extracted from it should meet two requirements. First, the keyphrases should be diverse from one another, so as to avoid carrying duplicated information. Second, the keyphrases should together cover the various aspects of the topics in the document, so as to avoid unnecessary information loss. In this paper, we address the issue of automatic keyphrase extraction, with emphasis on the diversity and coverage of keyphrases, which are generally ignored in most conventional keyphrase extraction approaches. Specifically, the issue is formulated as a subset learning problem in the framework of structural learning, and structural SVM is employed to perform the task. Experiments on a scientific literature dataset show that our approach outperforms several state-of-the-art keyphrase extraction approaches, which verifies the benefits of explicit diversity and coverage enhancement.
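
To make the two requirements concrete, the following is a minimal sketch of choosing a keyphrase subset that is both diverse and high in coverage. It is not the structural SVM approach of the paper: it swaps in a plain greedy maximum-coverage heuristic with no learning step, and it assumes each candidate phrase already comes with a set of "aspects" it covers. The function name `greedy_keyphrase_subset`, the aspect sets, and the example phrases are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_keyphrase_subset(candidates: Dict[str, Set[str]], k: int) -> List[str]:
    """Greedily pick up to k phrases by marginal coverage gain.

    `candidates` maps each phrase to the (assumed, hand-made) set of
    aspects it covers. A phrase that only repeats already-covered
    aspects has zero gain, which is what enforces diversity here.
    """
    covered: Set[str] = set()
    chosen: List[str] = []
    remaining = dict(candidates)
    while remaining and len(chosen) < k:
        # Iterate in sorted order so ties resolve to the alphabetically first phrase.
        best = max(sorted(remaining), key=lambda p: len(remaining[p] - covered))
        if not remaining[best] - covered:
            break  # every remaining phrase is redundant
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Hypothetical candidates; two of them carry duplicated information.
example = {
    "structural SVM": {"learning", "svm"},
    "structural support vector machine": {"learning", "svm"},
    "keyphrase extraction": {"extraction", "keyphrase"},
    "diversity": {"diversity"},
}
selected = greedy_keyphrase_subset(example, k=3)
print(selected)
```

In this toy input the near-duplicate phrase is never selected, because it adds no new aspect once "structural SVM" has been chosen. A learned approach such as the one described in the abstract would presumably replace the hand-made aspect sets and the fixed gain function with a scoring function trained from labeled documents.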