Contextual keyword extraction by building sentences with crowdsourcing

Authors:
Soon Gill Hong;Sungho Shin;Mun Yong Yi
Affiliations:
Department of Knowledge Service Engineering, KAIST, Daejeon, Republic of Korea;Korea Institute of Science and Technology Information, Daejeon, Republic of Korea;Department of Knowledge Service Engineering, KAIST, Daejeon, Republic of Korea
Venue:
Multimedia Tools and Applications
Year:
2014

Abstract

Automatic keyword extraction from documents has long been used and has proven useful in various areas. Crowdsourced tagging of multimedia resources has emerged and looks promising to a certain extent. Automatic approaches for unstructured data, automatic keyword extraction, and crowdsourced tagging are all efficient, but they all suffer from a lack of contextual understanding. In this paper, we propose a new model for extracting key contextual terms from unstructured data, especially documents, with crowdsourcing. The model consists of four sequential processes: (1) term selection by frequency, (2) sentence building, (3) revised term selection reflecting the newly built sentences, and (4) sentence voting. Online workers read only a fraction of a document and participated in the sentence building and sentence voting processes, generating key sentences as a result. We compared the generated sentences to the keywords entered by the author and to the sentences generated by offline workers who read the whole document. The results support the idea that the sentence building process can help select terms with more contextual meaning, closing the gap between keywords from automated approaches and the contextual understanding required by humans.
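
The four processes named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how terms are scored, how the revised selection works, or how votes are counted, so everything below is an assumption made for illustration: the stopword list, the function names (`select_terms`, `revise_terms`, `vote`), ranking revised terms by how many worker sentences use them, and plain majority voting. Step (2), sentence building, is done by human workers and is represented here only by a list of their sentences.

```python
import re
from collections import Counter

# Hypothetical stopword list; the paper's actual filtering is not described.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "from", "with"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def select_terms(text, k):
    """Step 1 (assumed form): pick the k most frequent non-stopword terms."""
    counts = Counter(tokenize(text))
    return [term for term, _ in counts.most_common(k)]


def revise_terms(candidates, sentences):
    """Step 3 (assumed form): keep only candidates that workers used,
    ranked by the number of worker-built sentences containing them."""
    usage = Counter()
    for sentence in sentences:
        used = set(tokenize(sentence))
        for term in candidates:
            if term in used:
                usage[term] += 1
    return [term for term, _ in usage.most_common()]


def vote(sentences, ballots):
    """Step 4 (assumed form): each ballot is the index of a sentence;
    return the sentence that received the most ballots."""
    best_index, _ = Counter(ballots).most_common(1)[0]
    return sentences[best_index]
```

In this sketch, a call such as `revise_terms(select_terms(fragment, 10), worker_sentences)` would stand in for steps (1) through (3) on one document fragment, and `vote(worker_sentences, ballots)` for step (4). The names `fragment`, `worker_sentences`, and `ballots` are placeholders, not data from the paper.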