An unsupervised cascade learning scheme for 'cluster-theme keywords' structure extraction from scientific papers

Authors:
Feiliang Ren
Affiliations:
Northeastern University, People's Republic of China
Venue:
Journal of Information Science
Year:
2014

Abstract

The large number of published scientific papers gives users a convenient way to follow the latest progress on a specific research topic. However, the sheer volume of papers and the diverse research themes hidden among them usually make it hard for users to locate the specific papers they are interested in. To tackle this problem, we propose a novel unsupervised cascade learning scheme that extracts a 'cluster-theme keywords' structure from the papers related to a research topic, so as to help users locate their research interests quickly. Our approach first selects representative papers for the research topic. It then groups the selected papers into several small clusters with the help of a domain ontology. Finally, it extracts theme keywords for each cluster. The approach not only greatly reduces the time-consuming and labour-intensive process of searching for papers, but also displays the diverse themes of a research topic comprehensively. We conducted extensive experiments to evaluate the proposed approach, and the experimental results demonstrate its effectiveness.
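
The three stages named in the abstract (select representative papers, cluster them with a domain ontology, extract theme keywords per cluster) can be illustrated with a small Python sketch. The abstract does not say how each stage is implemented, so everything below is an assumption made for illustration: representativeness is approximated by summed cosine similarity to the other papers, the ontology is a hypothetical mapping from a concept to a set of terms, and keywords are ranked with a TF-IDF-style weight. The function names, the stop list, and the toy data are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter, defaultdict

# Illustrative stop list (assumption); a real system would use a fuller one.
STOP_WORDS = {"a", "of", "for", "with", "the", "using"}

def tokens(text):
    """Lowercase alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def select_representative(papers, k):
    """Stage 1 (assumed): keep the k papers most similar to the rest of the collection."""
    bags = [Counter(tokens(p)) for p in papers]
    scores = [sum(cosine(bags[i], bags[j]) for j in range(len(bags)) if j != i)
              for i in range(len(bags))]
    return sorted(range(len(papers)), key=lambda i: -scores[i])[:k]

def cluster_by_ontology(papers, indices, ontology):
    """Stage 2 (assumed): assign each paper to the concept sharing the most terms with it."""
    clusters = defaultdict(list)
    for i in indices:
        words = set(tokens(papers[i]))
        best = max(ontology, key=lambda concept: len(words & ontology[concept]))
        clusters[best].append(i)
    return dict(clusters)

def theme_keywords(papers, clusters, top_n):
    """Stage 3 (assumed): rank terms per cluster by a TF-IDF-style weight across clusters."""
    counts = {c: Counter(w for i in idx for w in tokens(papers[i]))
              for c, idx in clusters.items()}
    cluster_freq = Counter(w for bag in counts.values() for w in bag)
    n = len(counts)
    result = {}
    for c, bag in counts.items():
        weight = {w: tf * math.log((1 + n) / cluster_freq[w]) for w, tf in bag.items()}
        # Sort by descending weight, breaking ties alphabetically for determinism.
        result[c] = sorted(weight, key=lambda w: (-weight[w], w))[:top_n]
    return result

# Toy data (hypothetical), only to show how the stages chain together.
papers = [
    "extractive summarization of news documents with sentence ranking",
    "citation based summarization of scientific documents",
    "clustering short texts with topic models",
    "clustering scientific documents using topic features",
    "a recipe for tomato soup",
]
ontology = {
    "summarization": {"summarization", "sentence", "citation"},
    "clustering": {"clustering", "topic"},
}

selected = select_representative(papers, k=4)
clusters = cluster_by_ontology(papers, selected, ontology)
keywords = theme_keywords(papers, clusters, top_n=2)
```

Each stage consumes only the output of the previous one, which is the sense in which the sketch is a cascade. No labelled data is needed at any step, in line with the unsupervised setting the abstract describes.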