Research on Chinese sentence compression for the title generation

Authors:
Yonglei Zhang;Cheng Peng;Hongling Wang
Affiliations:
Natural Language Processing Lab, Soochow University, Suzhou, Jiangsu, China;Natural Language Processing Lab, Soochow University, Suzhou, Jiangsu, China;Natural Language Processing Lab, Soochow University, Suzhou, Jiangsu, China
Venue:
CLSW'12 Proceedings of the 13th Chinese conference on Chinese Lexical Semantics
Year:
2012

Abstract

Automatic title generation uses natural language processing to produce a title that conveys the central information of the original text. One approach extracts the sentence that best represents the text's central information and then compresses it into a shorter sentence that serves as the title; the core technology in this approach is sentence compression. Research on Chinese sentence compression, however, has not yet been carried out, mainly because of three difficulties: the lack of a corpus, the poor performance of Chinese word segmentation and parsing, and the absence of a unified automatic evaluation metric. This paper implements a Chinese sentence compression method that shortens a sentence by deleting words or constituents, chiefly by learning a subtree of the sentence's source parse tree, and evaluates compression performance with both manual and automatic evaluation. The experimental results show that the method and the evaluation metrics used in this paper are valid and effective.
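
To make the idea of deletion-based compression over a parse tree concrete, the following is a minimal sketch and not the authors' method. The paper learns which subtree to keep; here that learned decision is replaced by a hand-written, hypothetical set of labels to delete (`DROP_LABELS`). The toy sentence, its simplified treebank-style labels, and the function names `prune` and `leaves` are all assumptions made for illustration.

```python
from typing import List, Optional, Set, Tuple, Union

# A node is (label, children) for a constituent, or (pos_tag, word) for a leaf.
Tree = Tuple[str, Union[str, List["Tree"]]]


def prune(tree: Tree, drop_labels: Set[str]) -> Optional[Tree]:
    """Return the subtree left after deleting every node whose label is in drop_labels."""
    label, body = tree
    if label in drop_labels:
        return None
    if isinstance(body, str):  # leaf: keep the word as it is
        return tree
    kept = [c for c in (prune(child, drop_labels) for child in body) if c is not None]
    # A constituent that has lost all of its children is deleted as well.
    return (label, kept) if kept else None


def leaves(tree: Optional[Tree]) -> List[str]:
    """Read the words off a (possibly pruned) tree, left to right."""
    if tree is None:
        return []
    _, body = tree
    if isinstance(body, str):
        return [body]
    words: List[str] = []
    for child in body:
        words.extend(leaves(child))
    return words


# Hypothetical toy parse of: 苏州大学 的 研究者 昨天 提出 了 新 方法
tree: Tree = ("IP", [
    ("NP", [
        ("DNP", [("NP", [("NR", "苏州大学")]), ("DEG", "的")]),
        ("NP", [("NN", "研究者")]),
    ]),
    ("VP", [
        ("NP", [("NT", "昨天")]),
        ("VP", [
            ("VV", "提出"),
            ("AS", "了"),
            ("NP", [("ADJP", [("JJ", "新")]), ("NP", [("NN", "方法")])]),
        ]),
    ]),
])

# Assumption: a fixed list stands in for the learned subtree selection.
DROP_LABELS = {"DNP", "ADJP", "NT"}

compressed = prune(tree, DROP_LABELS)
ratio = len(leaves(compressed)) / len(leaves(tree))
print("".join(leaves(compressed)))  # 研究者提出了方法
print(ratio)                        # 0.5
```

Because words are only deleted and never reordered or rewritten, the output is always a subsequence of the source sentence, which is what makes the compressed result readable directly off the retained subtree.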