Summarizing definition from Wikipedia

Authors:
Shiren Ye;Tat-Seng Chua;Jie Lu
Affiliations:
National University of Singapore;National University of Singapore;National University of Singapore
Venue:
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Year:
2009

Citing 14
Cited 8

The vocabulary problem in human-system communication

Communications of the ACM
The use of MMR, diversity-based reranking for reordering documents and producing summaries

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Summarizing scientific articles: experiments with relevance and rhetorical status

Computational Linguistics - Summarization
Efficiently computed lexical chains as an intermediate representation for automatic text summarization

Computational Linguistics - Summarization
The automated acquisition of topic signatures for text summarization

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
An effective approach to document retrieval via utilizing WordNet and recognizing phrases

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Generic soft pattern models for definitional question answering

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Methods for automatically evaluating answers to complex questions

Information Retrieval
Resource analysis for question answering

ACLdemo '04 Proceedings of the ACL 2004 on Interactive poster and demonstration sessions
The Pyramid Method: Incorporating human content selection variation in summarization evaluation

ACM Transactions on Speech and Language Processing (TSLP)
Interesting nuggets and their impact on definitional question answering

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Document concept lattice for text understanding and summarization

Information Processing and Management: an International Journal
LexRank: graph-based lexical centrality as salience in text summarization

Journal of Artificial Intelligence Research
Computing semantic relatedness using Wikipedia-based explicit semantic analysis

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence

From text question-answering to multimedia QA on web-scale media resources

LS-MMRM '09 Proceedings of the First ACM workshop on Large-scale multimedia retrieval and mining
Function-based question classification for general QA

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Learning web query patterns for imitating Wikipedia articles

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Mining Wikipedia and Yahoo! Answers for question expansion in opinion QA

PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
A web knowledge based approach for complex question answering

AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Contextual question answering for the health domain

Journal of the American Society for Information Science and Technology
Multimedia encyclopedia construction by mining web knowledge

Signal Processing
WHAD: Wikipedia historical attributes data

Language Resources and Evaluation

Quantified Score

Hi-index	0.01

Abstract

Wikipedia provides a wealth of knowledge: the first sentence, the infobox (and its relevant sentences), and even the entire document of a wiki article can each be considered a different version of a summary (definition) of the target topic. We explore how to generate a series of summaries of various lengths from these sources. To obtain more reliable associations between sentences, we introduce wiki concepts derived from the internal links in Wikipedia. In addition, we develop an extended document concept lattice model that combines wiki concepts with non-textual features such as the outline and infobox. The model can concatenate representative sentences from non-overlapping salient local topics to generate a summary. We test our model on our annotated wiki articles, whose topics come from the TREC-QA 2004-2006 evaluations. The results show that the model is effective in summarization and definition QA.
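
The idea of using internal links as wiki concepts and picking sentences from non-overlapping topics can be illustrated with a small sketch. This is not the extended document concept lattice model described in the abstract; it swaps in a simple greedy concept-coverage heuristic, and it assumes sentences arrive as raw wikitext with `[[Target]]` or `[[Target|anchor]]` links. The names `wiki_concepts` and `greedy_summary` are hypothetical.

```python
import re

# Matches [[Target]], [[Target|anchor text]] and [[Target#Section|anchor]].
# Only the link target (group 1) is treated as the wiki concept.
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def wiki_concepts(sentence):
    """Return the set of lowercased link targets found in one wikitext sentence."""
    return {m.group(1).strip().lower() for m in LINK_RE.finditer(sentence)}


def greedy_summary(sentences, max_sentences):
    """Greedily pick sentences that add the most not-yet-covered concepts.

    Assumption: concept coverage stands in for salience here; the paper's
    model also uses the outline and infobox, which this sketch ignores.
    """
    concepts = [wiki_concepts(s) for s in sentences]
    covered = set()
    chosen = []
    while len(chosen) < max_sentences:
        best, best_gain = None, 0
        for i, sentence_concepts in enumerate(concepts):
            if i in chosen:
                continue
            gain = len(sentence_concepts - covered)
            if gain > best_gain:  # strict '>' keeps the earliest sentence on ties
                best, best_gain = i, gain
        if best is None:  # no remaining sentence adds a new concept
            break
        chosen.append(best)
        covered |= concepts[best]
    # Emit the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Varying `max_sentences` yields summaries of different lengths from the same article, and a sentence that repeats already covered concepts is skipped in favor of one that introduces a new topic.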