Exploiting neighborhood knowledge for single document summarization and keyphrase extraction

Authors:
Xiaojun Wan; Jianguo Xiao
Affiliations:
Peking University, Beijing, China; Peking University, Beijing, China
Venue:
ACM Transactions on Information Systems (TOIS)
Year:
2010

Abstract

Document summarization and keyphrase extraction are two related tasks in the information retrieval (IR) and natural language processing (NLP) fields, and both aim to extract a condensed representation of a single text document. Existing methods for single document summarization and keyphrase extraction usually use only the information contained in the specified document. This article proposes using a small number of nearest neighbor documents to improve document summarization and keyphrase extraction for the specified document, under the assumption that the neighbor documents can provide additional knowledge and more clues. The specified document is expanded to a small document set by adding a few neighbor documents close to it, and a graph-based ranking algorithm is then applied to the expanded document set to make use of both the local information in the specified document and the global information in the neighbor documents. Experimental results on the Document Understanding Conference (DUC) benchmark datasets demonstrate the effectiveness and robustness of our proposed approaches. The cross-document sentence relationships in the expanded document set are validated to be beneficial to single document summarization, and the word co-occurrence relationships in the neighbor documents are validated to be very helpful to single document keyphrase extraction.
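
The expansion-then-ranking idea described in the abstract can be illustrated with a minimal sketch for the summarization case. This is not the article's implementation: the abstract does not specify the similarity measure or the exact ranking formula, so the sketch assumes plain bag-of-words cosine similarity and a PageRank-style iteration, and every function name here (`bow`, `cosine`, `nearest_neighbors`, `rank_sentences`) is hypothetical. Documents are represented as lists of sentence strings.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (a Counter) for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_neighbors(target, corpus, k):
    """Return the k corpus documents most similar to the target document."""
    target_vec = bow(" ".join(target))
    ranked = sorted(corpus, key=lambda d: cosine(target_vec, bow(" ".join(d))), reverse=True)
    return ranked[:k]


def rank_sentences(target, neighbors, damping=0.85, iterations=50):
    """Score the target's sentences with a PageRank-style walk over the expanded set."""
    # Expanded document set: the target's sentences first, then the neighbors' sentences.
    sentences = list(target) + [s for doc in neighbors for s in doc]
    vecs = [bow(s) for s in sentences]
    n = len(sentences)
    # Affinity matrix: cosine similarity between distinct sentences (assumed measure).
    w = [[cosine(vecs[i], vecs[j]) if i != j else 0.0 for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in w]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * w[j][i] / row_sums[j] for j in range(n) if row_sums[j] > 0)
            for i in range(n)
        ]
    # Only sentences of the specified document are candidates for its summary.
    return sorted(zip(target, scores[: len(target)]), key=lambda p: p[1], reverse=True)


# Toy data, invented purely for illustration.
target = [
    "graph ranking improves summarization",
    "the weather was pleasant",
    "neighbor documents add knowledge",
]
corpus = [
    ["graph ranking is used for summarization", "neighbor documents help ranking"],
    ["stock prices fell sharply today"],
]
neighbors = nearest_neighbors(target, corpus, k=1)
ranking = rank_sentences(target, neighbors)
for sentence, score in ranking:
    print(f"{score:.3f}  {sentence}")
```

In this toy run the off-topic sentence shares no words with any other sentence in the expanded set, so it receives only the baseline score and ranks last, while sentences echoed by the neighbor document are pushed up. A keyphrase variant would follow the same pattern with words as graph nodes and co-occurrence counts as edge weights, again as an assumption rather than the article's stated formulation.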