The increasing trend of cross-border globalization and acculturation requires text summarization techniques to work equally well for multiple languages. However, only some automated summarization methods can be defined as "language-independent," i.e., not based on any language-specific knowledge. Such methods can be used for multilingual summarization, defined in Mani (Automatic summarization. Natural language processing. John Benjamins Publishing Company, Amsterdam, 2001) as "processing several languages, with a summary in the same language as input", but their performance is usually unsatisfactory because they exclude language-specific knowledge. Moreover, supervised machine learning approaches need training corpora in multiple languages; such corpora are usually unavailable for rare languages, and creating them is expensive and labor-intensive. In this article, we describe cross-lingual methods for training an extractive single-document text summarizer called MUSE (MUltilingual Sentence Extractor), a supervised approach based on the linear optimization of a rich set of sentence ranking measures using a Genetic Algorithm. We evaluated MUSE's performance on documents in three different languages (English, Hebrew, and Arabic) using several training scenarios. Summarization quality was measured using the ROUGE-1 and ROUGE-2 Recall metrics. An extensive comparative analysis showed that MUSE performed better than the best-known multilingual approach (TextRank) in all three languages. Moreover, our experimental results suggest that using the same sentence ranking model across languages yields reasonable summarization quality while saving the end user considerable annotation effort. On the other hand, using parallel corpora generated by machine translation tools may improve the performance of a MUSE model trained on a foreign language. A comparative evaluation of an alternative optimization technique, Multiple Linear Regression, justifies the use of a Genetic Algorithm.
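The general idea of scoring sentences by a weighted linear combination of ranking measures, with the weights tuned by a genetic algorithm against ROUGE-1 Recall, can be illustrated with a minimal sketch. This is not the MUSE implementation: the two features (position and relative length), the GA operators (truncation selection, one-point crossover, Gaussian mutation), the toy document, and all function names are assumptions chosen only for illustration.

```python
import random
from collections import Counter

def rouge1_recall(candidate, reference):
    """Unigram recall: clipped overlap divided by reference length."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum(min(n, cand[tok]) for tok, n in ref.items())
    return overlap / max(1, sum(ref.values()))

def summarize(sentences, features, weights, k):
    """Pick the k sentences with the highest weighted feature sum."""
    scores = [sum(w * f for w, f in zip(weights, fv)) for fv in features]
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]  # keep document order

def evolve(fitness, n_weights, pop_size=20, generations=30, seed=0):
    """Toy GA: keep the better half, refill with mutated crossovers."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_weights)   # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n_weights)] += rng.gauss(0, 0.1)  # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Hypothetical toy document (already tokenized) and reference summary.
sentences = [
    ["muse", "ranks", "sentences"],
    ["the", "weather", "was", "nice"],
    ["a", "genetic", "algorithm", "tunes", "weights"],
]
reference = ["muse", "ranks", "sentences", "genetic", "algorithm", "weights"]

# Assumed features per sentence: [position score, relative length].
max_len = max(len(s) for s in sentences)
features = [[1 / (i + 1), len(s) / max_len] for i, s in enumerate(sentences)]

def fitness(weights):
    summary = summarize(sentences, features, weights, k=2)
    tokens = [tok for sent in summary for tok in sent]
    return rouge1_recall(tokens, reference)

best = evolve(fitness, n_weights=2)
print("weights:", best, "ROUGE-1 recall:", fitness(best))
```

In a realistic setting the fitness would be averaged over a training corpus of documents with human-written summaries, and the feature vector would hold many more ranking measures than the two used here.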