Extracting comparative sentences from Korean text documents using comparative lexical patterns and machine learning techniques

Authors:
Seon Yang;Youngjoong Ko
Affiliations:
Dong-A University, Saha-gu, Busan, Korea;Dong-A University, Saha-gu, Busan, Korea
Venue:
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Year:
2009

Abstract

This paper proposes a method for automatically identifying Korean comparative sentences in text documents. We first examine a large number of comparative sentences, with reference to previous studies, and define a set of comparative keywords from them. A sentence that contains one or more elements of the keyword set is called a comparative-sentence candidate. Finally, we use machine learning techniques to eliminate non-comparative sentences from the candidates. In experiments on various web documents, this approach achieved an F1-score of 88.54%.
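
The two-stage procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the three keywords, the function names, and the toy rule standing in for the trained classifier are all hypothetical, since the abstract gives neither the keyword set nor the learning algorithm. The F1 helper uses the standard definition (harmonic mean of precision and recall).

```python
from typing import Callable, Iterable, List, Set

# Hypothetical keyword set for illustration only; the paper's actual
# comparative keyword list is not given in the abstract.
COMPARATIVE_KEYWORDS: Set[str] = {"보다", "가장", "더"}


def is_candidate(sentence: str, keywords: Set[str] = COMPARATIVE_KEYWORDS) -> bool:
    """Stage 1: a sentence is a candidate if it contains at least one keyword."""
    return any(kw in sentence for kw in keywords)


def extract_comparatives(
    sentences: Iterable[str],
    classifier: Callable[[str], bool],
    keywords: Set[str] = COMPARATIVE_KEYWORDS,
) -> List[str]:
    """Stage 2: keep only the candidates that the classifier accepts."""
    candidates = [s for s in sentences if is_candidate(s, keywords)]
    return [s for s in candidates if classifier(s)]


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def toy_classifier(sentence: str) -> bool:
    """Hypothetical stand-in for a trained model: rejects the verb form '보다가'."""
    return "보다가" not in sentence


sentences = [
    "이 전화기는 저 전화기보다 가볍다",  # comparative: "lighter than that phone"
    "영화를 보다가 잠들었다",            # keyword match, but not comparative
    "오늘은 날씨가 맑다",                # no keyword, never becomes a candidate
]
result = extract_comparatives(sentences, toy_classifier)
```

The second sentence shows why keyword matching alone is not enough: it contains the string "보다" as part of a verb ("while watching"), so it becomes a candidate and has to be filtered out in the second stage.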