Extracting difference information from multilingual Wikipedia

Authors:
Yuya Fujiwara;Yu Suzuki;Yukio Konishi;Akiyo Nadamoto
Affiliations:
Konan University, Kobe, Hyogo, Japan;Nagoya University, Nagoya, Aichi, Japan;Konan University, Kobe, Hyogo, Japan;Konan University, Kobe, Hyogo, Japan
Venue:
APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
Year:
2012

Abstract

Wikipedia articles on a given topic are written in many languages. When two articles cover a single topic in different languages, their contents are expected to be identical under Wikipedia policy. In practice, however, the contents differ, especially for topics related to culture. In this paper, we propose a system that extracts the differences between the information in the Japanese Wikipedia and that in the Wikipedias of other countries. An important technical problem is how to select the Wikipedia articles to be compared. Articles on the same topic are written in different languages, each with its own linguistic structure. For example, cricket is an important part of English culture, but the Japanese Wikipedia article on cricket is sparse: it consists of only a single page. In contrast, the English version is substantial and spans multiple pages. We must therefore consider which articles can reasonably be compared. To this end, we extract the comparison target articles based on a link graph and the article structure. We implement the proposed method and confirm the accuracy of the difference extraction methods.
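
The abstract does not say how the link graph is used to pick comparison targets. As a rough illustration only, the sketch below assumes one plausible heuristic: a topic's comparison set is its main article plus every page that the main article links to and that links back to it. The function name `comparison_targets`, the mutual-link rule, and the toy graphs are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Set

# Hypothetical toy link graph: page title -> set of titles it links to.
LinkGraph = Dict[str, Set[str]]


def comparison_targets(graph: LinkGraph, main: str) -> Set[str]:
    """Return the main article plus pages it links to that also link back.

    Assumption (not from the paper): a mutual link suggests that a page
    is a sub-article of the main topic rather than a passing mention.
    """
    outgoing = graph.get(main, set())
    mutual = {page for page in outgoing if main in graph.get(page, set())}
    return {main} | mutual


# Made-up English-side graph: the topic is spread over several pages.
english: LinkGraph = {
    "Cricket": {"Laws of Cricket", "History of cricket", "England"},
    "Laws of Cricket": {"Cricket"},
    "History of cricket": {"Cricket", "England"},
    "England": {"London"},
}

# Made-up Japanese-side graph: the topic is a single page.
japanese: LinkGraph = {
    "クリケット": {"イングランド"},
    "イングランド": set(),
}

print(sorted(comparison_targets(english, "Cricket")))
# ['Cricket', 'History of cricket', 'Laws of Cricket']
print(sorted(comparison_targets(japanese, "クリケット")))
# ['クリケット']
```

Under this assumption, the single Japanese page would be compared against the three English pages together rather than against the English main article alone, which matches the cricket example in the abstract.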