Detecting Macro-patterns in the European Mediasphere

Authors:
Ilias Flaounas; Marco Turchi; Nello Cristianini
Venue:
WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 03
Year:
2009

Abstract

The analysis of the contents of news outlets has been the focus of social scientists for a long time. However, content analysis is often performed on hand-coded documents, which limits the size of the data accessible to the investigation and consequently limits the possibility of detecting macro-trends. The use of text categorisation, clustering and statistical machine translation (SMT) enables us to operate automatically on vast amounts of news items, and consequently to analyse patterns in the content of outlets in different languages, over long time periods. We report on experiments involving hundreds of European media in 22 different languages, demonstrating how it is possible to detect similarities and differences between outlets, and between countries, based on the contents of their articles.
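As a rough illustration of how outlets might be compared by content, the sketch below builds a topic profile for each outlet and measures how closely two profiles agree. The abstract does not say which similarity measure the authors used, so cosine similarity is an assumption here, as are the outlet names, the topic labels, and the helper functions `topic_profile` and `cosine_similarity`. The sketch also assumes that articles have already been translated into a common language and assigned a topic by a text categoriser.

```python
from collections import Counter
from math import sqrt


def topic_profile(topics):
    """Return the relative frequency of each topic label for one outlet."""
    counts = Counter(topics)
    total = sum(counts.values())
    return {topic: count / total for topic, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine similarity between two sparse topic profiles (dicts)."""
    dot = sum(weight * q.get(topic, 0.0) for topic, weight in p.items())
    norm_p = sqrt(sum(w * w for w in p.values()))
    norm_q = sqrt(sum(w * w for w in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an outlet with no articles is similar to nothing
    return dot / (norm_p * norm_q)


# Hypothetical data: one topic label per article, per outlet.
outlet_topics = {
    "outlet_a": ["politics", "politics", "sport", "business"],
    "outlet_b": ["politics", "sport", "politics", "business"],
    "outlet_c": ["sport", "sport", "sport", "crime"],
}

profiles = {name: topic_profile(t) for name, t in outlet_topics.items()}

sim_ab = cosine_similarity(profiles["outlet_a"], profiles["outlet_b"])
sim_ac = cosine_similarity(profiles["outlet_a"], profiles["outlet_c"])

print(f"outlet_a vs outlet_b: {sim_ab:.3f}")
print(f"outlet_a vs outlet_c: {sim_ac:.3f}")
```

In this toy data, outlet_a and outlet_b cover the same mix of topics and so score higher than outlet_a and outlet_c. Averaging the profiles of all outlets in a country would give a comparable country-level profile, again as an assumption rather than the paper's stated procedure.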