Automatic free-text-tagging of online news archives

Authors:
Richárd Farkas;Gábor Berend;István Hegedűs;András Kárpáti;Balázs Krich
Affiliations:
Hungarian Academy of Sciences, Hungary, email: rfarkas@inf.u-szeged.hu;University of Szeged, Hungary, email: {berendg,hegedusi}@inf.u-szeged.hu;University of Szeged, Hungary, email: {berendg,hegedusi}@inf.u-szeged.hu;Origo Ltd., Hungary, email: {karpati.andras,krich.balazs}@origo.hu;Origo Ltd., Hungary, email: {karpati.andras,krich.balazs}@origo.hu
Venue:
Proceedings of the 2010 conference on ECAI 2010: 19th European Conference on Artificial Intelligence
Year:
2010

Abstract

In this paper, we introduce the problem of free-text-tagging of online news archives. From an application point of view, the task offers many benefits for online news portals; at the same time, it has characteristics that set it apart from the settings addressed by existing free-text-tagging approaches. We describe our system, which was developed for the archive (370,000 articles) of www.origo.hu, the most visited Hungarian news portal, along with the research questions encountered and solved during the project. Because evaluating tagging is not straightforward, at the end of the project the news company manually reviewed the output of the automatic system, which yielded an F-measure of 71.9.