Classifying the Hungarian web

  • Authors:
  • András Kornai; Marc Krellenstein; Michael Mulligan; David Twomey; Fruzsina Veress; Alec Wysoker

  • Affiliations:
  • Metacarta Inc., Cambridge, MA; Reed-Elsevier Inc., Burlington, MA; divine Inc., Burlington, MA; CEHQ, Inc., Needham, MA; Teragram Corp., Boston, MA; deNovis Inc., One Cranberry Hill, Lexington, MA

  • Venue:
  • EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
  • Year:
  • 2003

Abstract

In this paper we present some lessons learned from building vizsla, the keyword search and topic classification system used on the largest Hungarian portal, [origo.hu]. Based on a simple statistical language model and the large-scale supporting evidence from vizsla, we argue that in topic classification only positive evidence matters.