Retrieving informative content from web pages with conditional learning of support vector machines and semantic analysis

  • Authors:
  • Piotr Ładyżyński;Przemysław Grzegorzewski

  • Affiliations:
  • Faculty of Mathematics and Computer Science, Warsaw University of Technology, Warsaw, Poland;Faculty of Mathematics and Computer Science, Warsaw University of Technology, Warsaw, Poland

  • Venue:
  • ICAISC'12 Proceedings of the 11th international conference on Artificial Intelligence and Soft Computing - Volume Part II
  • Year:
  • 2012

Abstract

We propose a new system that extracts informative content from news pages and divides it into prescribed sections. The system is based on a machine learning classifier that combines different kinds of information (styles, linguistic information, structural information, semantic analysis of content) with conditional learning. Empirical results suggest that the proposed system is a promising tool for extracting information from the web.