Logical structure based semantic relationship extraction from semi-structured documents

  • Authors:
  • Zhang Kuo; Wu Gang; Li JuanZi

  • Affiliations:
  • Tsinghua University, Beijing, China; Tsinghua University, Beijing, China; Tsinghua University, Beijing, China

  • Venue:
  • Proceedings of the 15th international conference on World Wide Web
  • Year:
  • 2006

Abstract

This paper addresses the problem of extracting semantic relationships from semi-structured documents. Considerable research has been devoted to semantic information extraction. However, most previous work focuses on detecting 'isolated' semantic information by using linguistic analysis or the link information in web pages, and little research has been done on extracting semantic relationships from semi-structured documents. We propose a method for semantic relationship extraction that uses the logical information in a semi-structured document. Such a document usually carries various types of structural information; for example, it may be laid out hierarchically. To the best of our knowledge, extracting semantic relationships by using logical information has not been investigated previously. The paper proposes a probabilistic approach and defines the features used in the probabilistic model.