Discovering semantic sibling groups from web documents with XTREEM-SG

  • Authors:
  • Marko Brunzel; Myra Spiliopoulou

  • Affiliations:
  • Otto-von-Guericke-University, Magdeburg; Otto-von-Guericke-University, Magdeburg

  • Venue:
  • EKAW'06 Proceedings of the 15th international conference on Managing Knowledge in a World of Networks
  • Year:
  • 2006

Abstract

The acquisition of explicit semantics is still a research challenge. Approaches for the extraction of semantics focus mostly on learning hierarchical hypernym-hyponym relations. Co-hyponym and co-meronym sibling semantics are extracted far less often, although they are no less important in ontology engineering. In this paper we describe and evaluate the XTREEM-SG (Xhtml TREE Mining – for Sibling Groups) approach to finding sibling semantics in semi-structured Web documents. XTREEM takes advantage of the added value of mark-up, available in Web content, for grouping text siblings. We show that this grouping is semantically meaningful. The XTREEM-SG approach has the advantage of being domain- and language-independent; it does not rely on background knowledge, NLP software or training. We apply the XTREEM-SG approach and evaluate it against the reference semantics of two gold standard ontologies. We investigate how variations in input, parameters and reference influence the results obtained when structuring a closed vocabulary by sibling relations. Earlier methods that evaluate sibling relations against a gold standard report an F-measure of 14.18%. Our method raises this figure to 21.47%.
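
The idea of using mark-up to group text siblings can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the grouping rule, so the sketch assumes that two text spans are siblings when their enclosing elements share the same parent element. The names `SiblingGrouper` and `sibling_groups` are hypothetical, and the input is assumed to be well-formed XHTML.

```python
from collections import defaultdict
from html.parser import HTMLParser


class SiblingGrouper(HTMLParser):
    """Collects text spans whose enclosing elements share a parent element.

    Assumption for illustration only: "sibling" is taken to mean
    "same parent element in the mark-up tree".
    """

    def __init__(self):
        super().__init__()
        self.stack = []                  # open elements as (tag, unique id)
        self.counter = 0                 # gives every element instance an id
        self.groups = defaultdict(list)  # parent element -> text spans

    def handle_starttag(self, tag, attrs):
        self.counter += 1
        self.stack.append((tag, self.counter))

    def handle_endtag(self, tag):
        # Close the innermost open element with this tag name.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        text = data.strip()
        # A text span needs an enclosing element and a parent above it.
        if text and len(self.stack) >= 2:
            parent = self.stack[-2]
            self.groups[parent].append(text)


def sibling_groups(markup):
    """Return candidate sibling groups (two or more text spans each)."""
    parser = SiblingGrouper()
    parser.feed(markup)
    parser.close()
    return [group for group in parser.groups.values() if len(group) > 1]


page = (
    "<html><body>"
    "<ul><li>Hotel</li><li>Hostel</li><li>Camping</li></ul>"
    "<p>Intro</p>"
    "</body></html>"
)
print(sibling_groups(page))
```

On the toy page, the three list items end up in one candidate group because their `li` elements share the same `ul` parent, while the lone paragraph text is discarded. The sketch needs no dictionary, tagger or training data, which is in the spirit of the independence from background knowledge and NLP software claimed in the abstract.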