Discovering Relations Among Entities from XML Documents

Authors:
Yangyang Wu;Qing Lei;Wei Luo;Haruo Yokota
Affiliations:
Department of Computer Science, Huaqiao University, Quanzhou Fujian, China;Department of Computer Science, Huaqiao University, Quanzhou Fujian, China;Department of Computer Science, Huaqiao University, Quanzhou Fujian, China;Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan
Venue:
MLDM '07 Proceedings of the 5th international conference on Machine Learning and Data Mining in Pattern Recognition
Year:
2007

Abstract

This paper addresses the problem of relation information extraction and proposes a method for discovering relations among entities that are buried in different nested structures of XML documents. The method first identifies and collects XML fragments that contain all of the entity types specified by the user. It then computes the similarity between fragments based on the semantics of their tags and on their structures, and clusters the fragments by similarity so that fragments containing the same relation are grouped together. Finally, it extracts relation instances and the patterns of their occurrence from each cluster. Experimental results show that the method can correctly identify and extract relation information among the given entity types from XML documents of all kinds, provided that their tags are meaningful.
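
The pipeline outlined in the abstract (collect fragments, compare them, cluster them, extract instances) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the sample document, the greedy clustering, and the 0.5 threshold are all assumptions made for illustration. In particular, the similarity below is a plain Jaccard overlap of tag paths, which reflects structure and exact tag names only; the tag-semantics component mentioned in the abstract is not modelled.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
DOC = """
<library>
  <book><title>A</title><author><name>X</name></author></book>
  <book><title>B</title><author><name>Y</name></author></book>
  <person><name>Z</name><works><title>C</title></works></person>
</library>
"""

def collect_fragments(root, entity_tags):
    """Return the smallest subtrees that contain every tag in entity_tags."""
    wanted = set(entity_tags)
    fragments = []

    def visit(node):
        tags = {node.tag}
        child_covers = False
        for child in node:
            child_tags, covers = visit(child)
            tags |= child_tags
            child_covers = child_covers or covers
        covers = wanted <= tags
        # Keep a node only if no child already covers all entity types.
        if covers and not child_covers:
            fragments.append(node)
        return tags, covers

    visit(root)
    return fragments

def tag_paths(node, prefix=""):
    """Set of tag paths from the fragment root down to every node."""
    path = f"{prefix}/{node.tag}"
    paths = {path}
    for child in node:
        paths |= tag_paths(child, path)
    return paths

def similarity(a, b):
    """Jaccard overlap of tag paths (a stand-in, not the paper's measure)."""
    pa, pb = tag_paths(a), tag_paths(b)
    return len(pa & pb) / len(pa | pb)

def cluster(fragments, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for frag in fragments:
        for group in clusters:
            if similarity(frag, group[0]) >= threshold:
                group.append(frag)
                break
        else:
            clusters.append([frag])
    return clusters

def extract_instances(group, entity_tags):
    """One tuple per fragment: the text of the first element of each entity tag."""
    return [tuple(next(frag.iter(tag)).text for tag in entity_tags)
            for frag in group]

root = ET.fromstring(DOC)
fragments = collect_fragments(root, ["title", "name"])
for group in cluster(fragments):
    print(group[0].tag, extract_instances(group, ["title", "name"]))
```

With this stand-in measure the two `book` fragments fall into one cluster and the `person` fragment into another, because their tag paths do not overlap. A measure that also compares tag meanings, as the abstract describes, could group fragments that express the same relation under different tag names or nesting.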