Open Information Extraction (Open IE) has gained increasing research interest in recent years. The first step in Open IE is to extract raw subject-predicate-object triples from the data. These raw triples are rarely usable per se and need additional post-processing. To that end, we propose the use of Boolean Tucker tensor decomposition to simultaneously find the entity and relation synonyms and the facts connecting them from the raw triples. Our method represents the synonym sets and facts using (sparse) binary matrices and tensors that can be efficiently stored and manipulated. We consider framing the problem as a Boolean tensor decomposition to be one of this paper's main contributions. To study the validity of this approach, we use a recent algorithm for scalable Boolean Tucker decomposition. We validate the results with an empirical evaluation on a new semi-synthetic data set, generated to faithfully reproduce real-world data features, as well as with real-world data from an existing Open IE extractor. We show that our method obtains high precision, while the low recall can easily be remedied by considering the original data together with the decomposition.
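To make the representation concrete, the following is a minimal sketch of how a Boolean Tucker decomposition encodes synonyms and facts. It is not the scalable algorithm used in the paper. It only reconstructs a triple tensor from factors that are already given, and all names and toy data (`boolean_tucker_reconstruct`, `A`, `B`, `C`, `core`, the example surface forms) are assumptions made for illustration. Here the binary factor matrices assign surface forms to synonym sets, the binary core tensor stores the facts between those sets, and the final step ORs the reconstruction with the raw triples, which is one plausible reading of "considering the original data together with the decomposition".

```python
import numpy as np


def boolean_tucker_reconstruct(core, A, B, C):
    """Boolean Tucker product: X[i,j,k] = OR_{p,q,r} core[p,q,r] AND A[i,p] AND B[j,q] AND C[k,r].

    Computed by counting matching (p, q, r) combinations with integer
    arithmetic and thresholding at zero, which is equivalent to the Boolean OR.
    """
    counts = np.einsum(
        "pqr,ip,jq,kr->ijk",
        core.astype(np.int64),
        A.astype(np.int64),
        B.astype(np.int64),
        C.astype(np.int64),
    )
    return counts > 0


# Hypothetical toy data (not from the paper).
# Subject surface forms: 0 "Einstein", 1 "A. Einstein", 2 "Curie"
# Predicate surface forms: 0 "was born in", 1 "is a native of"
# Object surface forms: 0 "Ulm", 1 "Warsaw"

# A: subject surface forms x subject entities (synonym sets)
A = np.array([[1, 0], [1, 0], [0, 1]], dtype=bool)
# B: predicate surface forms x relations (both phrases map to one relation)
B = np.array([[1], [1]], dtype=bool)
# C: object surface forms x object entities
C = np.array([[1, 0], [0, 1]], dtype=bool)

# core: subject entities x relations x object entities (the facts)
core = np.zeros((2, 1, 2), dtype=bool)
core[0, 0, 0] = True  # (Einstein entity, bornIn, Ulm)
core[1, 0, 1] = True  # (Curie entity, bornIn, Warsaw)

reconstruction = boolean_tucker_reconstruct(core, A, B, C)

# Raw extracted triples as a binary tensor, including one triple
# that the decomposition does not cover.
raw = np.zeros((3, 2, 2), dtype=bool)
raw[0, 0, 0] = True  # "Einstein" "was born in" "Ulm"
raw[1, 1, 0] = True  # "A. Einstein" "is a native of" "Ulm"
raw[2, 0, 1] = True  # "Curie" "was born in" "Warsaw"
raw[2, 1, 0] = True  # an uncovered triple

# Combine the original data with the decomposition via element-wise OR.
combined = reconstruction | raw
```

In this toy setting the reconstruction contains triples that were never observed directly, such as "A. Einstein" "was born in" "Ulm", because the surface forms share synonym sets. The OR with `raw` keeps triples the factors miss.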