Learning of semantic sibling group hierarchies - K-means vs. bi-secting-K-means

Authors:
Marko Brunzel
Affiliations:
DFKI GmbH - German Research Center for Artificial Intelligence and Otto-von-Guericke Universität Magdeburg, Germany
Venue:
DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
Year:
2007

Citing 9
Cited 3

Exploiting Structure for Intelligent Web Search

HICSS '01 Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34) - Volume 4
Term Weighting Approaches in Automatic Text Retrieval

Information Processing and Management
Automatic acquisition of hyponyms from large text corpora

COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Learning by googling

ACM SIGKDD Explorations Newsletter
Discovering semantic sibling associations from web documents with XTREEM-SP

DaWaK'06 Proceedings of the 8th international conference on Data Warehousing and Knowledge Discovery
RELFIN – topic discovery for ontology enhancement and annotation

ESWC'05 Proceedings of the Second European conference on The Semantic Web: research and Applications
Discovering semantic sibling groups from web documents with XTREEM-SG

EKAW'06 Proceedings of the 15th international conference on Managing Knowledge in a World of Networks
Proceedings of the First international conference on Knowledge Discovery from XML Documents

KDXD'06 Proceedings of the First international conference on Knowledge Discovery from XML Documents
Discovering multi terms and co-hyponymy from XHTML documents with XTREEM

KDXD'06 Proceedings of the First international conference on Knowledge Discovery from XML Documents

Discovering Groups of Sibling Terms from Web Documents with XTREEM-SG

Journal on Data Semantics XI
The XTREEM Methods for Ontology Learning from Web Documents

Proceedings of the 2008 conference on Ontology Learning and Population: Bridging the Gap between Text and Knowledge
A hybrid approach for learning concept hierarchy from Malay text using artificial immune network

Natural Computing: an international journal

Abstract

The discovery of semantically associated groups of terms is important for many applications of text understanding, including document vectorization for text mining, semi-automated ontology extension from documents, and ontology engineering with the help of domain-specific texts. In [3], we proposed a method for the discovery of such terms and showed that its performance is superior to other methods for the same task. However, we have observed that (a) the approach is sensitive to the term clustering method and (b) the performance improves with the size of the result list, thus incurring higher human overhead in the postprocessing phase. In this study, we address these issues by proposing the delivery of a hierarchically organized output, computed with Bisecting K-Means. We compared the results of the new algorithm with those delivered by the original method, which used K-Means, using two ontologies as gold standards.
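
To make the clustering step named in the abstract concrete, the following is a minimal sketch of Bisecting K-Means in plain Python. It is not the paper's implementation: the toy term vectors, the function names (`two_means`, `bisecting_kmeans`, `leaves_of`), the squared Euclidean distance, and the rule of always splitting the leaf with the largest within-cluster error are all assumptions made for this illustration.

```python
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def sse(points):
    """Sum of squared distances of the points to their own mean."""
    c = mean(points)
    return sum(dist2(p, c) for p in points)


def two_means(points, rng, iters=20):
    """Plain K-Means with k=2; returns two lists of indices into points."""
    centers = rng.sample(points, 2)  # two randomly chosen points as seeds
    for _ in range(iters):
        groups = ([], [])
        for i, p in enumerate(points):
            nearer_second = dist2(p, centers[0]) > dist2(p, centers[1])
            groups[1 if nearer_second else 0].append(i)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. duplicate vectors
        centers = [mean([points[i] for i in g]) for g in groups]
    return groups


def bisecting_kmeans(vectors, k, seed=0):
    """Split terms top-down until there are k leaves; returns a tree of dicts.

    vectors: dict mapping a term to its feature vector (hypothetical input).
    """
    rng = random.Random(seed)
    root = {"terms": sorted(vectors), "children": []}
    leaves = [root]
    while len(leaves) < k:
        candidates = [n for n in leaves if len(n["terms"]) > 1]
        if not candidates:
            break
        # Assumed selection rule: split the leaf with the largest error.
        node = max(candidates,
                   key=lambda n: sse([vectors[t] for t in n["terms"]]))
        left, right = two_means([vectors[t] for t in node["terms"]], rng)
        if not left or not right:
            break
        node["children"] = [
            {"terms": [node["terms"][i] for i in g], "children": []}
            for g in (left, right)
        ]
        leaves = [n for n in leaves if n is not node] + node["children"]
    return root


def leaves_of(node):
    """Collect the term groups stored at the leaves of the tree."""
    if not node["children"]:
        return [node["terms"]]
    return [leaf for child in node["children"] for leaf in leaves_of(child)]


# Toy data (made up for this example): three loose groups of terms.
toy_vectors = {
    "hotel": [9.0, 1.0], "hostel": [8.0, 0.5], "motel": [9.5, 0.0],
    "bus": [0.5, 9.0], "train": [1.0, 8.0], "tram": [0.0, 9.5],
    "visa": [0.0, 0.5], "passport": [1.0, 0.0],
}
tree = bisecting_kmeans(toy_vectors, k=3)
for group in leaves_of(tree):
    print(group)
```

Unlike a flat K-Means result, the returned tree keeps every intermediate split, so a reviewer can inspect coarse groups near the root first and descend only where finer sibling groups are needed.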