BorderFlow: A Local Graph Clustering Algorithm for Natural Language Processing

Authors:
Axel-Cyrille Ngonga Ngomo; Frank Schumacher
Affiliations:
Department of Business Information Systems, University of Leipzig, Leipzig, Germany D-04103 (both authors)
Venue:
CICLing '09: Proceedings of the 10th International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2009

Abstract

In this paper, we introduce BorderFlow, a novel local graph clustering algorithm, and apply it to natural language processing problems. We first present a formal description of the algorithm. We then use BorderFlow to cluster large graphs and to extract concepts from word similarity graphs. The large graphs are extracted from the Wikipedia Category Graph. The subsequent low-bias extraction of concepts is carried out on two data sets consisting of noisy and clean data. We show that BorderFlow efficiently computes clusters of high quality and purity. BorderFlow can therefore be integrated into several other natural language processing applications.