Most text mining methods represent documents using a vector space model, commonly known as the bag-of-words model, in which each document is modeled as a linear vector of occurrences of independent words in the text corpus. It is well known that this vector-based representation loses important information, such as the semantic relationships among concepts. This paper proposes a novel text representation model called the ConceptLink graph. The ConceptLink graph not only represents the content of a document but also captures some of its underlying semantic structure in terms of the relationships among concepts. The ConceptLink graph is constructed in two main stages. First, we find a set of concepts by clustering conceptually related terms using the self-organizing map method. Second, by mapping each document's content to concepts, we generate a graph of concepts based on concept occurrences using a singular value decomposition technique. The ConceptLink graph overcomes the keyword-independence limitation of the vector space model and takes advantage of the implicit concept relationships exhibited in all natural language texts. As an information-rich text representation model, the ConceptLink graph will advance text mining technology beyond feature-based toward structure-based knowledge discovery. We illustrate the ConceptLink graph method using samples generated from a benchmark text mining dataset.
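The two-stage construction described above can be sketched in code. This is a minimal illustrative sketch under stated assumptions, not the authors' implementation: the toy corpus, the choice of three concept units, the one-dimensional self-organizing map, and the rank-2 SVD reconstruction of the concept links are all assumptions made for demonstration.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only)
docs = [
    "graph neural network model learning",
    "text categorization machine learning",
    "graph model network structure",
    "text retrieval information categorization",
]
vocab = sorted({w for d in docs for w in d.split()})
w2i = {w: i for i, w in enumerate(vocab)}

# Term-document matrix: each term is represented by its document-occurrence vector
td = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        td[w2i[w], j] += 1.0

# Stage 1: cluster conceptually related terms with a minimal 1-D self-organizing map
rng = np.random.default_rng(0)
n_units = 3                                   # number of concepts (assumption)
W = rng.random((n_units, len(docs)))          # one prototype vector per concept unit
n_iter = 200
for t in range(n_iter):
    lr = 0.5 * (1 - t / n_iter)               # decaying learning rate
    sigma = max(1.0 * (1 - t / n_iter), 0.1)  # shrinking neighborhood radius
    x = td[rng.integers(len(vocab))]          # random term vector
    bmu = int(np.argmin(((W - x) ** 2).sum(axis=1)))   # best-matching unit
    dist = np.abs(np.arange(n_units) - bmu)
    h = np.exp(-dist ** 2 / (2 * sigma ** 2))          # neighborhood function
    W += lr * h[:, None] * (x - W)            # pull units toward the sample

# Assign each term to its nearest concept unit
concept_of = {w: int(np.argmin(((W - td[w2i[w]]) ** 2).sum(axis=1))) for w in vocab}

# Stage 2: map documents to concepts, then apply SVD to the concept-document matrix
cd = np.zeros((n_units, len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        cd[concept_of[w], j] += 1.0
U, s, Vt = np.linalg.svd(cd, full_matrices=False)

# Concept-concept link weights from a rank-k reconstruction (k is an assumption);
# the symmetric matrix `link` gives edge weights of the concept graph
k = 2
link = (U[:, :k] * s[:k]) @ U[:, :k].T
```

The resulting `link` matrix is symmetric, so it can be read directly as a weighted undirected graph over the discovered concepts; a real pipeline would use far more documents and a two-dimensional SOM grid.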