Most text mining methods represent documents using a vector space model, commonly known as the bag-of-words model, in which each document is modeled as a linear vector of occurrences of independent words in the text corpus. It is well known that this vector-based representation loses important information, such as the semantic relationships among concepts. This paper proposes a novel text representation model called the ConceptLink graph. The ConceptLink graph not only represents the content of a document but also captures some of its underlying semantic structure in terms of the relationships among concepts. The ConceptLink graph is constructed in two main stages. First, we find a set of concepts by clustering conceptually related terms using the self-organizing map method. Second, by mapping each document's content to concepts, we generate a graph of concepts based on concept occurrences using a singular value decomposition technique. The ConceptLink graph overcomes the keyword-independence limitation of the vector space model and takes advantage of the implicit concept relationships exhibited in all natural language texts. As an information-rich text representation model, the ConceptLink graph will advance text mining technology beyond feature-based toward structure-based knowledge discovery. We illustrate the ConceptLink graph method using samples generated from a benchmark text mining dataset.
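The two-stage construction described above can be sketched in code. This is a minimal illustrative sketch under stated assumptions, not the authors' implementation: the toy corpus, the choice of three concept units, the one-dimensional self-organizing map, and the rank-2 SVD reconstruction of the concept links are all assumptions made for demonstration.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only)
docs = [
    "graph neural network model learning",
    "text categorization machine learning",
    "graph model network structure",
    "text retrieval information categorization",
]
vocab = sorted({w for d in docs for w in d.split()})
w2i = {w: i for i, w in enumerate(vocab)}

# Term-document matrix: each term is represented by its document-occurrence vector
td = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        td[w2i[w], j] += 1.0

# Stage 1: cluster conceptually related terms with a minimal 1-D self-organizing map
rng = np.random.default_rng(0)
n_units = 3                                   # number of concepts (assumption)
W = rng.random((n_units, len(docs)))          # one prototype vector per concept unit
n_iter = 200
for t in range(n_iter):
    lr = 0.5 * (1 - t / n_iter)               # decaying learning rate
    sigma = max(1.0 * (1 - t / n_iter), 0.1)  # shrinking neighborhood radius
    x = td[rng.integers(len(vocab))]          # random term vector
    bmu = int(np.argmin(((W - x) ** 2).sum(axis=1)))   # best-matching unit
    dist = np.abs(np.arange(n_units) - bmu)
    h = np.exp(-dist ** 2 / (2 * sigma ** 2))          # neighborhood function
    W += lr * h[:, None] * (x - W)            # pull units toward the sample

# Assign each term to its nearest concept unit
concept_of = {w: int(np.argmin(((W - td[w2i[w]]) ** 2).sum(axis=1))) for w in vocab}

# Stage 2: map documents to concepts, then apply SVD to the concept-document matrix
cd = np.zeros((n_units, len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        cd[concept_of[w], j] += 1.0
U, s, Vt = np.linalg.svd(cd, full_matrices=False)

# Concept-concept link weights from a rank-k reconstruction (k is an assumption);
# the symmetric matrix `link` gives edge weights of the concept graph
k = 2
link = (U[:, :k] * s[:k]) @ U[:, :k].T
```

The resulting `link` matrix is symmetric, so it can be read directly as a weighted undirected graph over the discovered concepts; a real pipeline would use far more documents and a two-dimensional SOM grid.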