A Hierarchical Neural Network Document Classifier with Linguistic Feature Selection

Authors:
Chih-Ming Chen;Hahn-Ming Lee;Cheng-Wei Hwang
Affiliations:
Graduate Institute of Learning Technology, National Hualien University of Education, Hualien, Republic of China 970;Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Republic of China 106;Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Republic of China 106
Venue:
Applied Intelligence
Year:
2005

Abstract

This article presents a neural network document classifier with linguistic feature selection and multi-category output. It consists of a feature selection unit and a hierarchical neural network classification unit. In the feature selection unit, candidate terms are extracted from the original documents by text processing techniques, and the conformity and uniformity of each term are then analyzed with an entropy function that measures term significance. Terms with high significance are selected as input features for training the neural network document classifiers. To reduce the input dimensions, a fuzzy relation composition mechanism is employed to identify synonyms, from which a synonym thesaurus is constructed. To simplify the learning scheme, the well-known back-propagation learning model is used to build the hierarchical classification units. The experiments use a product description database from an electronic commerce company. The results show that the classifier is accurate enough to assist human classification and can save considerable manpower and working time when classifying a large database.
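
The abstract names two mechanisms without giving their formulas: an entropy function for term significance and a composition of fuzzy relations for finding synonyms. The sketch below is only an illustration under stated assumptions, not the authors' method. It assumes normalized Shannon entropy over per-category term counts as the significance measure, and the standard max-min composition of a term-document membership relation with its transpose as the term-term similarity. The function names, the toy data, and the 0.7 threshold are all hypothetical.

```python
import math


def category_entropy(counts):
    """Normalized Shannon entropy of a term's counts over categories.

    Returns 0.0 when the term occurs in a single category and 1.0 when it
    is spread evenly. This is an assumed measure; the abstract does not
    specify the entropy function actually used.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h / math.log(len(counts))


def significance(counts):
    """Assumed significance score: high when the term is category-specific."""
    return 1.0 - category_entropy(counts)


def max_min_composition(r, s):
    """Standard max-min composition of two fuzzy relations (nested lists)."""
    return [
        [max(min(r[i][k], s[k][j]) for k in range(len(s)))
         for j in range(len(s[0]))]
        for i in range(len(r))
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


# Hypothetical term-document membership degrees (rows: terms, cols: documents).
term_doc = [
    [0.9, 0.1],  # term 0
    [0.8, 0.2],  # term 1
    [0.0, 1.0],  # term 2
]

# Term-term relation: terms that co-occur strongly get a high degree.
term_term = max_min_composition(term_doc, transpose(term_doc))

# Hypothetical threshold for treating two distinct terms as synonym candidates.
THRESHOLD = 0.7
synonym_pairs = [
    (i, j)
    for i in range(len(term_term))
    for j in range(i + 1, len(term_term))
    if term_term[i][j] >= THRESHOLD
]
```

With this toy data, terms 0 and 1 would be merged into one thesaurus entry, so the classifier would see one input feature instead of two. How the original system builds its membership degrees and picks its threshold is not stated in the abstract.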