In traditional text categorization systems based on the Vector Space Model, a feature vector usually carries no context information, which limits system performance. To make use of more information, it is natural to select bi-gram features in addition to unigram features. However, the longer the features are, the more important the feature selection algorithm becomes for achieving a good balance in the feature space. This paper proposes two feature extraction methods that achieve a better feature balance for document categorization. Experiments show that the extended bi-gram features greatly improve system performance.
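As a rough illustration of adding bi-gram features to unigram features in a Vector Space Model, the following minimal sketch builds a combined vocabulary and count vectors. It is not the paper's method: the abstract does not describe the two proposed extraction methods, so a simple document-frequency threshold stands in for feature selection here. The names `tokenize`, `ngram_features`, `select_by_document_frequency`, `vectorize`, and the `min_df` parameter are hypothetical, as are the toy documents.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngram_features(tokens: list[str]) -> list[str]:
    """Return all unigrams followed by all adjacent-word bi-grams."""
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return list(tokens) + bigrams


def select_by_document_frequency(docs: list[str], min_df: int = 2) -> list[str]:
    """Keep features that occur in at least min_df documents.

    Assumption: document frequency is only a stand-in for the
    (unspecified) selection methods the paper proposes.
    """
    df = Counter()
    for doc in docs:
        # Use a set so each feature counts once per document.
        df.update(set(ngram_features(tokenize(doc))))
    return sorted(f for f, n in df.items() if n >= min_df)


def vectorize(doc: str, vocabulary: list[str]) -> list[int]:
    """Map a document to raw feature counts over the selected vocabulary."""
    counts = Counter(ngram_features(tokenize(doc)))
    return [counts[f] for f in vocabulary]


# Toy corpus, invented for illustration only.
docs = [
    "machine learning improves text categorization",
    "text categorization with machine learning",
    "bigram features add context",
]
vocabulary = select_by_document_frequency(docs, min_df=2)
vectors = [vectorize(d, vocabulary) for d in docs]
```

In this toy corpus the bi-grams `machine_learning` and `text_categorization` survive selection alongside their component unigrams, while rare bi-grams such as `learning_improves` are dropped. This shows why selection matters more as features grow longer: most candidate bi-grams are too sparse to be useful.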