Two Odds-Radio-Based Text Classification Algorithms

Authors:
Zhi-Hong Deng;Shi-Wei Tang;Dong-Qing Yang;Ming Zhang;Xiao-Bin Wu;Meng Yang
Venue:
WISEW '02 Proceedings of the Third International Conference on Web Information Systems Engineering (Workshops) - (WISEw'02)
Year:
2002

Citing 0
Cited 3

Document-Base Extraction for Single-Label Text Classification

DaWaK '08 Proceedings of the 10th international conference on Data Warehousing and Knowledge Discovery
A Hybrid Statistical Data Pre-processing Approach for Language-Independent Text Classification

ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
Hybrid DIAAF/RS: statistical textual feature selection for language-independent text classification

ICDM'10 Proceedings of the 10th industrial conference on Advances in data mining: applications and theoretical aspects

Abstract

Since the 1990s, the exponential growth of Web documents has led to a great deal of interest in developing efficient tools and software to assist users in finding relevant information. Text classification has proved useful in helping to organize and search text information on the Web. Although a number of text classification algorithms exist, most of them are either inefficient or too complex. In this paper we present two odds-ratio-based text classification algorithms, called OR and TF*OR respectively. We have evaluated our algorithms on two text collections and compared them against k-NN and SVM. Experimental results show that OR and TF*OR are competitive with k-NN and SVM. Furthermore, OR and TF*OR are much simpler and faster than them. The results also indicate that it is not TF but the relevance factors derived from the odds ratio that play the decisive role in document categorization.
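
The abstract does not give the formulas behind OR and TF*OR, so the following is only an illustrative sketch and not the authors' algorithm. It assumes the common textbook form of the log odds ratio for a term, estimated from smoothed document frequencies, and scores a document either by summing the weights of its distinct terms or by multiplying each weight by the term frequency first. The function names, the `smoothing` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def odds_ratio_weights(docs, labels, target, smoothing=0.5):
    """Log odds ratio of every term for class `target`.

    Assumption: standard form log(p(1-q) / ((1-p)q)), where p and q are
    smoothed document-frequency estimates of P(term | target) and
    P(term | not target). The paper's exact definition may differ.
    """
    pos = [set(d) for d, y in zip(docs, labels) if y == target]
    neg = [set(d) for d, y in zip(docs, labels) if y != target]
    vocab = set().union(*pos, *neg)
    weights = {}
    for term in vocab:
        p = (sum(term in d for d in pos) + smoothing) / (len(pos) + 2 * smoothing)
        q = (sum(term in d for d in neg) + smoothing) / (len(neg) + 2 * smoothing)
        weights[term] = math.log(p * (1 - q) / ((1 - p) * q))
    return weights


def score(doc, weights, use_tf=False):
    """Sum of term weights; with use_tf=True each weight is scaled by term count."""
    counts = Counter(doc)
    return sum((c if use_tf else 1) * weights.get(t, 0.0) for t, c in counts.items())


def classify(doc, docs, labels, use_tf=False):
    """Assign the class whose odds-ratio weights give the highest score."""
    classes = sorted(set(labels))
    return max(
        classes,
        key=lambda c: score(doc, odds_ratio_weights(docs, labels, c), use_tf),
    )


# Hypothetical toy corpus: tokenized documents and their single labels.
train_docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "stock"],
]
train_labels = ["sport", "sport", "finance", "finance"]

new_doc = ["goal", "goal", "team"]
print(classify(new_doc, train_docs, train_labels))               # weights only
print(classify(new_doc, train_docs, train_labels, use_tf=True))  # TF times weights
```

In this sketch the only difference between the two scorers is whether repeated terms count more than once, which makes it easy to check on a given collection how much term frequency adds beyond the odds-ratio weights themselves.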