Two Odds-Radio-Based Text Classification Algorithms

  • Authors:
  • Zhi-Hong Deng;Shi-Wei Tang;Dong-Qing Yang;Ming Zhang;Xiao-Bin Wu;Meng Yang

  • Venue:
  • WISEW '02 Proceedings of the Third International Conference on Web Information Systems Engineering (Workshops) - (WISEw'02)
  • Year:
  • 2002

Abstract

Since the 1990s, the exponential growth of Web documents has led to a great deal of interest in developing efficient tools and software to assist users in finding relevant information. Text classification has proved useful in helping to organize and search text information on the Web. Although a number of text classification algorithms exist, most of them are either inefficient or too complex. In this paper we present two odds-ratio-based text classification algorithms, called OR and TF*OR. We evaluated our algorithms on two text collections and compared them against k-NN and SVM. Experimental results show that OR and TF*OR are competitive with k-NN and SVM. Furthermore, OR and TF*OR are much simpler and faster than both. The results also indicate that it is not term frequency (TF) but the relevance factors derived from the odds ratio that play the decisive role in document categorization.
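
The abstract does not give the scoring formulas, so the following is only a minimal sketch of how an odds-ratio term weight is commonly computed and how it might drive a classifier. It is not the authors' OR or TF*OR algorithm. The function names, the add-one smoothing, and the decision rule (sum the weights of a document's terms, optionally multiplied by term frequency, and pick the highest-scoring category) are all assumptions made for illustration.

```python
import math
from collections import Counter


def odds_ratio(p_pos, p_neg):
    """Log odds ratio of a term, given the probability that it occurs in
    documents of a category (p_pos) and in all other documents (p_neg)."""
    return math.log((p_pos * (1 - p_neg)) / ((1 - p_pos) * p_neg))


def term_probabilities(docs, vocabulary):
    """Smoothed fraction of documents containing each term.
    Add-one smoothing (an assumption) keeps every value strictly in (0, 1)."""
    n = len(docs)
    df = Counter(t for tokens in docs for t in set(tokens))
    return {t: (df[t] + 1) / (n + 2) for t in vocabulary}


def train(labelled_docs):
    """labelled_docs: list of (tokens, category) pairs.
    Returns {category: {term: odds-ratio weight}}."""
    vocabulary = {t for tokens, _ in labelled_docs for t in tokens}
    categories = {c for _, c in labelled_docs}
    model = {}
    for c in categories:
        pos = [tokens for tokens, cat in labelled_docs if cat == c]
        neg = [tokens for tokens, cat in labelled_docs if cat != c]
        p_pos = term_probabilities(pos, vocabulary)
        p_neg = term_probabilities(neg, vocabulary)
        model[c] = {t: odds_ratio(p_pos[t], p_neg[t]) for t in vocabulary}
    return model


def classify(model, tokens, use_tf=False):
    """Pick the category with the highest summed term weight.
    use_tf=True multiplies each weight by the term's frequency in the
    document (a hypothetical stand-in for a TF-weighted variant)."""
    tf = Counter(tokens)

    def score(weights):
        return sum((tf[t] if use_tf else 1) * weights.get(t, 0.0) for t in tf)

    return max(model, key=lambda c: score(model[c]))
```

In this sketch, `classify(model, tokens)` ignores how often a term occurs and `classify(model, tokens, use_tf=True)` takes it into account, which makes it easy to compare the two scoring styles on the same trained weights.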