The Naive Bayes classifier has long been used for text categorization tasks. Its sibling from the unsupervised world, the probabilistic mixture of multinomial models, has likewise been successfully applied to text clustering problems. Despite the strong independence assumptions these models make, their attractiveness comes from low computational cost, relatively low memory consumption, the ability to handle heterogeneous features and multiple classes, and performance that is often competitive with top-of-the-line models. Recently, there have been several attempts to alleviate the problems of Naive Bayes by performing heuristic feature transformations, such as IDF weighting, normalization by document length, and taking the logarithms of term counts. We justify the use of these techniques and apply them to two problems: classification of products in Yahoo! Shopping and clustering of the vectors of collocated terms in user queries to Yahoo! Search. The experimental evaluation allows us to draw conclusions about the promise these transformations carry for alleviating the strong assumptions of the multinomial model.
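As a minimal sketch of the kind of heuristic transformations the abstract refers to (log of counts, IDF weighting, and document-length normalization applied to a document-term count matrix before multinomial modeling), the following Python snippet illustrates one common variant. The function name and the exact smoothing constants are illustrative assumptions, not the paper's specification.

```python
import numpy as np

def transform_counts(X):
    """Illustrative feature transformation for a document-term count
    matrix X (rows = documents, columns = terms): log of counts,
    smoothed IDF weighting, and L2 length normalization.
    This is a sketch of a common variant, not the paper's exact recipe."""
    # Log-transform raw term counts to damp the effect of repeated terms.
    tf = np.log1p(X)
    # Smoothed inverse document frequency for each term.
    df = (X > 0).sum(axis=0)
    idf = np.log((1.0 + X.shape[0]) / (1.0 + df)) + 1.0
    weighted = tf * idf
    # Normalize each document vector to unit L2 length.
    norms = np.linalg.norm(weighted, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0
    return weighted / norms

# Example: three toy documents over a four-term vocabulary.
X = np.array([[3, 0, 1, 0],
              [0, 2, 0, 1],
              [1, 1, 0, 0]], dtype=float)
print(transform_counts(X))
```

The transformed vectors can then be fed to a multinomial Naive Bayes learner or a multinomial mixture model in place of raw counts; the specific combination of transformations evaluated in the paper is described in its experimental sections.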