A naive Bayes classifier was used to analyze gene behavior from text data and was submitted as an entry to the 2002 KDD Cup, a data mining exercise to predict the behavior of the yeast S. cerevisiae. The solution was based on the multinomial event model for text classification (McCallum & Nigam 1998), with a feature selection mechanism added. Despite the simplicity of this model, its performance was close to that of the best entries in the competition, which used more sophisticated techniques. Two factors appear to have contributed to this success: a seemingly minor effort to use prior knowledge to conflate the gene classes, and the previously described effectiveness of the naive Bayes method.
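The approach described above can be illustrated with a minimal sketch of a multinomial naive Bayes text classifier preceded by feature selection and class conflation. This is not the code of the competition entry. The abstract does not say which feature selection mechanism was used, so mutual information between word presence and class is assumed here. The class names (`up`, `down`, `none`), the `CONFLATE` mapping, the toy documents, and all function names are hypothetical and serve only to show the mechanics.

```python
import math
from collections import Counter, defaultdict

# Hypothetical fine-grained labels merged into coarser ones (class conflation).
CONFLATE = {"up": "affected", "down": "affected", "none": "unaffected"}


def select_features(docs, labels, k):
    """Keep the k words with the highest mutual information between
    word presence in a document and the document's class (assumed criterion)."""
    n = len(docs)
    class_n = Counter(labels)
    df = defaultdict(Counter)  # df[word][label] = number of documents containing word
    for doc, y in zip(docs, labels):
        for w in set(doc):
            df[w][y] += 1

    def mi(w):
        n_w = sum(df[w].values())
        total = 0.0
        for y, n_y in class_n.items():
            # (joint count, marginal count) for word present, then word absent
            for joint, marg in ((df[w][y], n_w), (n_y - df[w][y], n - n_w)):
                if joint > 0:
                    total += (joint / n) * math.log(joint * n / (marg * n_y))
        return total

    # Ties are broken alphabetically so the result is deterministic.
    return set(sorted(df, key=lambda w: (-mi(w), w))[:k])


def train_multinomial_nb(docs, labels, vocab, alpha=1.0):
    """Fit class priors and add-alpha smoothed word likelihoods over vocab."""
    class_n = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        word_counts[y].update(w for w in doc if w in vocab)
    model = {"vocab": vocab, "log_prior": {}, "log_lik": {}}
    for y, n_y in class_n.items():
        model["log_prior"][y] = math.log(n_y / len(docs))
        total = sum(word_counts[y].values()) + alpha * len(vocab)
        model["log_lik"][y] = {
            w: math.log((word_counts[y][w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, doc):
    """Return the class with the highest log posterior; unseen words are ignored."""
    scores = {}
    for y, log_prior in model["log_prior"].items():
        scores[y] = log_prior + sum(
            model["log_lik"][y][w] for w in doc if w in model["vocab"]
        )
    return max(scores, key=scores.get)


# Toy tokenized documents with hypothetical fine-grained labels.
docs = [
    ["kinase", "signal", "membrane"],
    ["kinase", "receptor", "signal"],
    ["ribosome", "protein", "translation"],
    ["ribosome", "translation", "rna"],
]
fine_labels = ["up", "down", "none", "none"]
labels = [CONFLATE[y] for y in fine_labels]

vocab = select_features(docs, labels, k=4)
model = train_multinomial_nb(docs, labels, vocab)
print(predict(model, ["kinase", "signal", "unknown"]))
```

In this sketch, conflating the classes before training means the two rarer fine-grained labels pool their word counts, which is one plausible way that prior knowledge about the classes could help a count-based model on sparse text data.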