Choosing a suitable feature representation for structured data is a non-trivial task because the number of potential candidates is vast. Ideally, one would like to pick a small but informative set of structural features, each providing complementary information about the instances. We frame the search for such a feature set as a combinatorial optimization problem. For this purpose, we define a scoring function that favors features that are as dissimilar as possible from all other features. The score is used in a stochastic local search (SLS) procedure to maximize the diversity of a feature set. In experiments on small-molecule data, we investigate the effectiveness of a forward selection approach with two different linear classification schemes.
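The idea of maximizing feature-set diversity with stochastic local search can be illustrated with a minimal sketch. It is not the scoring function or search procedure used in this work. It assumes that each candidate feature is represented by the set of instances in which it occurs, that dissimilarity is the Jaccard distance between those sets, and that the score of a feature set is its mean pairwise distance. The names `jaccard_distance`, `diversity`, and `sls_select`, as well as the `steps` and `noise` parameters, are hypothetical.

```python
import random
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two sets of instance ids."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def diversity(selected, occurrences):
    """Mean pairwise Jaccard distance among the selected features (assumed score)."""
    pairs = list(combinations(selected, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard_distance(occurrences[i], occurrences[j]) for i, j in pairs)
    return total / len(pairs)


def sls_select(occurrences, k, steps=1000, noise=0.05, seed=0):
    """Swap-based stochastic local search for a diverse set of k features.

    Each step replaces one selected feature with an unselected one. The swap
    is accepted if it improves the score or, with probability `noise`,
    regardless of the score, which lets the search leave local optima.
    """
    rng = random.Random(seed)
    n = len(occurrences)
    current = rng.sample(range(n), k)
    current_score = diversity(current, occurrences)
    best, best_score = list(current), current_score

    for _ in range(steps):
        outside = [f for f in range(n) if f not in current]
        if not outside:
            break  # every feature is already selected
        candidate = list(current)
        candidate[rng.randrange(k)] = rng.choice(outside)
        candidate_score = diversity(candidate, occurrences)
        if candidate_score > current_score or rng.random() < noise:
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score

    return sorted(best), best_score


# Toy data: feature index -> set of instance ids in which the feature occurs.
occurrences = [{0, 1, 2}, {0, 1, 2}, {3, 4}, {0, 1}, {5}]
selected, score = sls_select(occurrences, k=3)
print(selected, score)
```

In the toy data, features 0, 1, and 3 overlap heavily, so a diverse set of three features can contain at most one of them alongside features 2 and 4. Recomputing the full score at every step is quadratic in `k`; an incremental update would be preferable for larger feature sets.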