Analyzing complex scientific data such as graphs and images often requires comparing features: regions on graphs, visual aspects of images, and related metadata, with some features being more important than others. Similarity is typically defined as a distance between data objects, which can be expressed in terms of distances between their features. We refer to the distance based on each feature as a component. Weights of components, representing the relative importance of features, can be learned using distance function learning algorithms. However, it is seldom known which components optimize learning, given criteria such as accuracy, efficiency and simplicity. This is the problem we address. We propose and theoretically compare four component selection approaches: Maximal Path Traversal, Minimal Path Traversal, Maximal Path Traversal with Pruning and Minimal Path Traversal with Pruning. Experimental evaluation is conducted using real data from Materials Science, Nanotechnology and Bioinformatics. A trademarked software tool has been developed as a highlight of this work.
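To make the idea of components and weights concrete, the following is a minimal sketch of a distance built from per-feature components. It assumes the components are combined as a weighted sum, which the abstract does not state. The function `weighted_distance`, the two example components, and the feature names `peak` and `slope` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence

# A component is a distance defined on a single feature of two data objects.
# Representing objects as dicts of numeric features is an assumption of this sketch.
Component = Callable[[dict, dict], float]


def weighted_distance(
    a: dict,
    b: dict,
    components: Sequence[Component],
    weights: Sequence[float],
) -> float:
    """Combine component distances as a weighted sum (assumed form)."""
    if len(components) != len(weights):
        raise ValueError("each component needs exactly one weight")
    return sum(w * c(a, b) for w, c in zip(weights, components))


# Hypothetical components for graph-like data.
def peak_distance(a: dict, b: dict) -> float:
    return abs(a["peak"] - b["peak"])


def slope_distance(a: dict, b: dict) -> float:
    return abs(a["slope"] - b["slope"])


g1 = {"peak": 10.0, "slope": 2.0}
g2 = {"peak": 7.0, "slope": 2.5}

# The weights stand in for values a distance function learning algorithm would produce.
d = weighted_distance(g1, g2, [peak_distance, slope_distance], [2.0, 4.0])
print(d)
```

Under this assumed form, component selection amounts to deciding which entries belong in `components` before the weights are learned.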