We start from an algorithm for on-line linear hierarchical classification of multidimensional data that uses a centroid aggregation criterion. After describing some real-life on-line settings in which it can be used, we analyze it mathematically within the framework of the Lance–Williams algorithms and prove that it lacks some useful properties: it is neither monotonic nor space-conserving. To retain its on-line capabilities, we modify it and show that the modified algorithm is monotonic. Although the new algorithm still lacks the internal similarity-external dissimilarity property, its worst-case classifications can be corrected with a small additional computational effort, for an overall cost of O(n⋅k) time for n points and k classes. Experimental studies confirm the theoretical improvements over the initial algorithm. A theoretical and experimental comparison with other algorithms from the literature shows that it is among the fastest and performs well.
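As a rough illustration of the monotonicity issue, the sketch below applies the textbook Lance–Williams update for the centroid criterion (with squared Euclidean distances) to three hand-picked points. It is not the on-line algorithm analyzed here; the function names and the sample points are assumptions chosen only to show how a later merge can occur at a lower level than an earlier one.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def lance_williams_centroid(d_ik, d_jk, d_ij, n_i, n_j):
    """Squared distance from the merged cluster (i + j) to a cluster k.

    Standard Lance-Williams coefficients for the centroid criterion:
    alpha_i = n_i / n, alpha_j = n_j / n, beta = -n_i * n_j / n**2, gamma = 0.
    """
    n = n_i + n_j
    alpha_i = n_i / n
    alpha_j = n_j / n
    beta = -(n_i * n_j) / (n * n)
    return alpha_i * d_ik + alpha_j * d_jk + beta * d_ij


# Hypothetical sample: a nearly equilateral triangle.
a, b, c = (0.0, 0.0), (2.0, 0.0), (1.0, 1.8)

# a and b form the closest pair, so they are merged first.
first_merge = sq_dist(a, b)

# Level at which the cluster {a, b} is then merged with c.
second_merge = lance_williams_centroid(
    sq_dist(a, c), sq_dist(b, c), first_merge, 1, 1
)

# Cross-check against the direct distance from the centroid of {a, b} to c.
centroid_ab = tuple((x + y) / 2 for x, y in zip(a, b))
direct = sq_dist(centroid_ab, c)

# An inversion: the second merge happens at a lower level than the first.
inversion = second_merge < first_merge
print(first_merge, second_merge, inversion)
```

In this sample the second merge level falls below the first, which is the kind of inversion a monotonic criterion rules out.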