Inferring decision trees using the minimum description length principle
Information and Computation
Computer systems that learn: classification and prediction methods from statistics, neural nets, machine learning, and expert systems
C4.5: programs for machine learning
C4.5: programs for machine learning
Machine learning, neural and statistical classification
Machine learning, neural and statistical classification
Mining high-speed data streams
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Data mining: concepts and techniques
Data mining: concepts and techniques
An Interval Classifier for Database Mining Applications
VLDB '92 Proceedings of the 18th International Conference on Very Large Data Bases
SPRINT: A Scalable Parallel Classifier for Data Mining
VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
P-tree classification of yeast gene deletion data
ACM SIGKDD Explorations Newsletter
ACM SIGMOD Record
Feature Selection for Building Cost-Effective Data Stream Classifiers
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
ACM Transactions on Computer-Human Interaction (TOCHI)
A semi-random multiple decision-tree algorithm for mining data streams
Journal of Computer Science and Technology
Mining students' behavior in web-based learning programs
Expert Systems with Applications: An International Journal
Intervention Events Detection and Prediction in Data Streams
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Event-based classification of social media streams
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
Hi-index | 0.00 |
Many organizations have large quantities of spatial data collected in various application areas, including remote sensing, geographical information systems (GIS), astronomy, computer cartography, environmental assessment and planning, etc. These data collections are growing rapidly and can therefore be considered as spatial data streams. For data stream classification, time is a major issue. However, these spatial data sets are too large to be classified effectively in a reasonable amount of time using existing methods. In this paper, we developed a new method for decision tree classification on spatial data streams using a data structure called Peano Count Tree (P-tree). The Peano Count Tree is a spatial data organization that provides a lossless compressed representation of a spatial data set and facilitates efficient classification and other data mining techniques. Using P-tree structure, fast calculation of measurements, such as information gain, can be achieved. We compare P-tree based decision tree induction classification and a classical decision tree induction method with respect to the speed at which the classifier can be built (and rebuilt when substantial amounts of new data arrive). Experimental results show that the P-tree method is significantly faster than existing classification methods, making it the preferred method for mining on spatial data streams.