Reasoning about naming systems
ACM Transactions on Programming Languages and Systems (TOPLAS)
Constraint satisfaction and debugging for interactive user interfaces
A study on video browsing strategies
The cubic mouse: a new device for three-dimensional input
Proceedings of the SIGCHI conference on Human Factors in Computing Systems
Efficient algorithms for mining outliers from large data sets
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Data mining: concepts and techniques
Proceedings of the 2002 ACM symposium on Applied computing
Discovery-Driven Exploration of OLAP Data Cubes
EDBT '98 Proceedings of the 6th International Conference on Extending Database Technology: Advances in Database Technology
OPTICS-OF: Identifying Local Outliers
PKDD '99 Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery
Algorithms for Mining Distance-Based Outliers in Large Datasets
VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Finding Intensional Knowledge of Distance-Based Outliers
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
k-nearest Neighbor Classification on Spatial Data Streams Using P-trees
PAKDD '02 Proceedings of the 6th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
An optimized approach for KNN text categorization using P-trees
Proceedings of the 2004 ACM symposium on Applied computing
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Detecting graph-based spatial outliers
Intelligent Data Analysis
A constant factor approximation algorithm for k-median clustering with outliers
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Online spam-blog detection through blog search
Proceedings of the 17th ACM conference on Information and knowledge management
ODDC: outlier detection using distance distribution clustering
PAKDD'07 Proceedings of the 2007 international conference on Emerging technologies in knowledge discovery and data mining
Correlation-based detection of attribute outliers
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Detecting spam blogs from blog search results
Information Processing and Management: an International Journal
"One person's noise is another person's signal." Outlier detection is used both to clean up datasets and to discover useful anomalies, such as criminal activity in electronic commerce, computer intrusion attacks, terrorist threats, and agricultural pest infestations. Outlier detection is therefore critically important in an information-based society. This paper focuses on finding outliers in large datasets using distance-based methods. First, to speed up outlier detection, we revise Knorr and Ng's distance-based outlier definition. Second, we adopt a vertical data structure, instead of traditional horizontal structures, to make outlier detection more efficient still. We tested our methods on a National Hockey League dataset and show an order-of-magnitude speed improvement over contemporary distance-based outlier detection approaches.
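For orientation, the sketch below illustrates the original Knorr and Ng notion of a distance-based outlier that the abstract takes as its starting point: a point is a DB(p, D)-outlier if at least a fraction p of the dataset lies farther than distance D from it. It is a naive nested-loop version under that standard definition only. It does not reproduce the revised definition or the vertical data structure proposed in the paper, and the function name `db_outliers` and the sample points are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def db_outliers(points, p, d):
    """Return indices of DB(p, d)-outliers under the Knorr-Ng definition.

    A point qualifies when at least a fraction p of all points in the
    dataset lie farther than d from it. This is the naive O(n^2) scan,
    not the paper's optimized method.
    """
    n = len(points)
    outliers = []
    for i, a in enumerate(points):
        # Count how many points are farther than d from point a.
        far = sum(1 for b in points if dist(a, b) > d)
        if far >= p * n:
            outliers.append(i)
    return outliers


# Hypothetical sample data: a tight cluster plus one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
result = db_outliers(points, p=0.75, d=1.0)
print(result)
```

The quadratic scan over all pairs is the cost that motivates faster approaches on large datasets.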