International Journal of Computational Science and Engineering
Outliers are data objects whose characteristics differ from those of other data objects. Exploring the diverse and dynamic web data for outliers is more interesting than finding outliers in numeric data sets. Interestingly, existing web mining algorithms have concentrated on finding frequent patterns while discarding the less frequent ones that are likely to contain the outlying data. This paper refers to outliers present on the web as web outliers to distinguish them from traditional outliers. Web outliers are data objects that show significantly different characteristics from other web data. Although the presence of web outliers appears obvious, there is neither a formal definition of web outliers nor any algorithm for mining them. Moreover, traditional outlier mining algorithms designed solely for numeric data sets are inappropriate for mining web outliers. This paper establishes the presence of web outliers and discusses some practical applications of web outlier mining. Finally, we present a taxonomy for web outliers and propose a general framework for mining web content outliers.