This paper addresses the problem of detecting access to off-topic documents by exploiting user profiles. Existing methods typically store a few prototype off-topic documents as the profile and label their nearest neighbors in the test set as suspects, on the common assumption that nearby documents belong to the same class. However, because high-dimensional space is inherently sparse, a document and its nearest neighbors may not belong to the same class. To address this, we develop a hyperclique-pattern-based off-topic detection method for selecting which documents to label. Hyperclique patterns capture joint similarity among a set of objects rather than the traditional pairwise similarity, so objects drawn from hypercliques are more reliable seeds for classifying their neighbors. Our experimental results on real-world document data show that our technique achieves higher detection precision than existing methods.
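To make the notion of joint similarity concrete, here is a minimal sketch of one standard way to define a hyperclique pattern: through h-confidence, the support of the whole itemset divided by the largest support of any single item in it. The abstract does not say how the patterns are mined or how documents are encoded, so everything here is an assumption made for illustration: the toy data, the encoding (each transaction is the set of document ids containing one term), the threshold, and the function names `support`, `h_confidence`, and `hyperclique_patterns`.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)

def h_confidence(itemset, transactions):
    """Joint support divided by the largest single-item support."""
    joint = support(itemset, transactions)
    max_single = max(support([i], transactions) for i in itemset)
    return joint / max_single if max_single > 0 else 0.0

def hyperclique_patterns(transactions, size, min_hconf):
    """Brute-force search: all itemsets of the given size whose
    h-confidence meets the threshold. Suitable for toy inputs only."""
    items = sorted(set().union(*transactions))
    return [c for c in combinations(items, size)
            if h_confidence(c, transactions) >= min_hconf]

# Hypothetical toy data: one transaction per term, listing the
# documents (d1..d4) in which that term occurs.
transactions = [
    {"d1", "d2", "d3"},
    {"d1", "d2", "d3"},
    {"d1", "d2", "d4"},
    {"d4"},
    {"d3", "d4"},
]

# Documents d1, d2, d3 co-occur as a group often enough to qualify.
seeds = hyperclique_patterns(transactions, size=3, min_hconf=0.6)
print(seeds)
```

In this toy example, d4 shares a term with each of the other documents yet joins no three-document pattern at the 0.6 threshold, which illustrates how a joint criterion can reject an object that pairwise overlap alone might have linked to the group. A real miner would prune the search space rather than enumerate every combination.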