Hyperclique pattern based off-topic detection

Authors:
Tianming Hu;Qingui Xu;Huaqiang Yuan;Jiali Hou;Chao Qu
Affiliations:
Department of Computer Science, DongGuan University of Technology, DongGuan, China (all authors)
Venue:
APWeb/WAIM'07 Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management
Year:
2007

Abstract

This paper addresses the problem of detecting access to off-topic documents by exploiting user profiles. Existing methods usually store a few prototype off-topic documents as the profile and label their top-ranked nearest neighbors in the test set as suspects. This rests on the common assumption that nearby documents belong to the same class. However, because high-dimensional space is inherently sparse, a document and its nearest neighbors may not belong to the same class. To address this, we develop a hyperclique pattern based off-topic detection method for selecting which documents to label. Hyperclique patterns consider joint similarity among a set of objects instead of the traditional pairwise similarity. As a result, objects from hypercliques are more reliable as seeds for classifying their neighbors. Our experimental results on real-world document data show that our technique achieves higher detection precision than existing methods.
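The notion of joint similarity can be illustrated with a minimal sketch. It assumes the h-confidence measure commonly used to define hyperclique patterns (the support of an itemset divided by the largest support of any single item in it). The abstract does not state this formula, so it is an assumption here. Treating documents as items and terms as transactions is likewise an assumption, and the function names, toy data, and threshold are hypothetical. The brute-force enumeration is for illustration only and is not the mining algorithm used in the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def h_confidence(itemset, transactions):
    """Assumed definition: supp(itemset) / max support of any single item."""
    max_single = max(support([i], transactions) for i in itemset)
    if max_single == 0:
        return 0.0
    return support(itemset, transactions) / max_single


def hypercliques(items, transactions, size, min_hconf):
    """Brute-force list of `size`-item sets whose h-confidence >= min_hconf."""
    return [
        set(c)
        for c in combinations(sorted(items), size)
        if h_confidence(c, transactions) >= min_hconf
    ]


# Hypothetical toy data: each transaction is one term, represented by the
# set of documents in which that term occurs.
transactions = [
    {"d1", "d2", "d3"},
    {"d1", "d2", "d3"},
    {"d1", "d2", "d4"},
    {"d4", "d5"},
    {"d3", "d5"},
]
items = set().union(*transactions)

# Documents that are jointly similar as a group, not merely pairwise close.
seeds = hypercliques(items, transactions, size=3, min_hconf=0.6)
```

In this toy data only d1, d2, and d3 co-occur often enough, relative to their individual frequencies, to pass the threshold as a triple. Documents selected this way would play the role of the more reliable seeds described in the abstract.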