Experiments were conducted to test several hypotheses about methods for improving document classification for the malicious insider threat problem within the Intelligence Community. Bag-of-words (BOW) document representations were compared with Natural Language Processing (NLP) based representations in both the typical and one-class classification problems, using the Support Vector Machine (SVM) algorithm. The results show that the NLP features significantly improved classifier performance over the BOW approach in both precision and recall, while using far fewer features. The one-class algorithm using NLP features also proved robust when tested on new domains.
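As a rough illustration of two ingredients named in the abstract, the sketch below builds a bag-of-words representation of a document and computes precision and recall for a set of predictions. It is a minimal sketch under stated assumptions, not the paper's pipeline: the tokenizer, the helper names `bag_of_words` and `precision_recall`, and the toy document and labels are all hypothetical, and no SVM is trained here.

```python
from collections import Counter
import re


def bag_of_words(text):
    """Map a document to token -> count (lowercased word tokens).

    The regex tokenizer is an assumption for illustration only.
    """
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall with respect to the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy document; word order is discarded, only counts remain.
doc = "The analyst opened the report; the report was off-topic."
features = bag_of_words(doc)

# Hypothetical gold labels and classifier outputs (1 = positive class).
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
precision, recall = precision_recall(y_true, y_pred)
```

In a BOW setup every distinct token becomes a feature, which is why such representations tend to be high-dimensional. The abstract's claim is that the NLP-based features achieved better precision and recall with far fewer dimensions.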