C4.5: programs for machine learning
C4.5: programs for machine learning
Exploratory mining and pruning optimizations of constrained associations rules
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Pruning and summarizing the discovered associations
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Growing decision trees on support-less association rules
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
SLIQ: A Fast Scalable Classifier for Data Mining
EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
CMAR: Accurate and Efficient Classification Based on Multiple Class-Association Rules
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Construct robust rule sets for classification
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
A Lazy Approach to Pruning Classification Rules
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Constraint-Based Rule Mining in Large, Dense Databases
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Data Mining Methods for Detection of New Malicious Executables
SP '01 Proceedings of the 2001 IEEE Symposium on Security and Privacy
On support thresholds in associative classification
Proceedings of the 2004 ACM symposium on Applied computing
An associative classifier based on positive and negative rules
Proceedings of the 9th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery
Learning to detect malicious executables in the wild
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
MMAC: A New Multi-Class, Multi-Label Associative Classification Approach
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
An Evaluation of Approaches to Classification Rule Selection
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Static Analyzer of Vicious Executables (SAVE)
ACSAC '04 Proceedings of the 20th Annual Computer Security Applications Conference
CCCS: a top-down associative classifier for imbalanced class distribution
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
MCAR: multi-class classification based on association rule
AICCSA '05 Proceedings of the ACS/IEEE 2005 International Conference on Computer Systems and Applications
IMDS: intelligent malware detection system
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Mining specifications of malicious behavior
Proceedings of the the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering
A review of associative classification mining
The Knowledge Engineering Review
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
vEye: behavioral footprinting for self-propagating worm detection and profiling
Knowledge and Information Systems
Direct Discriminative Pattern Mining for Effective Classification
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Automated classification and analysis of internet malware
RAID'07 Proceedings of the 10th international conference on Recent advances in intrusion detection
Feature reduction to speed up malware classification
NordSec'11 Proceedings of the 16th Nordic conference on Information Security Technology for Applications
Zero-day malware detection based on supervised learning algorithms of API call signatures
AusDM '11 Proceedings of the Ninth Australasian Data Mining Conference - Volume 121
Detecting malicious behaviour using supervised learning algorithms of the function calls
International Journal of Electronic Security and Digital Forensics
Hi-index | 0.00 |
Malware is software designed to infiltrate or damage a computer system without the owner's informed consent (e.g., viruses, backdoors, spyware, trojans, and worms). Nowadays, numerous attacks made by the malware pose a major security threat to computer users. Unfortunately, along with the development of the malware writing techniques, the number of file samples that need to be analyzed, named "gray list," on a daily basis is constantly increasing. In order to help our virus analysts, quickly and efficiently pick out the malicious executables from the "gray list," an automatic and robust tool to analyze and classify the file samples is needed. In our previous work, we have developed an intelligent malware detection system (IMDS) by adopting associative classification method based on the analysis of application programming interface (API) execution calls. Despite its good performance in malware detection, IMDS still faces the following two challenges: 1) handling the large set of the generated rules to build the classifier; and 2) finding effective rules for classifying new file samples. In this paper, we first systematically evaluate the effects of the postprocessing techniques (e.g., rule pruning, rule ranking, and rule selection) of associative classification in malware detection, and then, propose an effective way, i.e., CIDCPF, to detect the malware from the "gray list." To the best of our knowledge, this is the first effort on using postprocessing techniques of associative classification in malware detection. CIDCPF adapts the postprocessing techniques as follows: first applying Chi-square testing and Insignificant rule pruning followed by using Database coverage based on the Chi-square measure rule ranking mechanism and Pessimistic error estimation, and finally performing prediction by selecting the best First rule. We have incorporated the CIDCPF method into our existing IMDS system, and we call the new system as CIMDS system. Case studies are performed on the large collection of file samples obtained from the Antivirus Laboratory at Kingsoft Corporation and promising experimental results demonstrate that the efficiency and ability of detecting malware from the "gray list" of our CIMDS system outperform popular antivirus software tools, such as McAfee VirusScan and Norton AntiVirus, as well as previous data-mining-based detection systems, which employed Naive Bayes, support vector machine, and decision tree techniques. In particular, our CIMDS system can greatly reduce the number of generated rules, which makes it easy for our virus analysts to identify the useful ones.