The amount of digital evidence that forensic tools and analysts must process is growing rapidly, which makes automated analysis a critical activity and one where continuous improvement is crucial. Concordia is a platform for investigating code semantics. One of Concordia's functions is the identification of unknown code fragments; our ultimate goal is to elucidate the possible objectives and origin of this type of evidence. Here we provide a synopsis of a method that identifies and locates code fragments using n-gram and semantics-based features and a k-nearest-neighbors classifier. Our objective is to identify a set of candidate files that may contain the unknown fragment and to supply additional details that isolate it within this set. To accomplish this task, Concordia uses the MapReduce model to process a large set of invariants, giving forensic experts a more efficient and automated way to produce solid intelligence about a growing body of evidence.
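The two steps named in the abstract, finding candidate files and then isolating the fragment inside one of them, can be illustrated with a minimal sketch. This is not Concordia's implementation: it uses byte n-grams only (no semantics-based features), a simple containment score as the nearest-neighbor similarity, and a sliding window for localization. All function names (`ngrams`, `containment`, `nearest_files`, `locate`), the choice of n = 3, and the toy corpus are assumptions made for illustration.

```python
from collections import Counter


def ngrams(data: bytes, n: int = 3) -> Counter:
    """Count overlapping byte n-grams in data (hypothetical feature extractor)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def containment(query: Counter, profile: Counter) -> float:
    """Fraction of the query's n-gram occurrences that also appear in profile."""
    total = sum(query.values())
    if total == 0:
        return 0.0
    shared = sum(min(count, profile[gram]) for gram, count in query.items())
    return shared / total


def nearest_files(fragment: bytes, corpus: dict, k: int = 2, n: int = 3) -> list:
    """Return the names of the k corpus files most similar to the fragment."""
    query = ngrams(fragment, n)
    scored = [(containment(query, ngrams(blob, n)), name)
              for name, blob in corpus.items()]
    # Highest similarity first; ties keep the corpus insertion order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


def locate(fragment: bytes, blob: bytes, n: int = 3) -> int:
    """Return the first offset whose window shares the most distinct n-grams
    with the fragment."""
    query = set(ngrams(fragment, n))
    width = len(fragment)
    best_offset, best_score = -1, -1
    for offset in range(max(len(blob) - width + 1, 1)):
        window = blob[offset:offset + width]
        score = sum(1 for gram in ngrams(window, n) if gram in query)
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


# Toy corpus of made-up byte strings standing in for known files.
corpus = {
    "a.bin": b"\x55\x89\xe5\x83\xec\x10\xc9\xc3",
    "b.bin": b"\x90" * 8,
    "c.bin": b"\x31\xc0\x55\x89\xe5\x5d\xc3",
}
fragment = b"\x89\xe5\x83\xec"

candidates = nearest_files(fragment, corpus, k=1)
offset = locate(fragment, corpus[candidates[0]])
print(candidates, offset)
```

In this sketch each file is scored independently of the others, so the scoring loop could in principle be split into a map step over files and a reduce step that keeps the top k. How Concordia actually partitions its invariants under MapReduce is not described in the abstract.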