Spelling checkers,spelling correctors and the misspellings of poor spellers
Information Processing and Management: an International Journal
Mining e-mail content for author identification forensics
ACM SIGMOD Record
PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth
Proceedings of the 17th International Conference on Data Engineering
Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Authorship Attribution with Support Vector Machines
Applied Intelligence
CloseGraph: mining closed frequent graph patterns
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
XRules: an effective structural classifier for XML data
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Style mining of electronic messages for multiple authorship discrimination: first results
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
BIDE: Efficient Mining of Frequent Closed Sequences
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Mining Closed and Maximal Frequent Subtrees from Databases of Labeled Rooted Trees
IEEE Transactions on Knowledge and Data Engineering
CTC — Correlating Tree Patterns for Classification
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Journal of the American Society for Information Science and Technology
Mining Frequent Induced Subtrees by Prefix-Tree-Projected Pattern Growth
WAIMW '06 Proceedings of the Seventh International Conference on Web-Age Information Management Workshops
Linguistic correlates of style: authorship classification with deep linguistic analysis features
COLING '04 Proceedings of the 20th international conference on Computational Linguistics
Structure and semantics for expressive text kernels
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
DryadeParent, An Efficient and Robust Closed Attribute Tree Mining Algorithm
IEEE Transactions on Knowledge and Data Engineering
Mining significant graph patterns by leap search
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
A survey of modern authorship attribution methods
Journal of the American Society for Information Science and Technology
Direct Discriminative Pattern Mining for Effective Classification
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Classification of software behaviors for failure detection: a discriminative pattern mining approach
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
A syntactic tree matching approach to finding similar questions in community-based qa services
Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Efficient convolution kernels for dependency and constituent syntactic trees
ECML'06 Proceedings of the 17th European conference on Machine Learning
Tree2: decision trees for tree structured data
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Effective and scalable authorship attribution using function words
AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Using relative entropy for authorship attribution
AIRS'06 Proceedings of the Third Asia conference on Information Retrieval Technology
Native language detection with tree substitution grammars
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Combining Entity Matching Techniques for Detecting Extremist Behavior on Discussion Boards
ASONAM '12 Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)
Hi-index | 0.00 |
In the past, there have been dozens of studies on automatic authorship classification, and many of these studies concluded that the writing style is one of the best indicators of original authorship. From among the hundreds of features which were developed, syntactic features were best able to reflect an author's writing style. However, due to the high computational complexity of extracting and computing syntactic features, only simple variations of basic syntactic features of function words and part-of-speech tags were considered. In this paper, we propose a novel approach to mining discriminative k-embedded-edge subtree patterns from a given set of syntactic trees that reduces the computational burden of using complex syntactic structures as a feature set. This method is shown to increase the classification accuracy. We also design a new kernel based on these features. Comprehensive experiments on real datasets of news articles and movie reviews demonstrate that our approach is reliable and more accurate than previous studies.