Dozens of studies have addressed automatic authorship classification, and many of them concluded that writing style is one of the best indicators of original authorship. Among the hundreds of features that have been developed, syntactic features reflect an author's writing style best. However, because extracting and computing syntactic features is computationally expensive, only simple variations of basic syntactic features, such as function words, part-of-speech (POS) tags, and rewrite rules, have been considered. In this paper, we propose a new feature set of k-embedded-edge subtree patterns that holds more syntactic information than previous feature sets. We also propose a novel approach for mining these patterns directly from a given set of syntactic trees. We show that this approach reduces the computational burden of using complex syntactic structures as the feature set. Comprehensive experiments on real-world datasets demonstrate that our approach is reliable and more accurate than the methods of previous studies.
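
To make the basic syntactic features named above concrete, the following minimal sketch counts POS tags and rewrite rules in a single syntactic tree. It is an illustration only, not the mining approach proposed in the paper: the nested-tuple tree encoding, the example sentence and its parse, and the helper names `is_leaf`, `rewrite_rules`, and `pos_tags` are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy syntactic tree as nested tuples: (label, child, child, ...).
# Leaves are (POS_tag, word) pairs. The sentence and parse are invented.
TREE = ("S",
        ("NP", ("DT", "the"), ("NN", "author")),
        ("VP", ("VBZ", "writes"),
               ("NP", ("DT", "the"), ("NN", "letter"))))


def is_leaf(node):
    """A leaf pairs a POS tag with a word string."""
    return len(node) == 2 and isinstance(node[1], str)


def rewrite_rules(node):
    """Yield one 'parent -> children' rule for every internal node."""
    if is_leaf(node):
        return
    label, children = node[0], node[1:]
    yield f"{label} -> {' '.join(child[0] for child in children)}"
    for child in children:
        yield from rewrite_rules(child)


def pos_tags(node):
    """Yield the POS tag of every leaf, left to right."""
    if is_leaf(node):
        yield node[0]
        return
    for child in node[1:]:
        yield from pos_tags(child)


# Frequency counts of this kind can serve as per-document style features.
rule_counts = Counter(rewrite_rules(TREE))
tag_counts = Counter(pos_tags(TREE))
```

Each rewrite rule covers only one parent and its immediate children, so it captures a single level of the tree. Subtree patterns that span several levels, such as the ones the paper proposes, can carry more structural information than these one-level rules.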