The supervised classification of XML documents by structure involves learning predictive models in which certain structural regularities discriminate between the individual document classes. Research to date has focused on the adoption of prespecified substructures. This is detrimental to classification effectiveness, since substructures chosen a priori may not match the structural properties of the XML documents at hand. An open question is therefore how to choose the type of structural regularity that best fits the structures of the available XML documents. We tackle this problem with X-Class, an approach that handles all types of tree-like substructures and allows the most discriminatory one to be chosen. Algorithms are designed to learn compact rule-based classifiers in which the chosen substructures discriminate between the classes of XML documents. X-Class is studied across various domains and types of substructures, and its classification performance is compared against several rule-based and SVM-based competitors. Empirical evidence reveals that the classifiers induced by X-Class are compact, scalable, and at least as effective as the established competitors. In particular, certain substructures allow the induction of very compact classifiers that generally outperform the rule-based competitors in effectiveness on all chosen corpora of XML data. Furthermore, such classifiers are substantially as effective as the SVM-based competitor, with the additional advantage of a high degree of interpretability.
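To make the idea of a rule-based structural classifier concrete, the following is a minimal sketch and not the X-Class algorithm itself, whose details are not given here. It assumes one particular substructure type (root-to-leaf tag paths) and a naive confidence threshold for rule selection. All names (`root_to_leaf_paths`, `learn_rules`, `classify`, `min_confidence`) and the toy documents are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def root_to_leaf_paths(xml_text):
    """Return the set of root-to-leaf tag paths of one XML document.

    Root-to-leaf paths are only one possible kind of tree-like
    substructure; they are an assumption of this sketch.
    """
    root = ET.fromstring(xml_text)
    paths = set()
    stack = [(root, root.tag)]
    while stack:
        node, path = stack.pop()
        children = list(node)
        if not children:
            paths.add(path)
        for child in children:
            stack.append((child, path + "/" + child.tag))
    return paths


def learn_rules(docs, min_confidence=0.9):
    """Learn rules 'path -> class' from (xml_text, label) pairs.

    A rule is kept when the fraction of training documents containing
    the path that belong to its majority class reaches min_confidence
    (an illustrative threshold, not taken from the paper).
    Returns (rules, default_class).
    """
    path_class_counts = defaultdict(Counter)
    class_counts = Counter()
    for xml_text, label in docs:
        class_counts[label] += 1
        for path in root_to_leaf_paths(xml_text):
            path_class_counts[path][label] += 1

    rules = []
    for path, counts in path_class_counts.items():
        label, hits = counts.most_common(1)[0]
        confidence = hits / sum(counts.values())
        if confidence >= min_confidence:
            rules.append((path, label, confidence, hits))

    # Prefer confident rules, then frequent ones; path breaks ties.
    rules.sort(key=lambda r: (-r[2], -r[3], r[0]))
    default_class = class_counts.most_common(1)[0][0]
    return rules, default_class


def classify(xml_text, rules, default_class):
    """Predict with the first matching rule, else the default class."""
    paths = root_to_leaf_paths(xml_text)
    for path, label, _confidence, _hits in rules:
        if path in paths:
            return label
    return default_class


# Toy training corpus (hypothetical data).
train = [
    ("<article><title/><body><sec/></body></article>", "paper"),
    ("<article><title/><body><sec/><sec/></body></article>", "paper"),
    ("<mail><from/><to/><body/></mail>", "email"),
    ("<mail><from/><body/></mail>", "email"),
]
rules, default_class = learn_rules(train)
prediction = classify("<mail><from/><cc/></mail>", rules, default_class)
```

Each learned rule is a readable pair of a path and a class, which is the kind of interpretability the abstract attributes to rule-based classifiers. Swapping `root_to_leaf_paths` for another substructure extractor would be the place to experiment with different types of structural regularity.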