The nature of statistical learning theory
The nature of statistical learning theory
Enhanced hypertext categorization using hyperlinks
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Inductive learning algorithms and representations for text categorization
Proceedings of the seventh international conference on Information and knowledge management
SIGIR '00 Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
A classifier for semi-structured documents
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Web classification using support vector machine
Proceedings of the 4th international workshop on Web information and data management
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval
ECML '98 Proceedings of the 10th European Conference on Machine Learning
Text Categorization with Suport Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning
A Machine Learning Approach to Web Mining
AI*IA '99 Proceedings of the 6th Congress of the Italian Association for Artificial Intelligence on Advances in Artificial Intelligence
Support Vector Machines for Text Categorization
HICSS '03 Proceedings of the 36th Annual Hawaii International Conference on System Sciences (HICSS'03) - Track 4 - Volume 4
Bayesian network model for semi-structured document classification
Information Processing and Management: an International Journal - Special issue: Bayesian networks and information retrieval
A web classification framework based on XSLT
APWeb'06 Proceedings of the 2006 international conference on Advanced Web and Network Technologies, and Applications
A bottom-up approach for XML documents classification
IDEAS '08 Proceedings of the 2008 international symposium on Database engineering & applications
A bayesian approach to classify conference papers
MICAI'06 Proceedings of the 5th Mexican international conference on Artificial Intelligence
Mining frequent association tag sequences for clustering XML documents
APWeb'12 Proceedings of the 14th Asia-Pacific international conference on Web Technologies and Applications
X-Class: Associative Classification of XML Documents by Structure
ACM Transactions on Information Systems (TOIS)
Hi-index | 0.00 |
XSLT is a transformation language mainly used for converting XML documents to HTML or other formats. Due to its simplicity and flexibility XML has replaced traditional EDI file formats. Most e-business applications store data in XML, convert XML into HTML using XSTL, and publish the HTML documents to the web. In this paper we argue that the use of XSLT presents an opportunity rather than a challenge to web document classification. We show that it is possible to combine the advantages of both HTML and XML into classification of documents at the XSLT transformation stage, named XSLT classification, to attain higher classification rates using Support Vector Machines (SVM). The results are both expected and promising. We believe that XSLT classification can become a favorable classification method over HTML or XML classification where XSLT stylesheets are available.