Data on the web is gradually shifting from HTML to XML/XSLT, driven by software and hardware requirements such as interoperability and data sharing between different applications and platforms, and by devices with varying capabilities such as cell phones and PDAs. This gradual change introduces new challenges in web page and web site classification. HTML is used for the presentation of content, XML represents content hierarchically, and XSLT transforms XML documents into other formats such as HTML and WML. Classifying a web page from either its HTML or its XML alone has certain drawbacks. In this paper we propose a new classification method based on XSLT that combines the advantages of HTML and XML classification. We also introduce a web classification framework that uses XSLT classification. Finally, we show that, with a Naïve Bayes classifier, XSLT classification outperforms both HTML and XML classification.
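For readers unfamiliar with the classifier named above, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing. It is an illustration under stated assumptions, not the method evaluated in the paper: the abstract does not say how features are derived from XSLT, so the token lists, class labels, and the `NaiveBayes` class below are hypothetical stand-ins for whatever features a page yields.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing."""

    def __init__(self):
        self.class_docs = Counter()              # documents seen per class
        self.term_counts = defaultdict(Counter)  # class -> term -> count
        self.vocab = set()                       # all terms seen in training

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.class_docs[label] += 1
            self.term_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:  # ignore terms never seen in training
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: token lists standing in for page features.
train_docs = [
    ["price", "cart", "buy", "product"],
    ["buy", "discount", "product", "shipping"],
    ["lecture", "course", "exam", "syllabus"],
    ["course", "professor", "exam", "homework"],
]
train_labels = ["shop", "shop", "course", "course"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["buy", "product"]))   # shop
print(model.predict(["exam", "course"]))   # course
```

The same classifier can be trained on tokens drawn from different representations of a page (for example its HTML output or its XML source), which is the kind of comparison the abstract reports; how the XSLT-based features are built is left to the paper itself.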