As the WWW grows at an increasing speed, classifiers targeted at hypertext are in high demand. While document categorization is a fairly mature field, the use of hypertext structure and hyperlinks has been relatively unexplored. In this paper, we propose a practical method for enhancing both the speed and the quality of hypertext categorization using hyperlinks. In comparison with a recently proposed technique that appears to be the only one of its kind, we obtained an improvement of up to 18.5% in effectiveness while dramatically reducing the processing time. We attempt to explain through experiments which factors contribute to the improvement.