Classification of Web page content is essential to many tasks in Web information retrieval, such as maintaining Web directories and focused crawling. The uncontrolled nature of Web content presents additional challenges to Web page classification compared with traditional text classification, but the interconnected nature of hypertext also provides features that can assist the process. As we review work in Web page classification, we note the importance of these Web-specific features and algorithms, describe state-of-the-art practices, and track the underlying assumptions behind the use of information from neighboring pages.