Abstract--Web page classification is one of the essential techniques for Web mining, because classifying Web pages of an interesting class is often the first step of mining the Web. However, constructing a classifier for an interesting class requires laborious preprocessing, such as collecting positive and negative training examples. For instance, to construct a "homepage" classifier, one needs to collect a sample of homepages (positive examples) and a sample of non-homepages (negative examples). Collecting negative training examples is especially arduous and requires caution to avoid bias. This paper presents a framework, called Positive Example Based Learning (PEBL), for Web page classification that eliminates the need to collect negative training examples manually during preprocessing. The PEBL framework applies an algorithm, called Mapping-Convergence (M-C), that uses only positive and unlabeled data to achieve classification accuracy as high as that of a traditional SVM trained on positive and negative data. M-C runs in two stages: the mapping stage and the convergence stage. In the mapping stage, the algorithm uses a weak classifier to draw an initial approximation of "strong" negative data. Based on this initial approximation, the convergence stage iteratively runs an internal classifier (e.g., SVM) that maximizes margins to progressively improve the approximation of negative data. Thus, the class boundary eventually converges to the true boundary of the positive class in the feature space. We present the M-C algorithm with supporting theoretical and experimental justifications. Our experiments show that, given the same set of positive examples, the M-C algorithm outperforms one-class SVMs and is almost as accurate as traditional SVMs.
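The two-stage loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not specify the weak classifier, so the mapping stage here assumes a simple feature-frequency heuristic; a nearest-centroid classifier stands in for the margin-maximizing SVM so the example stays dependency-free; and the toy documents and all function names are hypothetical.

```python
from collections import Counter

def strong_positive_features(pos, unl):
    # Assumed weak-classifier heuristic: keep features that occur in a
    # larger fraction of positive documents than of unlabeled documents.
    pos_freq = Counter(f for doc in pos for f in doc)
    unl_freq = Counter(f for doc in unl for f in doc)
    return {f for f in pos_freq
            if pos_freq[f] / len(pos) > unl_freq[f] / len(unl)}

def centroid(docs, vocab):
    # Mean of the binary feature vectors of docs over vocab.
    return {f: sum(f in d for d in docs) / len(docs) for f in vocab}

def sq_dist(doc, c):
    # Squared Euclidean distance between a binary document and a centroid.
    return sum(((f in doc) - v) ** 2 for f, v in c.items())

def train(pos, neg):
    # Stand-in internal classifier (nearest centroid, NOT an SVM).
    vocab = set().union(*pos, *neg)
    cp, cn = centroid(pos, vocab), centroid(neg, vocab)
    return lambda doc: sq_dist(doc, cp) < sq_dist(doc, cn)  # True = positive

def mapping_convergence(pos, unl):
    # Mapping stage: unlabeled documents with no strong positive feature
    # form the initial approximation of "strong" negative data.
    spf = strong_positive_features(pos, unl)
    neg = [d for d in unl if not (d & spf)]
    rest = [d for d in unl if d & spf]
    if not neg:
        raise ValueError("mapping stage found no strong negatives")
    # Convergence stage: retrain, move newly rejected documents into the
    # negative set, and stop once no unlabeled document is rejected.
    while True:
        clf = train(pos, neg)
        new_neg = [d for d in rest if not clf(d)]
        if not new_neg:
            return clf, neg
        neg = neg + new_neg
        rest = [d for d in rest if clf(d)]

# Hypothetical toy data: each document is a set of terms.
pos = [{"home", "welcome", "my"}, {"home", "my", "cv"}, {"welcome", "my", "cv"}]
unl = [{"home", "my", "welcome"}, {"my", "cv", "news"},
       {"news", "sports", "score"}, {"shop", "price", "cart"},
       {"news", "price", "my"}, {"sports", "shop"}]

clf, neg = mapping_convergence(pos, unl)
print(clf({"home", "my", "cv"}), clf({"news", "sports"}), len(neg))
```

On this toy data the mapping stage selects three strong negatives, and the convergence stage adds one more unlabeled document before the loop stops. Swapping `train` for a real SVM would bring the sketch closer to the margin-maximizing behavior the abstract describes.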