Many Web information services use information extraction (IE) techniques to collect important facts from the Web. One way to build more advanced services is to discover thematic information in the collected facts through text classification. However, most conventional text classification techniques rely on manually labeled corpora and are thus ill-suited to open-domain Web information services. In this work, we present a system named LiveClassifier that automatically trains classifiers from Web corpora based on user-defined topic hierarchies. Because it is flexible and convenient, LiveClassifier can be easily adapted for various purposes: new Web information services can be built on top of it, and human users can use it to create classifiers for their personal applications. The effectiveness of classifiers created by LiveClassifier is well supported by empirical evidence.
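The general idea of training a classifier from Web text gathered for each node of a user-defined topic hierarchy can be illustrated with a minimal sketch. This is not LiveClassifier's actual implementation, which the abstract does not describe. The hierarchy, the `fetch_snippets` stub (a stand-in for a real search-engine query), the in-memory `FAKE_WEB` data, and the choice of a nearest-centroid classifier with cosine similarity are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical user-defined topic hierarchy: parent topic -> child topics.
HIERARCHY = {"Science": ["Physics", "Biology"]}

# Stand-in for Web search results (assumption). A real system would send
# each query to a search engine and collect the returned pages or snippets.
FAKE_WEB = {
    "Science Physics": ["quantum particle energy physics",
                        "energy force particle motion"],
    "Science Biology": ["cell gene protein biology",
                        "gene species cell evolution"],
}


def fetch_snippets(query):
    """Hypothetical search call: return text snippets for a query."""
    return FAKE_WEB.get(query, [])


def build_training_corpus(hierarchy):
    """Use each topic name plus its parent as a query; no manual labels."""
    corpus = {}
    for parent, children in hierarchy.items():
        for child in children:
            corpus[child] = fetch_snippets(f"{parent} {child}")
    return corpus


def vectorize(text):
    """Bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def centroid(docs):
    """Sum the term-frequency vectors of all documents for one topic."""
    total = Counter()
    for doc in docs:
        total.update(vectorize(doc))
    return total


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(hierarchy):
    """Build one centroid per leaf topic from the fetched snippets."""
    corpus = build_training_corpus(hierarchy)
    return {label: centroid(docs) for label, docs in corpus.items()}


def classify(model, text):
    """Assign the topic whose centroid is most similar to the text."""
    vec = vectorize(text)
    return max(model, key=lambda label: cosine(model[label], vec))


model = train(HIERARCHY)
label = classify(model, "the gene controls protein in the cell")
```

Replacing `fetch_snippets` with a call to a real search service is the only change needed to try the same flow on live data. The quality of the resulting classifier would then depend on how well the topic names work as queries.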