Machine Learning
A study of thresholding strategies for text categorization
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
RCV1: A New Benchmark Collection for Text Categorization Research
The Journal of Machine Learning Research
Hierarchical document categorization with support vector machines
Proceedings of the thirteenth ACM international conference on Information and knowledge management
Support vector machines classification with a very large-scale taxonomy
ACM SIGKDD Explorations Newsletter - Natural language processing and text mining
Incremental Algorithms for Hierarchical Classification
The Journal of Machine Learning Research
Pegasos: Primal Estimated sub-GrAdient SOlver for SVM
Proceedings of the 24th international conference on Machine learning
The Journal of Machine Learning Research
Wikify!: linking documents to encyclopedic knowledge
Proceedings of the sixteenth ACM conference on Conference on information and knowledge management
Boosting Inductive Transfer for Text Classification Using Wikipedia
ICMLA '07 Proceedings of the Sixth International Conference on Machine Learning and Applications
Wikipedia-Based Kernels for Text Categorization
SYNASC '07 Proceedings of the Ninth International Symposium on Symbolic and Numeric Algorithms for Scientific Computing
Introduction to Information Retrieval
Introduction to Information Retrieval
Building semantic kernels for text classification using wikipedia
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
LIBLINEAR: A Library for Large Linear Classification
The Journal of Machine Learning Research
Making more wikipedians: facilitating semantics reuse for wikipedia authoring
ISWC'07/ASWC'07 Proceedings of the 6th international The semantic web and 2nd Asian conference on Asian semantic web conference
Learning simple Wikipedia: a cogitation in ascertaining abecedarian language
CL&W '10 Proceedings of the NAACL HLT 2010 Workshop on Computational Linguistics and Writing: Writing Processes and Authoring Aids
Collective classification of congressional floor-debate transcripts
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Hi-index | 0.00 |
Wikipedia's article contents and its category hierarchy are widely used to produce semantic resources which improve performance on tasks like text classification and keyword extraction. The reverse -- using text classification methods for predicting the categories of Wikipedia articles -- has attracted less attention so far. We propose to "return the favor" and use text classifiers to improve Wikipedia. This could support the emergence of a virtuous circle between the wisdom of the crowds and machine learning/NLP methods. We define the categorization of Wikipedia articles as a multi-label classification task, describe two solutions to the task, and perform experiments that show that our approach is feasible despite the high number of labels.