Wikipedia provides an information quality assessment model with criteria that human peer reviewers use to identify featured articles. For the classification task "Is an article featured or not?" we present a machine learning approach that exploits an article's character trigram distribution. Our approach differs from existing research in that it targets writing style rather than meta features such as the edit history. The approach is robust, straightforward to implement, and outperforms existing solutions. We support these claims with an experimental design that analyzes, among other things, domain transferability. The F-measure achieved for featured articles is 0.964 within a single Wikipedia domain and 0.880 in a domain transfer situation.
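To make the idea of a character trigram distribution concrete, the following is a minimal sketch in Python. The abstract does not name the learning algorithm, so the nearest-centroid rule with cosine similarity used here is an assumption chosen only for illustration, and all function names (trigram_distribution, centroid, classify, f_measure) are hypothetical. The sketch also shows how an F-measure for the "featured" class could be computed.

```python
from collections import Counter
import math


def trigram_distribution(text):
    """Relative frequency of each overlapping character trigram in text."""
    counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse trigram distributions."""
    dot = sum(v * q.get(g, 0.0) for g, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def centroid(texts):
    """Average trigram distribution over a non-empty list of texts."""
    acc = Counter()
    for t in texts:
        for g, v in trigram_distribution(t).items():
            acc[g] += v
    return {g: v / len(texts) for g, v in acc.items()}


def classify(text, featured_centroid, other_centroid):
    """Assumed decision rule (not from the paper): nearest centroid by cosine."""
    d = trigram_distribution(text)
    return cosine(d, featured_centroid) >= cosine(d, other_centroid)


def f_measure(predicted, actual, positive=True):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p == positive and a == positive for p, a in pairs)
    fp = sum(p == positive and a != positive for p, a in pairs)
    fn = sum(p != positive and a == positive for p, a in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In such a setup, the two centroids would be built from featured and non-featured training articles, and a domain transfer experiment would build them from one Wikipedia domain and evaluate f_measure on articles from another.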