Information Systems and e-Business Management
Many decisions today are based on information found on the Web. For the most part, the disseminating sources are not certified, and hence assessing the quality and credibility of Web content has become more important than ever. We present factual density, a simple statistical quality measure based on facts extracted from Web content using Open Information Extraction. In a first case study, we use this measure to identify featured/good articles in Wikipedia. We compare the factual density measure with word count, a measure that has been applied successfully to this task in the past. Our evaluation corroborates the good performance of word count in Wikipedia, since featured/good articles are often longer than non-featured ones. However, for articles of similar lengths the word count measure fails, while factual density can separate them with an F-measure of 90.4%. We also investigate the use of relational features for categorizing Wikipedia articles into featured/good versus non-featured ones. With these features we achieve an F-measure of 86.7% for articles of similar lengths and 84% otherwise.
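To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the formula for factual density, so the sketch assumes it is the number of extracted facts divided by the number of words. The fact count is assumed to come from an Open Information Extraction system run separately, and the function names `word_count`, `factual_density`, and `f_measure` are hypothetical.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens; this is the baseline length measure."""
    return len(text.split())


def factual_density(num_facts: int, text: str) -> float:
    """Facts per word (ASSUMED definition; the abstract gives no formula).

    num_facts is the number of facts an Open IE system extracted from text.
    The extraction step itself is outside the scope of this sketch.
    """
    words = word_count(text)
    if words == 0:
        return 0.0  # avoid division by zero for empty documents
    return num_facts / words


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Under this assumed definition, two articles of similar length receive nearly identical word counts but can still differ in factual density, which is the situation the abstract describes as the one where word count fails.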