Automatically characterizing resource quality for educational digital libraries

Authors:
Steven Bethard;Philipp Wetzer;Kirsten Butcher;James H. Martin;Tamara Sumner
Affiliations:
University of Colorado, Boulder, CO, USA;University of Colorado, Boulder, CO, USA;University of Utah, Salt Lake City, UT, USA;University of Colorado, Boulder, CO, USA;University of Colorado, Boulder, CO, USA
Venue:
Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
Year:
2009

Abstract

With the rise of community-generated web content, the need for automatic characterization of resource quality has grown, particularly in the realm of educational digital libraries. We demonstrate how identifying concrete factors of quality for web-based educational resources can make machine learning approaches to automating quality characterization tractable. Using data from several previous studies of quality, we gathered a set of key dimensions and indicators of quality that were commonly identified by educators. We then performed a mixed-method study of digital library curation experts, showing that our characterization of quality captured the subjective processes used by the experts when assessing resource quality for classroom use. Using key indicators of quality selected from a statistical analysis of our expert study data, we developed a set of annotation guidelines and annotated a corpus of 1000 digital resources for the presence or absence of these key quality indicators. Agreement among annotators was high, and initial machine learning models trained from this corpus were able to identify some indicators of quality with as much as an 18% improvement over the baseline.