Crowdsourced comprehension: predicting prerequisite structure in Wikipedia

  • Authors:
  • Partha Pratim Talukdar; William W. Cohen

  • Affiliations:
  • Carnegie Mellon University; Carnegie Mellon University

  • Venue:
  • Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
  • Year:
  • 2012

Abstract

The growth of open-access technical publications and other open-domain textual information sources means that an increasing amount of online technical material is in principle available to all, but in practice incomprehensible to most. We propose to address the task of helping readers comprehend complex technical material by using statistical methods to model the "prerequisite structure" of a corpus --- i.e., the semantic impact of documents on an individual reader's state of knowledge. Experimental results using Wikipedia as the corpus suggest that this task can be approached by crowdsourcing the production of ground-truth labels regarding prerequisite structure, and then generalizing these labels with a learned classifier that combines signals of several kinds. The features we consider relate pairs of pages by analyzing not only the textual content of the pages, but also how the containing corpus is connected and created.
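To make the pairwise-classification idea concrete, here is a minimal sketch of scoring a candidate prerequisite relation between two Wikipedia pages. All names, features, and weights below are illustrative assumptions, not the paper's actual feature set or learned model; the paper trains a classifier on crowdsourced labels, whereas this toy uses hand-set weights over two plausible signals (a link-structure feature and a textual-mention feature).

```python
# Hypothetical sketch: score whether page `a` is a prerequisite for page `b`.
# Pages are plain dicts with illustrative keys: 'title', 'text', 'links'.
# The paper learns classifier weights from crowdsourced labels; the weights
# here are hand-set purely for demonstration.

def pair_features(a, b):
    """Features relating candidate prerequisite `a` to target page `b`."""
    return {
        # link-structure signal: the target page links to the candidate
        "b_links_to_a": 1.0 if a["title"] in b["links"] else 0.0,
        # textual signal: candidate's title is mentioned in the target's text
        "title_in_text": 1.0 if a["title"].lower() in b["text"].lower() else 0.0,
    }

def prereq_score(a, b, weights):
    """Linear score over pair features (a trained classifier in the paper)."""
    feats = pair_features(a, b)
    return sum(weights[name] * value for name, value in feats.items())

# Toy pages (contents invented for illustration)
graph_theory = {"title": "Graph theory", "text": "The study of graphs ...",
                "links": ["Set theory"]}
pagerank = {"title": "PageRank",
            "text": "PageRank is an algorithm grounded in graph theory ...",
            "links": ["Graph theory", "Markov chain"]}

weights = {"b_links_to_a": 1.0, "title_in_text": 0.5}  # illustrative, not learned
print(prereq_score(graph_theory, pagerank, weights))  # 1.5: both signals fire
```

In the paper's setting, the weights would instead be fit to crowdsourced ground-truth labels, and the feature set would combine richer textual, link-graph, and page-creation signals.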