A model for evaluating the quality of user-created documents

Authors:
Linh Hoang;Jung-Tae Lee;Young-In Song;Hae-Chang Rim
Affiliations:
Dept. of Computer and Radio Communications Engineering, Korea University, Seoul, Korea (all authors)
Venue:
AIRS'08 Proceedings of the 4th Asia information retrieval conference on Information retrieval technology
Year:
2008

Abstract

In this paper, we propose a model for evaluating the quality of general user-created documents. The model is based on a supervised classification approach, in which the classifier's output score is taken as the quality of a given document. To utilize both textual and non-textual attributes of documents, we incorporate a number of objectively measurable, real-valued features selected according to predefined quality criteria. Experiments on two datasets of real-world documents show that textual features are stable indicators of document quality. Some features are also inferred to be effective across general kinds of documents.
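
The idea of reading a classifier's output score as a quality estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names neither the classifier nor the features, so a plain logistic regression stands in for the supervised model, and the three features (log word count, average word length, log reply count) are hypothetical examples of "objectively measurable, real-valued" textual and non-textual attributes. The four labeled documents are invented toy data.

```python
import math

def extract_features(text, n_replies):
    """Hypothetical features: two textual, one non-textual (not from the paper)."""
    words = text.split()
    n_words = len(words)
    avg_word_len = sum(len(w) for w in words) / n_words if n_words else 0.0
    return [math.log1p(n_words), avg_word_len, math.log1p(n_replies)]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def quality_score(w, b, x):
    """Classifier output in [0, 1], interpreted as document quality."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train(X, y, lr=0.05, epochs=2000):
    """Logistic regression fitted with per-example gradient descent
    (an assumed stand-in for the paper's unspecified classifier)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = quality_score(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# Toy labeled documents (invented): (text, reply count, label), 1 = high quality.
docs = [
    ("This tutorial explains the installation procedure step by step "
     "with detailed examples", 12, 1),
    ("The review compares battery life, screen quality and overall "
     "performance thoroughly", 8, 1),
    ("lol ok", 0, 0),
    ("buy now cheap", 1, 0),
]
X = [extract_features(text, replies) for text, replies, _ in docs]
y = [label for _, _, label in docs]
w, b = train(X, y)

# Score an unseen document; the probability serves as its quality estimate.
new_doc = extract_features("A short note about setup and common errors", 3)
print(round(quality_score(w, b, new_doc), 3))
```

Because the score is a probability rather than a hard label, documents can be ranked by estimated quality instead of merely being sorted into two classes, which is the reading of "output scores" this sketch assumes.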