The collaborative efforts of users in social media services such as Wikipedia have led to an explosion in user-generated content, and how to automatically assess the quality of that content is now a pressing concern. Each article typically undergoes a series of revision phases, and articles of different quality classes exhibit distinct revision cycle patterns. We propose to Assess Quality based on Revision History (AQRH) for a specific domain as follows. First, we use a Hidden Markov Model (HMM) to turn each article's revision history into a sequence of revision states. Then, for each quality class, revision cycle patterns are extracted and clustered into quality corpora. Finally, an article's quality is gauged by probabilistically comparing its state sequence with the patterns of pre-classified documents. We conduct experiments on a set of Wikipedia articles, and the results demonstrate that our method can accurately and objectively capture web articles' quality.
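As a rough illustration of the general idea (not the authors' exact AQRH pipeline), the sketch below fits one Gaussian HMM per quality class on per-revision feature vectors and assigns a new article to the class whose model gives its revision sequence the highest log-likelihood. The function names, the choice of `hmmlearn`'s `GaussianHMM`, and the suggested revision features are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: per-class HMMs over revision-feature sequences, assuming
# each article is represented as an array of shape (n_revisions, n_features),
# e.g. edit size, time since previous edit, number of distinct editors.
import numpy as np
from hmmlearn import hmm


def fit_class_models(sequences_by_class, n_states=4, seed=0):
    """Fit one HMM per quality class.

    sequences_by_class: dict mapping a quality label (e.g. "featured", "stub")
    to a list of arrays, one array per article, each with one row per revision.
    """
    models = {}
    for label, seqs in sequences_by_class.items():
        X = np.vstack(seqs)               # concatenate all revision sequences
        lengths = [len(s) for s in seqs]  # remember each article's length
        model = hmm.GaussianHMM(n_components=n_states,
                                covariance_type="diag",
                                n_iter=100, random_state=seed)
        model.fit(X, lengths)
        models[label] = model
    return models


def classify_article(models, revision_features):
    """Return the quality class whose HMM best explains the revision sequence."""
    scores = {label: m.score(revision_features) for label, m in models.items()}
    return max(scores, key=scores.get)
```

Under these assumptions, the hidden states stand in for the paper's revision states, and the per-class likelihood comparison plays the role of matching an article's state sequence against the patterns of pre-classified documents.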