Text content reliability estimation in web documents: a new proposal

Authors:
Luis Sanz; Héctor Allende; Marcelo Mendoza
Affiliations:
Department of Informatics, Universidad Técnica Federico Santa María, Chile;Department of Informatics, Universidad Técnica Federico Santa María, Chile;Department of Informatics, Universidad Técnica Federico Santa María, Chile
Venue:
CICLing'12 Proceedings of the 13th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part II
Year:
2012

Abstract

This paper illustrates how a combination of information retrieval, machine learning, and NLP corpus annotation techniques was applied to the problem of estimating the reliability of text content in Web documents. Our proposal is based on a model in which reliability is a similarity measure between the content of a document and a knowledge corpus. The proposal includes a new text representation that uses entailment-based graphs. These graph-based representations then serve as training instances for a machine learning algorithm, which builds the reliability model. Experimental results illustrate the feasibility of our proposal through a comparison with a state-of-the-art method.