Exploring models for semantic category verification

Authors:
Dmitri Roussinov; Ozgur Turetken
Affiliations:
Department of Computer and Information Sciences, University of Strathclyde, L13.29 Livingstone Tower, 16 Richmond Street, Glasgow G1 1XQ, United Kingdom; Institute of Innovation and Technology Management, Ted Rogers School of Information Technology Management, Ryerson University, 575 Bay Street, Toronto, Ont., Canada M5G 2C5
Venue:
Information Systems
Year:
2009

Abstract

Many artificial intelligence tasks, such as automated question answering, reasoning, or heterogeneous database integration, involve verifying a semantic category (e.g., "coffee" is a drink and "red" is a color, while "steak" is not a drink and "big" is not a color). In this research, we explore completely automated, on-the-fly verification of membership in an arbitrary category that has not been anticipated a priori. Our approach does not rely on any manually codified knowledge (such as WordNet or Wikipedia) but instead capitalizes on the diversity of topics and word usage on the World Wide Web; it can thus be considered "knowledge-light" and complementary to "knowledge-intensive" approaches. We have created a quantitative verification model and established (1) which specific variables are important and (2) what ranges and upper limits of accuracy are attainable. Although our semantic verification algorithm is entirely self-contained (it does not involve any previously reported components that are beyond the scope of this paper), we have tested it empirically within our fact-seeking engine on the well-known TREC conference test questions. With our implementation of semantic verification, answer accuracy improved by up to 16%, depending on the specific models and metrics used.