Scalable semantic annotation of text using lexical and web resources

Authors:
Elias Zavitsanos;George Tsatsaronis;Iraklis Varlamis;Georgios Paliouras
Affiliations:
Institute of Informatics & Telecommunications, NCSR “Demokritos”;Department of Computer and Information Science, Norwegian University of Science and Technology;Department of Informatics and Telematics, Harokopio University;Institute of Informatics & Telecommunications, NCSR “Demokritos”
Venue:
SETN'10 Proceedings of the 6th Hellenic conference on Artificial Intelligence: theories, models and applications
Year:
2010


Abstract

In this paper we address the task of adding domain-specific semantic tags to a document, based solely on the domain ontology and generic lexical and Web resources. In this manner, we avoid the need for trained domain-specific lexical resources, which hinder the scalability of semantic annotation. More specifically, the proposed method maps the content of the document to concepts of the ontology, using the WordNet lexicon and Wikipedia. The method comprises a novel combination of measures of semantic relatedness and word sense disambiguation techniques to identify the most related ontology concepts for the document. We test the method on two case studies: (a) a set of summaries accompanying environmental news videos, and (b) a set of medical abstracts. The results in both cases show that the proposed method achieves reasonable performance, thus pointing to a promising path for scalable semantic annotation of documents.
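
The general idea of mapping document content to ontology concepts through a relatedness measure can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the measures used, so the lexicon (`TOY_LEXICON`), the overlap-based `toy_relatedness` function, the averaging scheme in `rank_concepts`, and all example terms are hypothetical stand-ins for a WordNet- or Wikipedia-backed measure. Word sense disambiguation is omitted entirely.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical stand-in for a generic lexical resource: each word maps to a
# set of associated words. A real system would query WordNet or Wikipedia.
TOY_LEXICON: Dict[str, Set[str]] = {
    "flood": {"water", "river", "rain", "disaster"},
    "river": {"water", "flood", "stream"},
    "rainfall": {"rain", "water", "weather"},
    "hydrology": {"water", "river", "rain", "stream"},
    "air_quality": {"smog", "emission", "air"},
}


def toy_relatedness(a: str, b: str) -> float:
    """Jaccard overlap of the two words' association sets (assumed measure)."""
    if a == b:
        return 1.0
    set_a = TOY_LEXICON.get(a, set()) | {a}
    set_b = TOY_LEXICON.get(b, set()) | {b}
    return len(set_a & set_b) / len(set_a | set_b)


def rank_concepts(
    terms: List[str],
    concepts: List[str],
    relatedness: Callable[[str, str], float] = toy_relatedness,
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each ontology concept by its mean relatedness to the document terms."""
    scored = []
    for concept in concepts:
        if terms:
            score = sum(relatedness(t, concept) for t in terms) / len(terms)
        else:
            score = 0.0
        scored.append((concept, score))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


if __name__ == "__main__":
    document_terms = ["flood", "river", "rainfall"]
    ontology_concepts = ["hydrology", "air_quality"]
    for concept, score in rank_concepts(document_terms, ontology_concepts):
        print(f"{concept}: {score:.3f}")
```

Because `relatedness` is passed in as a parameter, the toy measure could be swapped for one backed by a real lexical or Web resource without changing the ranking logic, which mirrors the abstract's point that no domain-specific trained resource is required.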