An automatic approach for ontology-based feature extraction from heterogeneous textual resources

Authors:
Carlos Vicient; David Sánchez; Antonio Moreno
Affiliations:
Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili. Av. Països Catalans, 26, 4300 ...; Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili. Av. Països Catalans, 26, 4300 ...; Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili. Av. Països Catalans, 26, 4300 ...
Venue:
Engineering Applications of Artificial Intelligence
Year:
2013

Abstract

Data mining algorithms such as data classification or clustering methods exploit features of entities to characterise, group or classify them according to their resemblance. In the past, many feature extraction methods focused on the analysis of numerical or categorical properties. In recent years, motivated by the success of the Information Society and the WWW, which has made available enormous amounts of textual electronic resources, researchers have proposed semantic data classification and clustering methods that exploit textual data at a conceptual level. To do so, these methods rely on pre-annotated inputs in which text has been mapped to its formal semantics according to one or several knowledge structures (e.g. ontologies, taxonomies). Hence, they are hampered by the bottleneck introduced by the manual semantic mapping process. To tackle this problem, this paper presents a domain-independent, automatic and unsupervised method to detect relevant features from heterogeneous textual resources and associate them with concepts modelled in a background ontology. The method has been applied to raw text resources and also to semi-structured ones (Wikipedia articles). It has been tested in the Tourism domain, showing promising results.
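
As a rough illustration of what associating text with concepts in a background ontology can mean, the sketch below matches surface labels in a text against a tiny hand-made Tourism taxonomy. It is not the method of the paper: the ontology contents, the names `TOY_ONTOLOGY`, `extract_features` and `generalize`, and the plain label-matching strategy are all assumptions made for this example.

```python
import re

# Hypothetical toy ontology (an assumption for illustration only):
# concept -> (parent concept or None, set of surface labels).
TOY_ONTOLOGY = {
    "Accommodation": (None, {"accommodation", "lodging"}),
    "Hotel": ("Accommodation", {"hotel", "hotels"}),
    "CulturalSite": (None, {"cultural site"}),
    "Museum": ("CulturalSite", {"museum", "museums"}),
    "Beach": (None, {"beach", "beaches"}),
}


def extract_features(text, ontology=TOY_ONTOLOGY):
    """Return the sorted ontology concepts whose labels occur in the text."""
    # Lowercase and keep only alphabetic tokens, joined by single spaces.
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    # Pad with spaces so labels match whole words (or whole word sequences).
    padded = f" {normalized} "
    found = set()
    for concept, (_parent, labels) in ontology.items():
        if any(f" {label} " in padded for label in labels):
            found.add(concept)
    return sorted(found)


def generalize(concept, ontology=TOY_ONTOLOGY):
    """Walk up the taxonomy from a concept to its root."""
    path = [concept]
    while ontology[path[-1]][0] is not None:
        path.append(ontology[path[-1]][0])
    return path


if __name__ == "__main__":
    sample = "The city has two museums and a sandy beach near the hotel."
    for concept in extract_features(sample):
        print(concept, "->", generalize(concept))
```

Once features are expressed as concepts, a clustering or classification method can compare entities through the taxonomy (for example, treating a hotel and other lodging as related through `Accommodation`) instead of comparing raw strings.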