A methodology to learn ontological attributes from the Web

Authors:
David Sánchez
Affiliations:
Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d'Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Països Catalans, 26. 43 ...
Venue:
Data & Knowledge Engineering
Year:
2010

Abstract

Class descriptors such as attributes, features or meronyms are rarely considered when developing ontologies; even WordNet includes only a limited number of part-of relationships. However, these data are crucial for defining concepts, such as those considered in classical knowledge representation models. Some attempts have been made to extract these relations from text using general meronymy detection patterns, but very little work has addressed learning expressive class attributes (including their associated domain, range or data values) at an ontological level. Taking this background into account, this paper proposes and implements an automatic, unsupervised and domain-independent methodology that extends ontological classes by learning concept attributes, data types, value ranges and measurement units. To offer a general solution and minimize the data sparseness of pattern-based approaches, the Web is used as a massive learning corpus, both to retrieve data and to infer information distribution, through highly contextualized queries aimed at improving the quality of the results. This corpus is also updated automatically and adaptively according to the knowledge already acquired and the learning throughput. Results were checked manually through an expert-based, concept-by-concept evaluation for several well-distinguished domains, showing reliable results and reasonable learning performance.
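
To make the idea of pattern-based attribute learning more concrete, the following is a minimal Python sketch of how candidate attributes for a concept might be collected from text snippets. It is not the paper's implementation: the two lexical patterns, the function names `build_patterns` and `extract_attribute_candidates`, and the sample snippets are all assumptions made for illustration, and the snippets stand in for text that a real system would retrieve from Web search results.

```python
import re
from collections import Counter


def build_patterns(concept):
    """Return illustrative lexical patterns for a concept.

    These two patterns are hypothetical examples; the abstract does not
    list the patterns actually used by the methodology.
    """
    c = re.escape(concept)
    return [
        # "the <attribute> of the/a/an <concept>"
        re.compile(rf"\bthe (\w+) of (?:the|a|an) {c}\b", re.IGNORECASE),
        # "<concept>'s <attribute>"
        re.compile(rf"\b{c}'s (\w+)\b", re.IGNORECASE),
    ]


def extract_attribute_candidates(concept, snippets):
    """Count how often each candidate attribute matches a pattern."""
    counts = Counter()
    for pattern in build_patterns(concept):
        for snippet in snippets:
            for match in pattern.findall(snippet):
                counts[match.lower()] += 1
    return counts


# Made-up snippets standing in for text retrieved from the Web.
snippets = [
    "The price of the car was higher than expected.",
    "We measured the weight of a car in kilograms.",
    "The car's price depends on the model.",
]
candidates = extract_attribute_candidates("car", snippets)
print(candidates.most_common())
```

In this sketch the raw match count is only a crude relevance signal. A Web-scale approach would more plausibly rank candidates using statistics over a much larger set of retrieved documents, which is the role the abstract assigns to the Web as a learning corpus.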