The design and construction of lexical resources is a critical issue in Natural Language Processing (NLP). Real-world NLP systems need large-scale lexica that provide rich information about words and word senses at every level: morphological, syntactic, lexical-semantic, and so on. The construction of such resources, however, is a difficult and costly task. The last decade has been strongly influenced by the notion of reusability, that is, exploiting the information in existing lexical resources when constructing new ones. It is unrealistic, however, to expect that the great variety of available lexical information resources will be converted into a single, standard representation schema in the near future. The purpose of this article is to present the ELHISA system, a software architecture for the integration of heterogeneous lexical information. We address, from the point of view of the information integration field, the problem of querying very different existing lexical information sources using a single, common query language. Integration in ELHISA is performed at the logical level, so the lexical resources are not modified in any way when they are incorporated into the system. ELHISA is primarily defined as a consultation system for accessing structured lexical information, and therefore it does not have the capability to modify or update the underlying information. To this end, a General Conceptual Model (GCM) for describing diverse lexical data has been devised. The GCM establishes a fixed vocabulary describing the objects in the lexical information domain, their attributes, and the relationships among them. To integrate a lexical resource into the federation, a Source Conceptual Model (SCM) is built on top of it, representing the lexical objects occurring in that particular source.
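The GCM/SCM split described above can be illustrated with a minimal sketch. All class, field, and resource names here are invented for the example and are not taken from ELHISA itself; the point is only that the SCM wrapper maps source-specific fields onto the GCM's fixed vocabulary without modifying the source data.

```python
# Hypothetical sketch of a GCM entry and an SCM wrapper for one source.
from dataclasses import dataclass, field

@dataclass
class GCMEntry:
    """A lexical entry in the General Conceptual Model's fixed vocabulary."""
    lemma: str
    pos: str                       # part of speech
    senses: list = field(default_factory=list)

class ToySourceSCM:
    """Source Conceptual Model built on top of one concrete resource
    (here, toy (word, category, gloss) rows); the source is read-only."""
    def __init__(self, source_rows):
        self.rows = source_rows

    def to_gcm(self):
        # Map source-specific fields onto GCM objects and attributes.
        entries = {}
        for word, category, gloss in self.rows:
            entry = entries.setdefault(
                (word, category), GCMEntry(lemma=word, pos=category))
            entry.senses.append(gloss)
        return list(entries.values())

rows = [("bank", "noun", "financial institution"),
        ("bank", "noun", "river edge")]
entries = ToySourceSCM(rows).to_gcm()
# One GCM entry for "bank" (noun) with two senses.
```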
To answer user queries, ELHISA must access the integrated resources and therefore translate each query expressed in GCM terms into queries formulated in terms of the SCM of each source. The relation between the GCM and the SCMs is described explicitly by means of mapping rules called Content Description Rules. Data integration at the extensional level is achieved through a data cleansing process, which is needed in order to compare data arriving from different sources; it is in this process that the object identification step is carried out. Based on this architecture, a prototype named ELHISA has been built, and five resources covering a broad scope have so far been integrated into it for testing purposes. The ease with which such heterogeneous resources have been integrated shows, in the authors' opinion, the suitability of the approach taken.
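The query translation step can be sketched as a simple rewriting driven by per-source mapping rules. The rule tables and field names below are purely illustrative stand-ins for ELHISA's Content Description Rules, assumed here to map each GCM attribute to the corresponding field name in a source's SCM.

```python
# Illustrative sketch: rewriting a GCM-level query into per-source SCM
# queries via mapping rules. All source and field names are invented.
CONTENT_DESCRIPTION_RULES = {
    "source_a": {"lemma": "word",         "pos": "category"},
    "source_b": {"lemma": "lexical_unit", "pos": "unit_pos"},
}

def translate(gcm_query, source):
    """Rewrite a GCM query (attribute -> value) into one source's SCM terms.

    Attributes the source does not describe are simply dropped, since that
    source cannot constrain them.
    """
    rules = CONTENT_DESCRIPTION_RULES[source]
    return {rules[attr]: value
            for attr, value in gcm_query.items() if attr in rules}

q = {"lemma": "run", "pos": "verb"}
q_a = translate(q, "source_a")   # {"word": "run", "category": "verb"}
q_b = translate(q, "source_b")   # {"lexical_unit": "run", "unit_pos": "verb"}
```

A real mediator would also handle attributes a source cannot express (e.g. by post-filtering the merged results during data cleansing), but the dictionary rewrite captures the intensional translation step.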