Developing Document Analysis and Data Extraction Tools for Entity Modelling

Authors:
Heather Fulford
Venue:
NLDB '00 Proceedings of the 5th International Conference on Applications of Natural Language to Information Systems-Revised Papers
Year:
2000

Abstract

The entity-relationship approach to conceptual modelling for database design conventionally begins with the analysis of natural language system specifications to identify entities, attributes, and relationships, in preparation for creating entity models represented as entity-relationship diagrams. This document-scanning task can be both time-consuming and complex, often requiring linguistic knowledge, subject domain knowledge, judgement, and intuition. To help alleviate this burden, we present some of our research into the development of tools for analysing natural language specifications and extracting candidate entities, attributes, and relationships. Drawing on research in corpus linguistics and terminology science, our approach relies on an examination of patterns of word co-occurrence and the use of "linguistic cues". We indicate how we intend to integrate our tools into a CASE environment to support database designers during each stage of their work, from the analysis of system specifications through to code generation.
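
The abstract does not give the actual cue inventory or co-occurrence measure, so the following is only a minimal Python sketch of the general idea, not the author's tools. The cue patterns ("each X has a Y", "each X <verb> one or more Y"), the stopword list, the naive plural stripping, and the function names are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical "linguistic cue" patterns, invented for this sketch.
ATTRIBUTE_CUE = re.compile(
    r"\b(?:each|every)\s+(\w+)\s+has\s+(?:a|an)\s+(\w+)", re.IGNORECASE)
RELATIONSHIP_CUE = re.compile(
    r"\b(?:each|every)\s+(\w+)\s+(\w+)\s+(?:one or more|many|several)\s+(\w+)",
    re.IGNORECASE)

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "each", "every", "has", "is", "of", "or", "and"}


def _singular(word):
    """Very naive plural stripping, good enough only for this toy example."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and not word.endswith("ss") else word


def extract_candidates(text):
    """Return candidate entities, attributes, and relationships found by cues."""
    entities, attributes, relationships = set(), [], []
    for entity, attribute in ATTRIBUTE_CUE.findall(text):
        entities.add(_singular(entity))
        attributes.append((_singular(entity), attribute.lower()))
    for subject, verb, obj in RELATIONSHIP_CUE.findall(text):
        entities.update({_singular(subject), _singular(obj)})
        relationships.append((_singular(subject), verb.lower(), _singular(obj)))
    return sorted(entities), attributes, relationships


def cooccurrence_counts(text):
    """Count adjacent pairs of non-stopword tokens within each sentence."""
    counts = Counter()
    for sentence in text.lower().split("."):
        tokens = re.findall(r"[a-z]+", sentence)
        for left, right in zip(tokens, tokens[1:]):
            if left not in STOPWORDS and right not in STOPWORDS:
                counts[(left, right)] += 1
    return counts


if __name__ == "__main__":
    spec = ("Each customer has a name. "
            "Each customer places one or more orders. "
            "Every order has a date.")
    print(extract_candidates(spec))
    print(cooccurrence_counts("The purchase order lists items. "
                              "Each purchase order has a date.").most_common(3))
```

Frequently recurring pairs from `cooccurrence_counts` could be reviewed as candidate multiword terms, while the cue matches give a first list of candidates for a designer to accept or reject; neither step replaces the judgement the abstract describes.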