A Web Information Extraction System to DB Prototyping

  • Authors:
  • P. Moreda; Rafael Muñoz; Patricio Martínez-Barco; Cristina Cachero; Manuel Palomar

  • Venue:
  • NLDB '02 Proceedings of the 6th International Conference on Applications of Natural Language to Information Systems-Revised Papers
  • Year:
  • 2002

Abstract

Database prototyping is a technique widely used both to validate user requirements and to verify certain application functionality. These tasks usually require populating the underlying data structures with sample data that may additionally need to satisfy certain restrictions. Although some existing approaches already automate this population task by means of random data generation, the lack of semantic meaning in the resulting structures may interfere with both user validation and designer verification.

To solve this problem and make the resulting prototypes more intuitive, this paper presents a population system that, starting from the information contained in a UML-compliant Domain Conceptual Model, applies Information Extraction techniques to compile meaningful information sets from texts available on the Internet. The system is based on the semantic information extracted from the EWN lexical resource and includes, among other features, a named entity recognition system and an ontology, which speed up the prototyping process and improve the quality of the sample data.