A UML profile for the conceptual modelling of structurally complex data: Easing human effort in the KDD process

  • Authors:
  • Juan A. Lara;David Lizcano;María A. Martínez;Juan Pazos;Teresa Riera

  • Affiliations:
  • Open University of Madrid, UDIMA - Facultad de Enseñanzas Técnicas, Ctra. De la Coruña, Km 38.500 - Vía de Servicio 15 - 28400, Collado Villalba, Madrid, Spain;Open University of Madrid, UDIMA - Facultad de Enseñanzas Técnicas, Ctra. De la Coruña, Km 38.500 - Vía de Servicio 15 - 28400, Collado Villalba, Madrid, Spain;Open University of Madrid, UDIMA - Facultad de Enseñanzas Técnicas, Ctra. De la Coruña, Km 38.500 - Vía de Servicio 15 - 28400, Collado Villalba, Madrid, Spain;Technical University of Madrid, School of Computer Science, Campus de Montegancedo, s/n - 28660, Boadilla del Monte, Madrid, Spain;Universidad de las Islas Baleares, Departamento de Matemáticas e Informática, Edificio Anselm Turmeda, Crta. Valldemossa, Km 7.5 - 07122, Palma de Mallorca, Spain

  • Venue:
  • Information and Software Technology
  • Year:
  • 2014

Abstract

Context: A growing number of domains involve data whose complex structure calls for new approaches to knowledge discovery. In such domains, the information related to each object under analysis may consist of a very broad set of interrelated data rather than a simple attribute table, which further complicates analysis. Objective: It is increasingly necessary to model data before analysis to ensure that they are properly understood, stored and later processed. To this end, we propose a UML extension that can represent any set of structurally complex, hierarchically ordered data. Conceptually modelled data are comprehensible to humans and constitute the starting point for automating other data analysis tasks, such as comparing items or generating reference models. Method: The proposed notation has been applied to structurally complex data from stabilometry, a medical discipline concerned with human balance. We organized the modelled data through an implementation based on XML syntax. Results: We applied data mining techniques to the resulting structured data for knowledge discovery. The sound results of modelling a domain with such complex and wide-ranging data confirm the utility of the approach. Conclusion: The conceptual modelling and analysis of non-conventional data are important challenges. We have proposed a UML profile that has been tested on data from a medical domain with very satisfactory results. The notation is useful for understanding domain data and automating knowledge discovery tasks.
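
The abstract mentions hierarchically ordered data organized through XML syntax. As a rough illustration only, the sketch below shows how such a hierarchy might be serialized and traversed. The element names (`patient`, `test`, `trial`, `series`), the attribute names, and the numeric values are all hypothetical placeholders chosen for this example; they are not the notation or the data defined by the paper's UML profile.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every element name, attribute and value here is an
# illustrative assumption, not the schema or data used by the authors.
SAMPLE = """
<patient id="p001">
  <test name="test_A">
    <trial number="1">
      <series name="channel_x">0.12 0.15 0.11</series>
      <series name="channel_y">0.40 0.38 0.41</series>
    </trial>
    <trial number="2">
      <series name="channel_x">0.10 0.09 0.13</series>
    </trial>
  </test>
</patient>
"""


def outline(element, depth=0):
    """Return one indented line per element, making the hierarchy visible."""
    label = element.tag
    ident = element.get("id") or element.get("name") or element.get("number")
    if ident:
        label += f" [{ident}]"
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


def series_values(root):
    """Map each (test, trial, series) path to its list of numeric samples."""
    result = {}
    for test in root.findall("test"):
        for trial in test.findall("trial"):
            for series in trial.findall("series"):
                key = (test.get("name"), trial.get("number"), series.get("name"))
                result[key] = [float(v) for v in series.text.split()]
    return result


root = ET.fromstring(SAMPLE.strip())
print("\n".join(outline(root)))
values = series_values(root)
```

Flattening the tree into path-keyed series, as `series_values` does, is one possible starting point for tasks such as comparing items, since two objects can then be compared series by series along matching paths.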