Rough sets and FCA --- scalability challenges

  • Authors:
  • Dominik Ślęzak

  • Affiliations:
  • Institute of Mathematics, University of Warsaw, Warsaw, Poland; Infobright Inc., Warsaw, Poland

  • Venue:
  • ICFCA'12 Proceedings of the 10th international conference on Formal Concept Analysis
  • Year:
  • 2012

Abstract

Rough Sets (RS) [1,2,3] and Formal Concept Analysis (FCA) [4,5] provide foundations for a number of methods useful in data mining and knowledge discovery at different stages of data preprocessing, classification and representation. RS and FCA are often applied together with other techniques in order to cope with real-world challenges. It is therefore important to investigate ways of extending RS/FCA notions and algorithms so that they can handle truly large and complex data. This talk attempts to categorize some ideas on how to scale RS and FCA methods with respect to the number of objects and attributes, as well as the types and cardinalities of attribute values. We discuss the use of analytical database engines [6] and randomized heuristics [7] to compute approximate yet meaningful results. We also discuss differences and similarities in the algorithmic bottlenecks of RS and FCA, illustrating that these approaches should be regarded as complementary rather than competing methodologies. As a case study, we consider the tasks of data analysis and knowledge representation arising within a research project that aims to enhance semantic search of diverse types of content in a large repository of scientific articles [8].