Browsing Semi-structured Web Texts Using Formal Concept Analysis

  • Authors:
  • Richard Cole; Peter W. Eklund

  • Venue:
  • ICCS '01 Proceedings of the 9th International Conference on Conceptual Structures: Broadening the Base
  • Year:
  • 2001

Abstract

Query-directed browsing of unstructured Web-texts using Formal Concept Analysis (FCA) confronts two problems. First, online Web-data is sometimes unstructured, so any FCA-system must include additional mechanisms to structure its input sources. Second, many online collections are large and dynamic, so a Web-robot must be used to extract data automatically. This paper addresses both issues. We report on the construction of a Web-based FCA system for browsing classified advertisements for real-estate properties. Real-estate advertisements were chosen because they are typical of the semi-structured textual information sources accessible on the Web. Furthermore, the analysis of real-estate data using FCA is a classic example used in introductory courses on FCA. However, unlike the classic FCA real-estate example, whose input is a structured relational database, we automatically mine Web-based texts for their structure.
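
To make the idea concrete, the following is a minimal sketch, not the authors' system, of the two steps the abstract describes: deriving structure from advertisement text and building formal concepts from it. The advertisement texts, the attribute names and the regular expressions are all hypothetical, and the brute-force concept enumeration is chosen only for brevity; it is suitable for toy inputs, not for large collections.

```python
import re
from itertools import combinations

# Hypothetical advertisement texts, invented for illustration only.
ADS = {
    "ad1": "Sunny 2 bedroom unit, garage, close to beach",
    "ad2": "3 bedroom house with garage and pool",
    "ad3": "2 bedroom unit near beach",
}

# Hypothetical attribute patterns; a real system would need far richer parsing.
PATTERNS = {
    "2br": re.compile(r"\b2 bedroom\b"),
    "3br": re.compile(r"\b3 bedroom\b"),
    "garage": re.compile(r"\bgarage\b"),
    "beach": re.compile(r"\bbeach\b"),
    "pool": re.compile(r"\bpool\b"),
}


def build_context(ads, patterns):
    """Map each advertisement to the set of attributes found in its text."""
    return {
        name: frozenset(a for a, p in patterns.items() if p.search(text.lower()))
        for name, text in ads.items()
    }


def intent(objects, context, all_attrs):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(all_attrs)
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def extent(attrs, context):
    """Objects that have every attribute in `attrs`."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def concepts(context, all_attrs):
    """Enumerate all formal concepts by closing every subset of objects."""
    found = set()
    objs = list(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            shared = intent(subset, context, all_attrs)
            found.add((extent(shared, context), shared))
    return found


CONTEXT = build_context(ADS, PATTERNS)
LATTICE = concepts(CONTEXT, PATTERNS.keys())

if __name__ == "__main__":
    # Print concepts from the most general (largest extent) downwards.
    for ext, itt in sorted(LATTICE, key=lambda c: (-len(c[0]), sorted(c[0]))):
        print(sorted(ext), sorted(itt))
```

Each resulting pair is a set of advertisements together with the attributes they all share. Browsing then amounts to moving between such pairs, for example narrowing from the advertisements that mention a garage to those that also mention a beach.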