Acquisition of Soft Taxonomies for Intelligent Personal Hierarchies and the Soft Semantic Web

  • Authors:
  • T. P. Martin; B. Azvine

  • Venue:
  • BT Technology Journal
  • Year:
  • 2003

Abstract

Information overload is a problem at both the individual and the corporate level. Many solutions have been proposed, including knowledge management, data warehouses, service directories and digital libraries. The semantic Web aims to unify many of these approaches by appropriate markup and agreement on the meaning of the markup. At the individual's level, these techniques partially solve the problem by classifying documents within hierarchical structures and enabling searching and browsing of the documents. However, they also contribute to the problem, as there is no unique categorisation and access structure that suits every individual. Finding the right document becomes a two-stage process: first find the right place in the categorisation scheme, then find the document within that class.

In addition to enterprise-wide sources, individual information sources include e-mails, electronic documents in many formats, personal and group filespaces, notes, diary entries, etc. These are unlikely to conform to the enterprise categorisation but form useful resources nevertheless.

The idea of an intelligent personal hierarchy for information (iPHI) is to auto-configure access to multiple sources of information based on personal categories. This entails fuzzy matching of metadata structure as well as content. Metadata is a powerful tool in intelligent information management; however, it is not necessarily uniform, either in label or in content. One document's ‘author’ is another's ‘creator’; ‘John Smith’, ‘Smith, John’ and ‘J.Smith’ all refer to the same individual but are syntactically different.

Fusion (or intelligent integration) of information takes place in an environment where the data may be of varying quality, and some may be incomplete or inconsistent. Combining metadata (and the associated data) is not possible without knowing (or learning) the mappings between their ontologies. Such mappings are likely to be soft, i.e. approximate, because different sources arise from different designers with different world views. Soft computing is vital to tackle these problems. Frequently, data sources are organised implicitly, according to an internal ontology or taxonomy. Knowing this ontology or taxonomy is a necessary first step to using it in the fusion process. The work described in this paper extracts the implicit taxonomy and enables a user's interaction with the data (e.g. searching) to be expressed in their preferred terms rather than those used by the system.
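
The name-variant example in the abstract (‘John Smith’, ‘Smith, John’, ‘J.Smith’) can be illustrated with a minimal sketch of soft matching on metadata content. This is not the method described in the paper: the function names `name_key` and `soft_match`, the normalisation rules and the membership degrees (1.0, 0.8, 0.5, 0.0) are all assumptions chosen purely for illustration.

```python
import re


def name_key(name: str) -> tuple[str, str]:
    """Reduce a personal name to (surname, given names), lower-cased.

    Hypothetical normalisation: handles 'John Smith', 'Smith, John'
    and 'J.Smith' style variants only.
    """
    name = name.strip().lower()
    if "," in name:
        # 'smith, john' -> surname before the comma, given names after it
        surname, given = (part.strip() for part in name.split(",", 1))
    else:
        # 'john smith' or 'j.smith' -> the last token is taken as the surname
        tokens = [t for t in re.split(r"[.\s]+", name) if t]
        surname, given = (tokens[-1], " ".join(tokens[:-1])) if tokens else ("", "")
    return surname, given.rstrip(".")


def soft_match(a: str, b: str) -> float:
    """Return an illustrative degree in [0, 1] that a and b name the same person."""
    (sa, ga), (sb, gb) = name_key(a), name_key(b)
    if not sa or sa != sb:
        return 0.0  # different (or missing) surnames: no match
    if not ga or not gb:
        return 0.5  # surname only: weak evidence (assumed value)
    if ga == gb:
        return 1.0  # identical after normalisation
    if ga[0] == gb[0] and (len(ga) == 1 or len(gb) == 1):
        return 0.8  # an initial compatible with a full given name (assumed value)
    return 0.0


for other in ("Smith, John", "J.Smith", "Jane Doe"):
    print(other, soft_match("John Smith", other))
```

Returning a degree rather than a yes/no answer reflects the abstract's point that such mappings are approximate: ‘J.Smith’ is compatible with ‘John Smith’ but could equally denote a different person, so it receives a lower degree than an exact match after normalisation.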