Knowledge discovery in web-directories: finding term-relations to build a business ontology

  • Authors:
  • Sandip Debnath;Tracy Mullen;Arun Upneja;C. Lee Giles

  • Affiliations:
  • Department of Computer Sciences and Engineering;School of Information Sciences and Technology, The Pennsylvania State University, University Park, PA;School of Hotel, Restaurant and Recreation Management;Department of Computer Sciences and Engineering

  • Venue:
  • EC-Web'05 Proceedings of the 6th international conference on E-Commerce and Web Technologies
  • Year:
  • 2005

Abstract

The Web continues to grow at a tremendous rate, and search engines find it increasingly difficult to provide useful results. To manage this explosively growing number of Web documents, automatically clustering documents and organising them into domain-dependent directories has become very popular. In most cases, these directories represent a hierarchical structure of categories and sub-categories for domains and sub-domains. To populate these directories with instances, individual documents are automatically analysed and placed into them according to their relevance. Though individual documents in these collections may not be ranked efficiently, taken together they provide an excellent knowledge source for facilitating ontology construction in that domain. In (mainly automatic) ontology construction steps, we need to find and use relevant knowledge for a particular subject or term. News documents provide an excellent source of relevant and up-to-date knowledge. In this paper, we focus our attention on building business ontologies. To do that, we use news documents from business domains to obtain up-to-date knowledge about a particular company. To extract this knowledge in the form of important “terms” related to the company, we apply a novel method to find “related terms” given the company name. We show by examples that our technique can be successfully used to find “related terms” in similar cases.