Crawling and Extracting Process Data from the Web

  • Authors:
  • Yaling Liu; Arvin Agah

  • Affiliations:
  • Department of Electrical Engineering & Computer Science, The University of Kansas, Lawrence, USA 66045-7621 (both authors)

  • Venue:
  • ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
  • Year:
  • 2009

Abstract

In this paper, we address the design and implementation of a supporting system for process-based searches. This system can efficiently crawl the Web and extract processes from the retrieved data. The extracted processes can then be used by a Process-Based Search Engine (PBSE). In this work, a process is defined as a sequence of activities for achieving a goal. A PBSE uses the extracted processes to transform an original query into multiple sub-queries and then performs a keyword search for each sub-query. Effective process-based search requires a large number of high-quality processes, so this paper focuses on how to build a database of processes efficiently and effectively by exploring the Web.
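
The query transformation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Process` class, the `transform_query` and `process_based_search` functions, the exact-match lookup on the goal, and the choice to use each activity directly as a sub-query are all assumptions made for illustration. The keyword search backend is passed in as a hypothetical callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Process:
    """A process as defined in the abstract: a goal plus an ordered
    sequence of activities for achieving it. (Hypothetical structure.)"""
    goal: str
    activities: List[str]


def transform_query(query: str, processes: List[Process]) -> List[str]:
    """Transform a query into one sub-query per activity.

    Assumption: a process matches when its goal equals the query,
    ignoring case and surrounding whitespace. A real system would
    likely need a more tolerant matching step.
    """
    normalized = query.strip().lower()
    for process in processes:
        if process.goal.strip().lower() == normalized:
            # Assumption: each activity is used directly as a sub-query.
            return list(process.activities)
    # No matching process: fall back to the original query.
    return [query]


def process_based_search(
    query: str,
    processes: List[Process],
    keyword_search: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Run the supplied keyword search once per sub-query and return
    the results keyed by sub-query, in activity order."""
    return {
        sub_query: keyword_search(sub_query)
        for sub_query in transform_query(query, processes)
    }
```

With a process database containing the goal "apply for a passport" and three activities, a call to `process_based_search("apply for a passport", db, search_fn)` would issue three keyword searches, one per activity, instead of a single search on the original query.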