Web applications such as deep Web crawlers, metasearch engines, and other Web mining systems often need to extract the result records displayed on the response pages that search engines return for submitted queries. Extracting these records is challenging because search engines display their records in heterogeneous ways. In addition, the response pages returned by many search engines contain noisy content, such as advertisements and suggestion links, which makes the extraction task even more complicated. In this paper, we propose a highly effective and efficient algorithm for automatically mining result records from search engine response pages.