Discovering unexpected information from your competitors' web sites

Authors:
Bing Liu;Yiming Ma;Philip S. Yu
Affiliations:
School of Computing, National University of Singapore, Singapore 117543;School of Computing, National University of Singapore, Singapore 117543;IBM T. J. Watson Research Center, Yorktown Heights, NY
Venue:
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2001


Abstract

Ever since the beginning of the Web, finding useful information on the Web has been an important problem. Existing approaches include keyword-based search, wrapper-based information extraction, Web query and user preferences. These approaches essentially find information that matches the user's explicit specifications. This paper argues that this is insufficient. There is another type of information that is also of great interest, i.e., unexpected information, which is unanticipated by the user. Finding unexpected information is useful in many applications. For example, it is useful for a company to find unexpected information about its competitors, e.g., unexpected services and products that its competitors offer. With this information, the company can learn from its competitors and/or design countermeasures to improve its competitiveness. Since a typical commercial site has a very large number of pages and there are also many relevant sites (competitors), it is very difficult for a human user to view each page to discover the unexpected information. Automated assistance is needed. In this paper, we propose a number of methods to help the user find various types of unexpected information from his/her competitors' Web sites. Experimental results show that these techniques are useful in practice and also efficient.
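The abstract does not describe the proposed methods in detail, so the following is only a minimal sketch of one way the idea could be made concrete, not the authors' algorithm. It assumes that a site can be summarized as a bag of words and that a term is "unexpected" when it is relatively frequent on the competitor's site but rare or absent on the user's own site. The function names `term_freqs` and `unexpected_terms`, the scoring formula, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def term_freqs(text):
    """Return term frequencies scaled by the count of the most frequent term."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    return {word: count / top for word, count in counts.items()}


def unexpected_terms(own_text, competitor_text, threshold=0.5):
    """Rank competitor terms by how much more prominent they are than on our own site.

    Illustrative score (an assumption, not taken from the paper):
    (tf_competitor - tf_own) / tf_competitor when positive, otherwise 0.
    """
    own = term_freqs(own_text)
    competitor = term_freqs(competitor_text)
    scores = {}
    for term, tf_c in competitor.items():
        tf_u = own.get(term, 0.0)
        score = (tf_c - tf_u) / tf_c if tf_c > tf_u else 0.0
        if score >= threshold:
            scores[term] = score
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    own_site = "we sell laptops and laptops repair"
    competitor_site = "we sell laptops and tablets tablets leasing"
    for term, score in unexpected_terms(own_site, competitor_site):
        print(term, round(score, 2))
```

In this toy example, the terms the competitor mentions that the user's own site does not ("leasing" and "tablets") are ranked as unexpected, while shared terms are filtered out. A realistic system would also need crawling, HTML cleaning, stop-word removal, and stemming, none of which are shown here.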