Incremental mining of information interest for personalized web scanning

Authors:
Rey-Long Liu;Wan-Jung Lin
Affiliations:
Department of Information Management, Chung Hua University, No. 707, Sec. 2, Wufu Road, HsinChu, Taiwan 300, Republic of China;Department of Information Management, Chung Hua University, No. 707, Sec. 2, Wufu Road, HsinChu, Taiwan 300, Republic of China
Venue:
Information Systems
Year:
2005

Abstract

Businesses and individuals often organize their information of interest (IOI) into a hierarchy of folders (or categories). This personalized folder hierarchy gives each user a natural way to manage and use his/her IOI, with each folder corresponding to an interest type. Since the interest is relatively long-term, continuous web scanning is essential, and it should be directed by precise and comprehensible specifications of the interest. A precise specification may direct the scanner to those spaces that deserve scanning; a specification comprehensible to the user may facilitate manual refinement; and a specification comprehensible to information providers (e.g., Internet search engines) may facilitate the identification of proper seed sites from which to start scanning. However, expressing such specifications is quite difficult (and even impractical) for the user, since each interest type is often defined only implicitly and collectively by the content (i.e., the documents) of the corresponding folder, which may itself evolve over time. In this paper, we present an incremental text mining technique that efficiently identifies the user's current interest by mining the user's information folders. The specification mined for each interest type expresses the context of that interest type in conjunctive normal form, which is comprehensible to general users and information providers. The specification is also shown to be more precise in directing the scanner to those sites that are more likely to provide IOI. The user may thus simply maintain his/her folders and continually receive IOI, without paying much attention to the difficult tasks of interest specification and seed identification.
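
To make the idea of an interest specification in conjunctive normal form more concrete, the following is a minimal sketch and not the authors' mining technique, which the abstract does not detail. It assumes a specification is an AND of clauses, each clause an OR of plain terms. The names `matches` and `to_query`, the tokenizer, and the example terms are all hypothetical and chosen only for illustration.

```python
import re
from typing import FrozenSet, List, Set

# Assumption: a CNF specification is a list of clauses (ANDed together),
# and each clause is a set of alternative terms (ORed together).
InterestSpec = List[FrozenSet[str]]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(spec: InterestSpec, text: str) -> bool:
    """Return True if every clause has at least one term present in the text.

    An empty specification matches everything, since there is no clause to fail.
    """
    words = tokenize(text)
    return all(clause & words for clause in spec)


def to_query(spec: InterestSpec) -> str:
    """Render the specification as a human-readable Boolean query string."""
    return " AND ".join(
        "(" + " OR ".join(sorted(clause)) + ")" for clause in spec
    )


# Hypothetical specification for a folder about web crawling.
spec = [frozenset({"crawler", "spider"}), frozenset({"web"})]

print(to_query(spec))                           # (crawler OR spider) AND (web)
print(matches(spec, "A focused Web crawler"))   # True
print(matches(spec, "A spider in the garden"))  # False
```

A representation along these lines would be readable by a user who wants to refine it by hand, and it could be rendered as a Boolean query for a search engine when looking for seed sites, which is the kind of comprehensibility the abstract refers to.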