Measuring the interestingness of articles in a limited user environment

  • Authors:
  • R. K. Pon;A. F. Cárdenas;D. J. Buttler;T. J. Critchlow

  • Affiliations:
  • University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, CA 90095, United States;University of California, Los Angeles, 420 Westwood Plaza, Los Angeles, CA 90095, United States;Lawrence Livermore National Laboratory, 7000 East Ave., Livermore, CA 94550, United States;Pacific Northwest National Laboratory, PO Box 999, Richland, WA 99352, United States

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2011

Abstract

Search engines such as Google assign scores to news articles based on their relevance to a query. However, not every article relevant to the query is interesting to a user; for example, an article that is old or yields little new information is uninteresting. Relevance scores do not account for what makes an article interesting, which varies from user to user. Although methods such as collaborative filtering have been shown to be effective in recommendation systems, in a limited user environment there are not enough users to make collaborative filtering effective. A general framework, called iScore, is presented for defining and measuring the "interestingness" of articles, incorporating user feedback. iScore addresses the various aspects of what makes an article interesting, such as topic relevance, uniqueness, freshness, source reputation, and writing style. It employs methods such as multiple topic tracking, online parameter selection, language models, clustering, sentiment analysis, and phrase extraction to measure these features. Because users differ in their reasons for finding an article interesting, an online feature selection method in naïve Bayes is also used to improve recommendation results. iScore can outperform traditional IR techniques by as much as 50.7%. iScore and its components are evaluated on the news recommendation task using three datasets drawn from Yahoo! News, actual users, and Digg.
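
The abstract's core idea of fusing several per-feature interestingness signals through a naïve Bayes model updated from user feedback can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name OnlineInterestClassifier, the feature names ("relevance", "freshness", "uniqueness"), the binarization threshold, and the crude agreement-based feature selection are all assumptions made for the sketch; the paper's actual feature extractors and online selection method are more involved.

```python
from collections import defaultdict

class OnlineInterestClassifier:
    """Bernoulli naive Bayes over binarized per-feature scores, updated online
    from user feedback, with a crude agreement-based feature selection step.
    (Hypothetical sketch; not the iScore implementation.)"""

    def __init__(self, features, threshold=0.5):
        self.features = list(features)
        self.threshold = threshold            # binarize raw feature scores at this cutoff
        self.class_counts = {True: 0, False: 0}
        self.counts = defaultdict(int)        # (label, feature, value) -> observation count
        self.agree = defaultdict(int)         # feature -> times its value matched the label
        self.seen = 0
        self.selected = set(self.features)    # features currently used for prediction

    def _binarize(self, scores):
        return {f: scores[f] >= self.threshold for f in self.features}

    def predict(self, scores):
        """Return P(interesting | article) using only the currently selected features."""
        x = self._binarize(scores)
        total = sum(self.class_counts.values()) + 2        # Laplace-smoothed prior
        post = {}
        for label in (True, False):
            p = (self.class_counts[label] + 1) / total
            for f in self.selected:
                num = self.counts[(label, f, x[f])] + 1    # Laplace-smoothed likelihood
                den = self.class_counts[label] + 2
                p *= num / den
            post[label] = p
        return post[True] / (post[True] + post[False])

    def update(self, scores, interesting):
        """Fold in one user-feedback label and refresh the selected feature set."""
        x = self._binarize(scores)
        self.class_counts[interesting] += 1
        self.seen += 1
        for f in self.features:
            self.counts[(interesting, f, x[f])] += 1
            if x[f] == interesting:
                self.agree[f] += 1
        # Keep only features that have agreed with the user's labels often enough;
        # fall back to all features if the filter would empty the set.
        kept = {f for f in self.features if self.agree[f] / self.seen >= 0.55}
        self.selected = kept or set(self.features)

# Example: three hypothetical per-article feature scores in [0, 1].
clf = OnlineInterestClassifier(["relevance", "freshness", "uniqueness"])
clf.update({"relevance": 0.9, "freshness": 0.8, "uniqueness": 0.2}, interesting=True)
clf.update({"relevance": 0.4, "freshness": 0.1, "uniqueness": 0.3}, interesting=False)
print(clf.predict({"relevance": 0.7, "freshness": 0.9, "uniqueness": 0.1}))
```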