Generating Fuzzy Equivalence Classes on RSS News Articles for Retrieving Correlated Information

Authors:
Nathaniel Gustafson;Maria Soledad Pera;Yiu-Kai Ng
Affiliations:
Computer Science Department, Brigham Young University, Provo, U.S.A. (all authors)
Venue:
ICCSA '08: Proceedings of the International Conference on Computational Science and Its Applications, Part II
Year:
2008

Abstract

Tens of thousands of news articles are posted online each day, covering topics from politics to science to current events. To cope with this overwhelming volume of information, RSS (news) feeds are used to categorize newly posted articles. Nonetheless, most RSS users must still sift through many articles within the same or different RSS feeds to locate those pertaining to their particular interests. Because individual RSS feeds contain so many news articles, the articles need further organization so that users can quickly locate non-redundant, informative, and related articles of interest. In this paper, we present a novel approach that uses the word-correlation factors in a fuzzy set information retrieval model to (i) filter out redundant news articles from RSS feeds, (ii) shed less-informative articles from the non-redundant ones, and (iii) cluster the remaining informative articles according to the fuzzy equivalence classes generated on the news articles. Our clustering approach incurs little overhead or computational cost, and experimental results show that it outperforms other well-known clustering approaches.
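
The abstract does not say how the fuzzy equivalence classes are computed, so the following is only a minimal sketch of one standard construction, not the authors' method. It assumes a reflexive, symmetric matrix of pairwise article similarities in [0, 1] is already available (in the paper these would presumably be derived from the word-correlation factors), takes its max-min transitive closure to obtain a fuzzy equivalence relation, and then cuts that relation at a threshold `alpha` to get crisp clusters. The function names, the `alpha` parameter, and the similarity values are all hypothetical.

```python
def max_min_compose(a, b):
    """Max-min composition of two square fuzzy relations given as nested lists."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(sim):
    """Max-min transitive closure of a reflexive, symmetric fuzzy relation.

    Repeatedly merges the relation with its own max-min composition
    until nothing changes. Only min/max of the input values are taken,
    so exact equality is a safe stopping test.
    """
    n = len(sim)
    current = [row[:] for row in sim]
    while True:
        composed = max_min_compose(current, current)
        merged = [
            [max(current[i][j], composed[i][j]) for j in range(n)]
            for i in range(n)
        ]
        if merged == current:
            return current
        current = merged


def equivalence_classes(closure, alpha):
    """Crisp classes from the alpha-cut of a fuzzy equivalence relation."""
    n = len(closure)
    assigned = set()
    classes = []
    for i in range(n):
        if i in assigned:
            continue
        # Every article related to article i at level >= alpha joins its class.
        members = [j for j in range(n) if closure[i][j] >= alpha]
        assigned.update(members)
        classes.append(members)
    return classes


# Hypothetical similarities among four articles (assumed values, not paper data).
similarity = [
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.6, 0.0],
    [0.1, 0.6, 1.0, 0.2],
    [0.0, 0.0, 0.2, 1.0],
]

closure = transitive_closure(similarity)
clusters = equivalence_classes(closure, alpha=0.5)
print(clusters)
```

Raising `alpha` splits the articles into finer clusters and lowering it merges them, which is why the closure is computed once and the threshold applied afterwards.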