Tracking and relating news articles from several sources can help counter misinformation from deceptive news stories, since a single source cannot establish whether a piece of information is true. Preventing misinformation in a computer system is an interesting research topic in intelligence and security informatics. Association rule mining has recently been applied to this task because of its performance and scalability. This paper explores how the term representation basis, term weighting, and association measure affect the quality of relations discovered among news articles from several sources. Twenty-four combinations, formed from two term representation bases, four term weightings, and three association measures, are explored and their results compared with human judgement. A number of evaluations are conducted to compare the performance of each combination with that of the others with regard to top-k ranks. The experimental results indicate that the combination of bigram (BG), term frequency with inverse document frequency (TFIDF), and confidence (CONF), as well as the combination of BG, TFIDF, and conviction (CONV), performs best at finding related documents, placing them in the upper ranks with a 0.41% rank-order mismatch on the top-50 mined relations. However, the combination of unigram (UG), TFIDF, and lift (LIFT) performs best at locating irrelevant relations in the lower ranks (top-1100), with a rank-order mismatch of 9.63%.
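The three association measures named above can be illustrated with a minimal sketch. It uses the standard textbook definitions of confidence, lift, and conviction over plain binary transactions; it is not the paper's implementation, which additionally weights terms (for example by TFIDF). The function names and the toy data, in which each transaction is the set of documents sharing one term, are assumptions made purely for illustration.

```python
from math import inf


def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def association_measures(transactions, x, y):
    """Confidence, lift and conviction of the rule x -> y (standard definitions)."""
    sup_x = support(transactions, [x])
    sup_y = support(transactions, [y])
    sup_xy = support(transactions, [x, y])
    if sup_x == 0 or sup_y == 0:
        raise ValueError("both items must occur in at least one transaction")
    conf = sup_xy / sup_x                 # estimate of P(y | x)
    lift = conf / sup_y                   # values above 1 suggest positive association
    # Conviction is unbounded when the rule never fails (confidence of 1).
    conv = inf if conf == 1 else (1 - sup_y) / (1 - conf)
    return {"CONF": conf, "LIFT": lift, "CONV": conv}


# Hypothetical toy data: each transaction is the set of documents
# that contain one particular term (binary weighting, no TFIDF).
transactions = [
    {"d1", "d2"},
    {"d1", "d2", "d3"},
    {"d1"},
    {"d2", "d3"},
    {"d3"},
]

measures = association_measures(transactions, "d1", "d2")
print(measures)
```

Ranking candidate document pairs by any one of these values gives an ordered list of relations, which is the kind of ranking a top-k comparison against human judgement would operate on.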