Identifying training sets for personalized article retrieval system

Authors:
Cen Li; Sachintha Pitigala; Suk J. Seo
Affiliations:
Middle Tennessee State University, Murfreesboro, TN (all authors)
Venue:
Proceedings of the 49th Annual Southeast Regional Conference
Year:
2011

Abstract

Retrieving documents that are relevant to a particular researcher's purpose is a major challenge, especially when searching a large database such as PubMed. Researchers who use traditional keyword-based document retrieval systems often end up with a large collection of documents that are not directly relevant to their needs. What is needed is a personalized document retrieval system that selects only the articles relevant to one's specific research interests. Obtaining an appropriate training data set is essential for building and testing personalized article retrieval systems. This study describes one approach to forming such a training data set from articles categorized by domain experts under MeSH major topics. Text classifiers, learned using Support Vector Machines, were used to test the degree to which the training set categories are differentiable. Preliminary results and their analysis are discussed.
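
The differentiability test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two category labels, the toy documents, and all function names are hypothetical, and a Pegasos-style stochastic subgradient solver for a linear SVM (no bias term) stands in for whatever SVM package the study used. The idea is only that if a classifier trained on articles from two expert-assigned categories separates held-out articles well, the categories are differentiable enough to serve as training sets.

```python
import math
from collections import Counter

# Hypothetical toy corpus: +1 and -1 stand for two MeSH-like categories.
TRAIN = [
    ("tumor growth oncogene mutation", 1),
    ("carcinoma metastasis tumor suppressor", 1),
    ("oncogene expression carcinoma biopsy", 1),
    ("virus replication host infection", -1),
    ("viral genome infection antibody", -1),
    ("antibody response virus vaccine", -1),
]

def tokenize(text):
    return text.lower().split()

def build_vocab(docs):
    """Map each distinct training token to a feature index."""
    vocab = {}
    for doc in docs:
        for tok in tokenize(doc):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(doc, vocab):
    """L2-normalized term-count vector, stored sparsely as {index: value}."""
    counts = Counter(t for t in tokenize(doc) if t in vocab)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {vocab[t]: c / norm for t, c in counts.items()}

def dot(w, x):
    return sum(w[i] * v for i, v in x.items())

def train_linear_svm(X, y, dim, lam=0.01, epochs=50):
    """Pegasos-style subgradient descent on L2-regularized hinge loss."""
    w = [0.0] * dim
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)              # decaying step size
            margin = label * dot(w, x)         # computed before the update
            w = [wi * (1.0 - eta * lam) for wi in w]  # regularization shrink
            if margin < 1:                     # hinge loss is active
                for i, v in x.items():
                    w[i] += eta * label * v
    return w

def predict(w, x):
    return 1 if dot(w, x) > 0 else -1

vocab = build_vocab(doc for doc, _ in TRAIN)
X = [vectorize(doc, vocab) for doc, _ in TRAIN]
y = [label for _, label in TRAIN]
w = train_linear_svm(X, y, dim=len(vocab))

# Fraction of training documents the learned classifier gets right.
train_acc = sum(predict(w, x) == label for x, label in zip(X, y)) / len(y)
print("training accuracy:", train_acc)
```

In practice the score of interest would be accuracy (or precision and recall) on held-out articles, for example via cross-validation, since training accuracy alone overstates how separable the categories are.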