Machine learning in building a collection of computer science course syllabi

Authors:
Nakul Rathod; Lillian N. Cassel
Affiliations:
Department of Computing Sciences, Villanova University, Villanova, PA; Department of Computing Sciences, Villanova University, Villanova, PA
Venue:
TPDL'12 Proceedings of the Second international conference on Theory and Practice of Digital Libraries
Year:
2012

Citing 9

Modern Information Retrieval

Induction of Decision Trees
Machine Learning

Text Categorization with Support Vector Machines: Learning with Many Relevant Features
ECML '98 Proceedings of the 10th European Conference on Machine Learning

A Comparative Study on Feature Selection in Text Categorization
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning

Learning and Problem Solving with Multilayer Connectionist Systems

Automatic Identification of Home Pages on the Web
HICSS '05 Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS'05) - Track 4 - Volume 04

Some Effective Techniques for Naive Bayes Text Classification
IEEE Transactions on Knowledge and Data Engineering

Towards a syllabus repository for computer science courses
Proceedings of the 38th SIGCSE technical symposium on Computer science education

Natural Language Processing with Python

Cited 1

Building a search engine for computer science course syllabi
Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries

Abstract

Syllabi are rich educational resources. However, generic search engines do not work well for finding Computer Science syllabi. Towards our goal of building a syllabus collection, we have trained various Decision Tree, Naive Bayes, Support Vector Machine and Feed-Forward Neural Network classifiers to distinguish Computer Science syllabi from other web pages. We have also trained our classifiers to distinguish between Artificial Intelligence and Software Engineering syllabi. Our best classifiers are 95% accurate at both tasks. We present an analysis of the various feature selection methods and classifiers we used, hoping to help others who are developing their own collections.
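
As a rough illustration of the kind of text classifier the abstract mentions, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the class name, the tokenizer, the labels, and the toy training texts are all hypothetical and invented here only for illustration, and no feature selection step is included.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)   # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen during training carry no evidence, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, invented for illustration only.
TRAIN_DOCS = [
    "course syllabus instructor office hours grading policy textbook lectures",
    "syllabus prerequisites weekly schedule assignments exams grading",
    "buy cheap laptops online free shipping discount deals",
    "breaking news weather sports headlines today",
]
TRAIN_LABELS = ["syllabus", "syllabus", "other", "other"]

classifier = NaiveBayesTextClassifier().fit(TRAIN_DOCS, TRAIN_LABELS)
print(classifier.predict("grading policy and exams schedule for the course"))
print(classifier.predict("discount deals on laptops today"))
```

On this toy data the first query is labeled "syllabus" and the second "other". A real collection would need far more training pages and a held-out test set before any accuracy figure could be reported.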