Information extraction from syllabi for academic e-Advising

Authors:
Yevgen Biletskiy;J. Anthony Brown;Girish Ranganathan
Affiliations:
University of New Brunswick, UNB, Electrical and Computer Engineering, 15 Dineen Drive, Fredericton, New Brunswick, Canada E3B5A3;University of New Brunswick, UNB, Electrical and Computer Engineering, 15 Dineen Drive, Fredericton, New Brunswick, Canada E3B5A3;University of New Brunswick, UNB, Electrical and Computer Engineering, 15 Dineen Drive, Fredericton, New Brunswick, Canada E3B5A3
Venue:
Expert Systems with Applications: An International Journal
Year:
2009

Abstract

Creating an academic e-Advisor that automates the transfer of course credits between institutions and recommends courses for further study requires an extensive database of course information. This paper presents an application for creating such a database by automatically extracting relevant information from HTML course outlines stored on an institution's website and storing it in machine-readable XML. The application, called CODE (course outline data extractor), parses a course outline based on its HTML tags and content to build a document object model, then applies a combination of web mining, natural language processing, and pattern recognition techniques to automatically classify and extract content useful for the semi-automatic e-Advisor and store it as XML. The current implementation is restricted to HTML course outlines, but the concepts can be extended to other formats of learning objects or to entirely different domains. The quality of extraction and classification is evaluated on a corpus of syllabi as a proof of concept.
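
The pipeline described in the abstract (parse HTML by its tags, classify the content, emit XML) can be illustrated with a minimal sketch. This is not the CODE implementation, whose internals are not given here. It assumes that sections of a course outline are introduced by heading tags and that a few keyword patterns are enough to label them. The names `OutlineParser`, `classify`, `outline_to_xml`, and `SECTION_PATTERNS`, the category labels, and the XML element names are all hypothetical, and simple regular expressions stand in for the web mining and natural language processing techniques the paper mentions.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical keyword patterns; the paper's actual classifier is not shown here.
SECTION_PATTERNS = {
    "instructor": re.compile(r"\b(instructor|professor|lecturer)\b", re.I),
    "prerequisites": re.compile(r"\bpre-?requisites?\b", re.I),
    "textbook": re.compile(r"\b(text ?books?|required texts?)\b", re.I),
    "description": re.compile(r"\b(description|overview)\b", re.I),
}

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Collects [heading, body text] pairs from an HTML course outline."""

    def __init__(self):
        super().__init__()
        self.sections = []  # each entry is [heading_text, body_text]
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True
            self.sections.append(["", ""])

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        text = data.strip()
        # Text that appears before the first heading is ignored in this sketch.
        if not text or not self.sections:
            return
        idx = 0 if self._in_heading else 1
        current = self.sections[-1]
        current[idx] = (current[idx] + " " + text).strip()


def classify(heading):
    """Return the first category whose pattern matches the heading, else 'other'."""
    for label, pattern in SECTION_PATTERNS.items():
        if pattern.search(heading):
            return label
    return "other"


def outline_to_xml(html):
    """Parse an HTML outline and serialize its classified sections as XML."""
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    root = ET.Element("course")
    for heading, body in parser.sections:
        node = ET.SubElement(root, "section", {"type": classify(heading)})
        ET.SubElement(node, "heading").text = heading
        ET.SubElement(node, "content").text = body
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample outline, for illustration only.
    sample = (
        "<h2>Instructor</h2><p>Dr. A. Example</p>"
        "<h2>Prerequisites</h2><p>CS 1003</p>"
    )
    print(outline_to_xml(sample))
```

Real course outlines often mark sections with bold text, tables, or lists rather than heading tags, so a practical extractor would need a richer document model and classifier than this sketch assumes.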