The research reported in this paper is the first phase of a larger project on the automatic classification of Web pages by genre. The long-term goal is to incorporate Web page genre into the search process to improve the quality of search results. In this phase, a neural net classifier was trained to distinguish home pages from non-home pages and to classify home pages as personal, corporate, or organization home pages. Results indicate that the classifier can distinguish home pages from non-home pages and, within the home page genre, personal from corporate home pages. Organization home pages, however, were more difficult to distinguish from personal and corporate home pages.
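
One way to read the task described above is as a two-stage decision: first home page versus non-home page, then the home page sub-genre. The sketch below illustrates only that routing. The abstract does not say whether a single network or two separate networks were used, so the cascade is an assumption. The feature names (`url_depth`, `first_person_terms`, `product_terms`, `membership_terms`), the threshold, and the two `toy_*` functions are hypothetical stand-ins for trained models and are not taken from the paper.

```python
from typing import Callable, Dict

Features = Dict[str, float]

# The three home page sub-genres named in the abstract.
HOME_GENRES = ("personal", "corporate", "organization")


def classify_page(
    features: Features,
    is_home_page: Callable[[Features], bool],
    home_page_genre: Callable[[Features], str],
) -> str:
    """Two-stage decision: home vs. non-home, then the home page sub-genre."""
    if not is_home_page(features):
        return "non-home"
    genre = home_page_genre(features)
    if genre not in HOME_GENRES:
        raise ValueError(f"unexpected genre label: {genre!r}")
    return genre


# Hypothetical stand-in for a trained home/non-home model:
# treats shallow URLs as home pages. Illustration only.
def toy_is_home_page(f: Features) -> bool:
    return f.get("url_depth", 0.0) <= 1.0


# Hypothetical stand-in for a trained sub-genre model:
# picks the genre whose (made-up) term count is largest.
def toy_home_page_genre(f: Features) -> str:
    scores = {
        "personal": f.get("first_person_terms", 0.0),
        "corporate": f.get("product_terms", 0.0),
        "organization": f.get("membership_terms", 0.0),
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    page = {"url_depth": 0.0, "product_terms": 5.0}
    print(classify_page(page, toy_is_home_page, toy_home_page_genre))
```

In a real system the two callables would be replaced by trained classifiers. Keeping them as parameters leaves the routing logic independent of how each stage is implemented.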