Modelling citation networks for improving scientific paper classification performance

Authors:
Mengjie Zhang;Xiaoying Gao;Minh Duc Cao;Yuejin Ma
Affiliations:
School of Mathematics, Statistics and Computer Science, Victoria Univ. of Wellington, Wellington, New Zealand and Artificial Intelligence Research Centre, College of Mechanical and Electrical Eng. ...;School of Mathematics, Statistics and Computer Science, Victoria Univ. of Wellington, Wellington, New Zealand and Artificial Intelligence Research Centre, College of Mechanical and Electrical Eng. ...;School of Mathematics, Statistics and Computer Science, Victoria University of Wellington, Wellington, New Zealand;Artificial Intelligence Research Centre, College of Mechanical and Electrical Engineering, Agricultural University of Hebei, Baoding, China
Venue:
PRICAI'06 Proceedings of the 9th Pacific Rim international conference on Artificial intelligence
Year:
2006

Abstract

This paper describes an approach that uses citation links to improve scientific paper classification performance. We develop two refinement functions, linear label refinement (LLR) and probabilistic label refinement (PLR), which model the citation link structure of scientific papers in order to refine the class labels that the content-based Naive Bayes classification method assigns to the documents. The approach with the two new refinement models is examined and compared with the content-based Naive Bayes method on a standard paper classification data set with increasing training set sizes. The results suggest that both refinement models significantly improve system performance over the content-based method for all training set sizes, and that PLR is better than LLR when sufficient training examples are available.
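
The abstract does not give the form of either refinement function, so the following is only an illustrative sketch of what a linear refinement over citation links could look like, not the paper's LLR. It assumes each paper already has content-based class probabilities (for example from a Naive Bayes classifier) and blends them with the average probabilities of the papers it is linked to. The function names `refine_labels_linear` and `predict`, the mixing weight `alpha`, the iteration count, and the toy data are all hypothetical.

```python
from typing import Dict, List


def refine_labels_linear(
    content_probs: Dict[str, List[float]],
    links: Dict[str, List[str]],
    alpha: float = 0.5,
    iterations: int = 10,
) -> Dict[str, List[float]]:
    """Blend each paper's content-based class probabilities with the mean
    probabilities of its citation neighbours (assumed update rule)."""
    current = {doc: list(p) for doc, p in content_probs.items()}
    for _ in range(iterations):
        updated = {}
        for doc, own in content_probs.items():
            neighbours = [n for n in links.get(doc, []) if n in current]
            if not neighbours:
                # No citation evidence: keep the content-based estimate.
                updated[doc] = list(own)
                continue
            n_classes = len(own)
            mean = [
                sum(current[n][k] for n in neighbours) / len(neighbours)
                for k in range(n_classes)
            ]
            # Convex combination, so each row still sums to 1.
            updated[doc] = [
                alpha * own[k] + (1 - alpha) * mean[k] for k in range(n_classes)
            ]
        current = updated
    return current


def predict(probs: Dict[str, List[float]]) -> Dict[str, int]:
    """Pick the index of the highest-probability class for each paper."""
    return {doc: max(range(len(p)), key=p.__getitem__) for doc, p in probs.items()}


# Toy data (hypothetical): paper "b" is weakly assigned to class 1 by content
# alone, but both papers it is linked to are confidently class 0.
content = {"a": [0.9, 0.1], "b": [0.4, 0.6], "c": [0.8, 0.2]}
links = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
refined = refine_labels_linear(content, links)
```

In this toy case the content-only label for "b" is class 1, while the refined label becomes class 0 because its linked papers pull it that way. A probabilistic refinement in the spirit of PLR would replace the fixed weight `alpha` with link statistics estimated from training data, which fits the abstract's observation that PLR works better when training examples are sufficient.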