Link analysis methods have become popular for information access tasks, especially information retrieval, where the link information in a document collection is used to complement the traditionally used content information. However, there has been little firm evidence to confirm the utility of link information. We show that link information can be useful when the document collection has a sufficiently high link density and the links are of sufficiently high quality. We report experiments on text classification of the Cora and WebKB data sets using Probabilistic Latent Semantic Analysis (PLSA) and Probabilistic Hypertext-Induced Topic Selection (PHITS). Comparison with manually assigned classes shows that link information enhances classification in data with sufficiently high link density, but is detrimental to performance at low link densities or when the quality of the links is degraded. We introduce a new frequency-based method for selecting the most useful citations from a document collection for use in the model.
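
The two quantities the abstract relies on, link density and frequency-based citation selection, can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption: link density is taken as the average number of links per document, and "frequency-based" selection is taken to mean keeping only citations whose target is cited at least `min_count` times. The function names, the `min_count` parameter, and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter


def link_density(links, num_docs):
    """Average number of links per document.

    Assumed definition for illustration; the paper may define density differently.
    """
    if num_docs <= 0:
        raise ValueError("num_docs must be positive")
    return len(links) / num_docs


def select_frequent_citations(links, min_count):
    """Keep only links whose cited target occurs at least min_count times.

    Hypothetical stand-in for a frequency-based selection rule.
    """
    counts = Counter(target for _, target in links)
    return [(src, tgt) for src, tgt in links if counts[tgt] >= min_count]


# Toy collection of 4 documents; each pair is (citing_doc, cited_doc).
links = [("d1", "d3"), ("d2", "d3"), ("d4", "d3"), ("d1", "d2"), ("d4", "d1")]

density = link_density(links, num_docs=4)
kept = select_frequent_citations(links, min_count=2)
```

In this toy collection only `d3` is cited more than once, so the filter keeps the three links that point to it and discards the rest. Raising `min_count` trades link density for what is presumably higher link quality, which is the trade-off the abstract describes.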