Recent Web-searching and -mining tools combine text and link analysis to improve ranking and crawling algorithms. The central assumption behind such approaches is that the graph structure of the Web correlates with the text and meaning of its pages. Here I formalize and empirically evaluate two general conjectures that connect link information to lexical and semantic Web content. The link-content conjecture states that a page is similar to the pages that link to it; the link-cluster conjecture states that pages about the same topic are clustered together. These conjectures are often simply assumed to hold, and Web search tools are built on them. The present quantitative confirmation sheds light on the connection between the success of the latest Web-mining techniques and the small-world topology of the Web, with encouraging implications for the design of better crawling algorithms.
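
To make the link-content conjecture concrete, the following is a minimal sketch of how one might compare the lexical similarity of linked page pairs against unlinked ones. It assumes cosine similarity over raw term-frequency vectors as the similarity measure, which the abstract does not specify. The toy `pages` and `links` data and the helper names `cosine_similarity` and `mean_similarity` are hypothetical, chosen only for illustration; this is not the evaluation procedure used in the paper.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the term-frequency vectors of two texts."""
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_similarity(pages, pairs):
    """Average cosine similarity over a collection of (page_id, page_id) pairs."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    total = sum(cosine_similarity(pages[a], pages[b]) for a, b in pairs)
    return total / len(pairs)


# Hypothetical toy corpus: page id -> page text (assumption, not real data).
pages = {
    "a": "web crawler search engine ranking",
    "b": "search engine ranking algorithms",
    "c": "focused web crawler design",
    "d": "italian pasta recipes tomato sauce",
    "e": "tomato sauce pasta cooking",
}

# Hypothetical hyperlinks as (source, target) pairs.
links = {("a", "b"), ("c", "a"), ("d", "e")}

# All unordered page pairs that are not connected by a link in either direction.
unlinked_pairs = [
    (s, t)
    for s in pages
    for t in pages
    if s < t and (s, t) not in links and (t, s) not in links
]

linked = mean_similarity(pages, links)
unlinked = mean_similarity(pages, unlinked_pairs)

print(f"mean similarity, linked pairs:   {linked:.3f}")
print(f"mean similarity, unlinked pairs: {unlinked:.3f}")
```

On this toy corpus, the conjecture would predict a higher mean similarity for linked pairs than for unlinked ones. A real test would need a large crawl and a more careful text representation, for example with stop-word removal and term weighting.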