Combining link and content analysis to estimate semantic similarity

Authors:
Filippo Menczer
Affiliations:
Indiana University, Bloomington, IN
Venue:
Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters
Year:
2004



Abstract

Search engines use content and link information to crawl, index, retrieve, and rank Web pages. The correlations between similarity measures based on these cues and on semantic associations between pages therefore crucially affect the performance of any search tool. Here I begin to quantitatively analyze the relationship between content, link, and semantic similarity measures across a massive number of Web page pairs. Maps of semantic similarity across textual and link similarity highlight the potential and limitations of lexical and link analysis for relevance approximation, and provide us with a way to study whether and how text- and link-based measures should be combined.
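
The abstract does not say how the three similarity measures are computed, so the following is only an illustrative sketch under stated assumptions: content similarity is taken to be cosine similarity over raw term counts, link similarity is taken to be Jaccard overlap between link sets, and the semantic similarity of each pair is assumed to be supplied externally as a number in [0, 1]. The function names and the binning scheme are hypothetical and are not taken from the paper. The last function shows one way a "map" of semantic similarity across content and link similarity could be built, by averaging semantic similarity within each (content, link) bin.

```python
import math
from collections import Counter, defaultdict


def content_similarity(text_a, text_b):
    """Cosine similarity between raw term-count vectors (assumed measure)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_similarity(links_a, links_b):
    """Jaccard overlap between two sets of linked URLs (assumed measure)."""
    a, b = set(links_a), set(links_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def semantic_map(triples, bins=10):
    """Average semantic similarity per (content bin, link bin).

    `triples` is an iterable of (content_sim, link_sim, semantic_sim),
    each value in [0, 1]. Returns {(i, j): mean semantic similarity}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for content_sim, link_sim, semantic_sim in triples:
        # A similarity of exactly 1.0 falls into the last bin.
        i = min(int(content_sim * bins), bins - 1)
        j = min(int(link_sim * bins), bins - 1)
        sums[(i, j)] += semantic_sim
        counts[(i, j)] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Bins with a high mean indicate regions where content and link cues agree with the semantic signal; empty or low-mean bins indicate where they fall short as proxies for relevance.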