Most web pages are linked to others with related content. This idea, combined with the idea that text in, and possibly around, HTML anchors describes the pages to which they point, is the foundation for a usable World-Wide Web. In this paper, we examine to what extent these ideas hold by empirically testing whether topical locality mirrors spatial locality of pages on the Web. In particular, we find that linked pages are likely to have similar textual content; that the similarity of sibling pages increases when the links from the parent are close together; that titles, descriptions, and anchor text represent at least part of the target page; and that anchor text may be a useful discriminator among unseen child pages. These results show the foundations necessary for the success of many web systems, including search engines, focused crawlers, linkage analyzers, and intelligent web agents.
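As a rough illustration of what testing topical locality can look like, the sketch below compares the textual similarity of a linked pair of pages against an unlinked pair. The abstract does not state which similarity measure the paper uses, so cosine similarity over raw term counts is an assumption here, and the page texts, the `pages` dictionary, and the function names are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count alphabetic terms (a simplistic tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy pages: "parent" links to "child"; "unrelated" is not linked.
pages = {
    "parent": "focused web crawler search engine crawling",
    "child": "crawler for topical web search",
    "unrelated": "recipes for baking sourdough bread",
}

vectors = {name: term_vector(text) for name, text in pages.items()}

linked_sim = cosine_similarity(vectors["parent"], vectors["child"])
unlinked_sim = cosine_similarity(vectors["parent"], vectors["unrelated"])

print(f"linked pair:   {linked_sim:.3f}")
print(f"unlinked pair: {unlinked_sim:.3f}")
```

On real data, the same comparison would be averaged over many linked pairs and many randomly chosen pairs; a consistently higher average for linked pairs would be evidence of topical locality in the sense described above.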