Challenges in web search engines
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
This article presents a high-level discussion of some problems in information retrieval that are unique to web search engines. The goal is to raise awareness and stimulate research in these areas.