In this paper, we present two ways to improve the precision of HITS-based algorithms on Web documents. First, after analyzing the limitations of current HITS-based algorithms, we propose a new weighted HITS-based method that assigns appropriate weights to the in-links of root documents. Second, we combine content analysis with HITS-based algorithms and study the effects of four representative relevance scoring methods (VSM, Okapi, TLS, and CDR) using a set of broad-topic queries. Our experimental results show that our weighted HITS-based method performs significantly better than Bharat's improved HITS algorithm. When we combine our weighted HITS-based method or Bharat's HITS algorithm with any of the four relevance scoring methods, the combined methods are only marginally better than our weighted HITS-based method alone. Among the four relevance scoring methods, there is no significant quality difference when they are combined with a HITS-based algorithm.
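
To make the idea of weighting in-links concrete, the following is a minimal sketch of a HITS iteration in which each link carries a weight. It is not the weighting scheme of the paper: the abstract does not say how the weights for in-links of root documents are chosen, so the `weights` argument, the function names, and the toy graph are all assumptions made for illustration.

```python
import math


def normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {node: v / norm for node, v in scores.items()}


def weighted_hits(edges, weights=None, iterations=50):
    """Run HITS where each link (source, target) has a weight.

    edges:   list of (source, target) pairs.
    weights: hypothetical dict mapping an edge to a positive weight;
             edges not listed default to 1.0 (plain HITS).
    Returns (authority, hub) score dictionaries.
    """
    weights = weights or {}
    nodes = sorted({node for edge in edges for node in edge})
    auth = {node: 1.0 for node in nodes}
    hub = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: weighted sum of hub scores over in-links.
        new_auth = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_auth[dst] += weights.get((src, dst), 1.0) * hub[src]

        # Hub update: weighted sum of the new authority scores over out-links.
        new_hub = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new_hub[src] += weights.get((src, dst), 1.0) * new_auth[dst]

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return auth, hub


# Toy graph (illustrative only): two hubs point to "a", one hub points to "b".
edges = [("h1", "a"), ("h2", "a"), ("h1", "b")]

plain_auth, _ = weighted_hits(edges)
boosted_auth, _ = weighted_hits(edges, weights={("h1", "b"): 5.0})
```

With uniform weights, "a" receives the higher authority score because it has more in-links; raising the weight of the single in-link to "b" reverses that ordering. This shows how in-link weights can shift the ranking without changing the link structure itself.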