The influence of the broadness of a query of a topic on its h-index: Models and examples of the h-index of n-grams

  • Authors:
  • Leo Egghe;I. K. Ravichandra Rao

  • Affiliations:
  • Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium and Universiteit Antwerpen (UA), Campus Drie Eiken, Universiteitsplein 1, B-2610 Wilrijk, Belgium;Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium and Indian Statistical Institute (ISI), 8th Mile, Mysore Road, R.V. College P.O., Bangalore-560059, India

  • Venue:
  • Journal of the American Society for Information Science and Technology
  • Year:
  • 2008

Abstract

The article studies the influence of the query formulation of a topic on its h-index. To generate purely random sets of documents, we used N-grams (N variable) to measure this influence: strings of zeros, truncated at the end. The databases used are Web of Science (WoS) and Scopus. The formula $h = T^{1/\alpha}$, proved in Egghe and Rousseau (2006), where $T$ is the number of retrieved documents and $\alpha$ is Lotka's exponent, is confirmed to be a concavely increasing function of $T$. We also give a formula for the relation between $h$ and $N$, the length of the N-gram: $h = D \cdot 10^{-N/\alpha}$, where $D$ is a constant; this is a convexly decreasing function, which is also found in our experiments. Nonlinear regression on $h = T^{1/\alpha}$ gives an estimate of $\alpha$, which can then be used to estimate the h-index of the entire database (WoS and Scopus): $h = S^{1/\alpha}$, where $S$ is the total number of documents in the database. © 2008 Wiley Periodicals, Inc.
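The workflow described in the abstract (estimate α from pairs of retrieved-document counts T and observed h-indices, then extrapolate to the database size S) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the sample data are synthetic, and the fit is a log-linear least-squares fit through the origin on ln h = (1/α) ln T, used here as a simple stand-in for the nonlinear regression the abstract mentions.

```python
import math


def fit_alpha(samples):
    """Estimate Lotka's exponent alpha from (T, h) pairs.

    Assumption: least squares through the origin on
    ln h = (1/alpha) * ln T, a log-linear substitute for the
    nonlinear regression on h = T**(1/alpha).
    Requires T > 1 and h > 0 for every pair.
    """
    num = sum(math.log(t) * math.log(h) for t, h in samples)
    den = sum(math.log(t) ** 2 for t, _ in samples)
    slope = num / den  # estimate of 1/alpha
    return 1.0 / slope


def h_from_size(total_docs, alpha):
    """h-index predicted for a set of total_docs documents: h = S**(1/alpha)."""
    return total_docs ** (1.0 / alpha)


def h_from_ngram_length(n, alpha, d):
    """h-index predicted for an N-gram of length n: h = D * 10**(-n/alpha)."""
    return d * 10 ** (-n / alpha)


# Synthetic (T, h) pairs generated with alpha = 2; not data from the article.
samples = [(t, t ** 0.5) for t in (100, 400, 2500, 10000)]

alpha = fit_alpha(samples)
print("estimated alpha:", alpha)
# Hypothetical database size S, chosen only for illustration.
print("extrapolated h for S = 1,000,000:", h_from_size(1_000_000, alpha))
```

On real data the two fitting approaches weight large and small T differently, so the log-linear estimate of α would generally differ somewhat from a nonlinear least-squares estimate.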