The influence of the broadness of a query of a topic on its h-index: Models and examples of the h-index of n-grams

Authors:
Leo Egghe;I. K. Ravichandra Rao
Affiliations:
Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium and Universiteit Antwerpen (UA), Campus Drie Eiken, Universiteitsplein 1, B-2610 Wilrijk, Belgium;Universiteit Hasselt (UHasselt), Campus Diepenbeek, Agoralaan, B-3590 Diepenbeek, Belgium and Indian Statistical Institute (ISI), 8th Mile, Mysore Road, R.V. College P.O., Bangalore-560059, India
Venue:
Journal of the American Society for Information Science and Technology
Year:
2008

Abstract

The article studies the influence of the query formulation of a topic on its h-index. To generate purely random sets of documents, we used N-grams (N variable), namely strings of zeros truncated at the end, to measure this influence. The databases used are Web of Science (WoS) and Scopus. The formula $h = T^{1/\alpha}$, proved in Egghe and Rousseau (2006), where T is the number of retrieved documents and α is Lotka's exponent, is confirmed to be a concavely increasing function of T. We also give a formula for the relation between h and N, the length of the N-gram: $h = D \cdot 10^{-N/\alpha}$, where D is a constant. This is a convexly decreasing function of N, which is also found in our experiments. Nonlinear regression on $h = T^{1/\alpha}$ gives an estimate of α, which can then be used to estimate the h-index of the entire database (WoS and Scopus): $h = S^{1/\alpha}$, where S is the total number of documents in the database. © 2008 Wiley Periodicals, Inc.
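
The estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: in place of the nonlinear regression they mention, it uses a simpler log-log least-squares fit through the origin, which follows from taking logarithms of $h = T^{1/\alpha}$. The function names `estimate_alpha` and `predict_h`, the synthetic data points, and the database size are all hypothetical and chosen only for illustration.

```python
import math


def estimate_alpha(counts, h_values):
    """Estimate Lotka's exponent alpha from (T, h) pairs, assuming h = T**(1/alpha).

    Taking logs gives ln h = (1/alpha) * ln T, a line through the origin,
    so the slope is fitted by ordinary least squares without an intercept.
    At least one T must exceed 1, otherwise the denominator is zero.
    """
    xs = [math.log(t) for t in counts]
    ys = [math.log(h) for h in h_values]
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return 1.0 / slope


def predict_h(total_documents, alpha):
    """Extrapolate the h-index of a whole database of size S via h = S**(1/alpha)."""
    return total_documents ** (1.0 / alpha)


# Synthetic, illustrative data (NOT from the paper): generated with alpha = 2.
synthetic_T = [100, 10_000, 1_000_000]
synthetic_h = [t ** 0.5 for t in synthetic_T]

alpha_hat = estimate_alpha(synthetic_T, synthetic_h)

# Hypothetical database size S, used only to show the extrapolation step.
h_database = predict_h(4_000_000, alpha_hat)
```

A log-log fit weights relative errors rather than absolute ones, so on noisy real data it can yield a somewhat different α than a direct nonlinear least-squares fit on $h = T^{1/\alpha}$.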