Using web information for creating publication venue authority files

  • Authors:
  • Denilson Alves Pereira; Berthier Ribeiro-Neto; Nivio Ziviani; Alberto H. F. Laender

  • Affiliations:
  • Federal University of Minas Gerais, Belo Horizonte, Brazil; Federal University of Minas Gerais and Google Engineering, Belo Horizonte, Brazil; Federal University of Minas Gerais, Belo Horizonte, Brazil; Federal University of Minas Gerais, Belo Horizonte, Brazil

  • Venue:
  • Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries
  • Year:
  • 2008

Abstract

Citations to publication venues such as journals, conferences and workshops contain spelling variants, acronyms, abbreviated forms and misspellings, all of which make it more difficult to retrieve the item of interest. The task of discovering and reconciling these variant forms of bibliographic references is known as authority work. The key goal is to create so-called authority files, which maintain, for any given bibliographic item, a list of variant labels (i.e., variant strings) used to refer to it. In this paper we propose to use information available on the Web to create high-quality publication venue authority files. Our idea is to recognize and extract references to publication venues in the text snippets of the answers returned by a search engine. References to the same publication venue are then reconciled in an authority file. Each entry in this file is composed of a canonical name for the venue, an acronym, the venue type (i.e., journal, conference, or workshop), and a mapping to the various forms in which its name is written in bibliographic citations. Experimental results show that our Web-based method for creating authority files is superior to previous work based on straight string matching techniques. In terms of average precision in finding the correct canonical venue names, we observe gains of up to 41.7%.
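The entry structure described in the abstract (canonical name, acronym, venue type, and a mapping to variant labels) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names `AuthorityEntry` and `AuthorityFile`, the `normalize` helper, and the variant strings in the usage note are hypothetical, and lookup here is a plain normalized exact match rather than the Web-based reconciliation the paper proposes.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class VenueType(Enum):
    JOURNAL = "journal"
    CONFERENCE = "conference"
    WORKSHOP = "workshop"


@dataclass
class AuthorityEntry:
    """One authority-file entry, with the four fields named in the abstract."""
    canonical_name: str
    acronym: str
    venue_type: VenueType
    variants: Set[str] = field(default_factory=set)  # variant labels seen in citations


def normalize(label: str) -> str:
    """Hypothetical normalization: lowercase and collapse runs of whitespace."""
    return " ".join(label.lower().split())


class AuthorityFile:
    """Maps every known label of a venue to its single authority entry."""

    def __init__(self) -> None:
        self._entries: List[AuthorityEntry] = []
        self._index: Dict[str, AuthorityEntry] = {}

    def add(self, entry: AuthorityEntry) -> None:
        self._entries.append(entry)
        # Index the canonical name, the acronym, and every variant label.
        for label in (entry.canonical_name, entry.acronym, *entry.variants):
            self._index[normalize(label)] = entry

    def lookup(self, label: str) -> Optional[AuthorityEntry]:
        """Return the entry for a citation label, or None if it is unknown."""
        return self._index.get(normalize(label))
```

As a usage example, an entry for the venue of this paper could be registered with `AuthorityFile.add`, using made-up variants such as "Joint Conf. on Digital Libraries"; `lookup("jcdl")` and `lookup("joint conf. on digital libraries")` would then resolve to the same entry. Exact matching after normalization is the simplest possible design choice and would miss misspellings, which is precisely the gap that authority work aims to close.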