Citations to publication venues such as journals, conferences, and workshops contain spelling variants, acronyms, abbreviated forms, and misspellings, all of which make it more difficult to retrieve the item of interest. The task of discovering and reconciling these variant forms of bibliographic references is known as authority work. Its key goal is to create so-called authority files, which maintain, for any given bibliographic item, a list of the variant labels (i.e., variant strings) used to refer to it. In this paper we propose using information available on the Web to create high-quality publication venue authority files. Our idea is to recognize and extract references to publication venues in the text snippets of the answers returned by a search engine. References to the same publication venue are then reconciled in an authority file. Each entry in this file is composed of a canonical name for the venue, an acronym, the venue type (i.e., journal, conference, or workshop), and a mapping to the various forms in which its name is written in bibliographic citations. Experimental results show that our Web-based method for creating authority files is superior to previous work based on straightforward string matching techniques. Considering the average precision in finding correct venue canonical names, we observe gains of up to 41.7%.
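The entry structure described above (canonical name, acronym, venue type, and a mapping to variant forms) can be sketched as a small data structure. This is only an illustration under stated assumptions, not the authors' implementation: the names `VenueEntry`, `AuthorityFile`, and `normalize`, the normalization rule, and the sample variant string are all hypothetical and introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class VenueEntry:
    """One authority-file entry with the four fields named in the abstract."""
    canonical_name: str
    acronym: str
    venue_type: str  # "journal", "conference", or "workshop"
    variants: Set[str] = field(default_factory=set)  # variant citation strings


def normalize(label: str) -> str:
    # Assumed normalization: lowercase and collapse runs of whitespace.
    return " ".join(label.lower().split())


class AuthorityFile:
    """Maps every known label (canonical, acronym, variant) to its entry."""

    def __init__(self) -> None:
        self._by_label: Dict[str, VenueEntry] = {}

    def add(self, entry: VenueEntry) -> None:
        labels = {entry.canonical_name, entry.acronym} | entry.variants
        for label in labels:
            self._by_label[normalize(label)] = entry

    def resolve(self, label: str) -> Optional[VenueEntry]:
        # Exact lookup after normalization; returns None for unknown labels.
        return self._by_label.get(normalize(label))


# Illustrative entry; the variant string is made up for this example.
jcdl = VenueEntry(
    canonical_name="ACM/IEEE-CS Joint Conference on Digital Libraries",
    acronym="JCDL",
    venue_type="conference",
    variants={"Proc. of the Joint Conf. on Digital Libraries"},
)
authority = AuthorityFile()
authority.add(jcdl)
```

With this sketch, `authority.resolve("jcdl")` and `authority.resolve("Proc. of the  Joint Conf. on Digital Libraries")` both return the same entry, while an unseen label returns `None`. The lookup is exact after normalization, so it only retrieves variants already recorded in the file; discovering new variants from search-engine snippets, as the paper proposes, is outside the scope of this sketch.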