The growing complexity of the ecological and environmental sciences encourages scientists across the world to collect data at multiple places, times, and thematic scales to verify their hypotheses. Accumulated over time, such data grows not only in volume but also in the diversity of its sources, which are spread around the world. This poses a huge challenge for scientists, who have to search for information manually. To alleviate this problem, ONEMercury has recently been implemented as part of the DataONE project to serve as a portal for accessing environmental and observational data across the globe. ONEMercury harvests metadata from the data hosted by multiple repositories and makes it searchable. However, harvested metadata records are sometimes poorly annotated or lack meaningful keywords, which can hinder effective retrieval. Here, we develop algorithms for automatic annotation of metadata. We transform the problem into a tag recommendation problem with a controlled tag library, and propose two variants of an algorithm for recommending tags. Our experiments on four datasets of environmental science metadata records not only show promising performance for our method, but also shed light on the differing natures of the datasets.
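To make the problem formulation concrete, the following is a minimal sketch of tag recommendation against a controlled tag library. It is not the algorithm proposed in this work, whose details the abstract does not give. It assumes a simple TF-IDF nearest-neighbor scheme: each candidate tag is scored by the summed cosine similarity between the query record and the already-annotated records that carry that tag. The names `recommend_tags`, `annotated_records`, and `tag_library` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercased whitespace tokenization; a real system would likely
    # also strip punctuation and stop words.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_tags(query_text, annotated_records, tag_library, k=3):
    """Rank tags from tag_library for an unannotated metadata record.

    annotated_records is a list of (text, tags) pairs (hypothetical format).
    Tags outside the controlled tag_library are never recommended.
    """
    docs = [tokenize(text) for text, _ in annotated_records]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))

    def idf(term):
        # Smoothed inverse document frequency (an illustrative choice).
        return math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0

    def vectorize(tokens):
        return {t: count * idf(t) for t, count in Counter(tokens).items()}

    query_vec = vectorize(tokenize(query_text))
    scores = defaultdict(float)
    for tokens, (_, tags) in zip(docs, annotated_records):
        similarity = cosine(query_vec, vectorize(tokens))
        for tag in tags:
            if tag in tag_library:
                scores[tag] += similarity

    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked[:k] if score > 0]
```

Restricting candidates to `tag_library` is what distinguishes this setting from free-form keyword extraction: the recommender can only propose terms that already exist in the controlled vocabulary, so a poorly annotated record inherits tags from similar, well-annotated ones.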