A point of interest (POI) is a focused geographic entity such as a landmark, a school, a historic building, or a business. Points of interest are the basis for most of the data supporting location-based applications. In this paper we propose to curate POIs from online sources by bootstrapping training data from Web snippets, seeded by POIs gathered from social media. This large corpus is used to train a sequential tagger to recognize mentions of POIs in text. Using Wikipedia data as the training data, we can identify POIs in free text with a precision that is 116% better than that of the state-of-the-art POI identifier, and a recall that is 50% better. We show that by using Foursquare and Gowalla check-ins as seeds to bootstrap training data from Web snippets, we can improve precision by between 16% and 52%, and recall by between 48% and 187%, over the state of the art. The name of a POI alone is not sufficient, as the POI must also be associated with a set of geographic coordinates. Our method increases the number of POIs that can be localized nearly three-fold, from 134 to 395 in a sample of 400, with a median localization error of less than one kilometer.
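To make the bootstrapping idea concrete, the following is a minimal sketch of how seed POI names might be projected onto Web snippets to produce token-level training labels for a sequential tagger. The abstract does not describe the actual labeling procedure, so everything here is an assumption for illustration: the function name `bio_label`, the exact-match strategy, the simple regex tokenizer, and the B-POI/I-POI/O tag scheme are all hypothetical, and the seed names are made-up examples rather than data from the paper.

```python
import re

# Assumed tokenizer: words and single punctuation marks (not from the paper).
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def bio_label(snippet, seed_pois):
    """Tag each token of `snippet` as B-POI, I-POI, or O.

    A token span is tagged as a POI when it matches a seed name exactly
    (case-insensitive). This is an illustrative distant-supervision
    heuristic, not the method described in the paper.
    """
    tokens = TOKEN_RE.findall(snippet)
    lowered = [t.lower() for t in tokens]
    labels = ["O"] * len(tokens)

    # Match longer seed names first so that a full name takes priority
    # over a shorter name contained in it.
    seeds = sorted(
        (TOKEN_RE.findall(name.lower()) for name in seed_pois),
        key=len,
        reverse=True,
    )
    for seed in seeds:
        n = len(seed)
        if n == 0:
            continue
        for i in range(len(tokens) - n + 1):
            span_is_free = all(label == "O" for label in labels[i:i + n])
            if lowered[i:i + n] == seed and span_is_free:
                labels[i] = "B-POI"
                for j in range(i + 1, i + n):
                    labels[j] = "I-POI"
    return list(zip(tokens, labels))


if __name__ == "__main__":
    # Hypothetical seeds standing in for names gathered from check-ins.
    seeds = ["Golden Gate Bridge", "Golden Gate"]
    for token, label in bio_label("We met at the Golden Gate Bridge yesterday.", seeds):
        print(f"{token}\t{label}")
```

Labeled sequences of this kind could then be passed to any sequence labeler, for example a conditional random field. Exact matching is the simplest possible choice and would miss abbreviations and misspellings, so a real pipeline would likely need a more tolerant matcher.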