The location of the author of a social media message is not always the location that the author writes about in the message. In applications that mine these messages for information, such as tracking news and political events or responding to disasters, it is the geographic content of the message rather than the location of the author that matters. To this end, we present a method to geo-parse the short, informal messages known as microtext. Our preliminary investigation showed that many microtext messages contain place references that are abbreviated, misspelled, or highly localized, and that standard geo-parsers miss these references. Our geo-parser is built to find them. It uses Natural Language Processing methods to identify four kinds of reference: streets and addresses, buildings and urban spaces, toponyms, and place acronyms and abbreviations. It combines heuristics, open-source Named Entity Recognition software, and machine learning techniques. Our primary data consisted of Twitter messages sent immediately following the February 2011 earthquake in Christchurch, New Zealand. On this sample, the algorithm identified locations with an F statistic of 0.85 for streets, 0.86 for buildings, 0.96 for toponyms, and 0.88 for place abbreviations, with a combined average F of 0.90 for identifying places. The same data run through a standard geo-parser, Yahoo! Placemaker, yielded an F statistic of zero for streets and buildings (because Placemaker is designed to find neither) and an F of 0.67 for toponyms.
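As a rough illustration of the kind of heuristic and evaluation described above, the following Python sketch matches abbreviated street references with a regular expression and scores the matches against hand-labeled references. It is not the authors' implementation: the street-type list, the pattern, the function names `find_streets` and `f_score`, and the sample message are all assumptions made for illustration. It also assumes that the reported F statistic is the balanced F-measure (the harmonic mean of precision and recall), which the abstract does not specify.

```python
import re

# Hypothetical list of street types, including abbreviated forms that
# are common in microtext. Longer forms come first for readability.
STREET_TYPES = r"(?:street|st|road|rd|avenue|ave|terrace|tce|drive|dr|lane|ln)"

# Assumed pattern: optional house number, one name word, then a street type.
# Case-insensitive because microtext capitalization is unreliable.
STREET_RE = re.compile(
    r"\b(?:\d+\s+)?[a-z]+\s+" + STREET_TYPES + r"\b",
    re.IGNORECASE,
)


def find_streets(message: str) -> list[str]:
    """Return candidate street or address references found in a message."""
    return [m.group(0) for m in STREET_RE.finditer(message)]


def f_score(predicted: set[str], gold: set[str]) -> float:
    """Balanced F-measure of predicted references against gold labels."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Invented example message, not taken from the paper's data set.
    text = "Building collapsed on Colombo St near 12 Manchester Street"
    found = find_streets(text)
    print(found)
    print(f_score(set(found), {"Colombo St", "12 Manchester Street"}))
```

A single pattern like this will also match non-places (for example, any word followed by "st"), which is one reason a practical geo-parser would combine such heuristics with Named Entity Recognition and learned classifiers, as the abstract describes.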