Social (distributed) language modeling, clustering and dialectometry

Authors:
David Ellis
Affiliations:
Facebook, Palo Alto, CA
Venue:
TextGraphs-4 Proceedings of the 2009 Workshop on Graph-based Methods for Natural Language Processing
Year:
2009

Abstract

We present ongoing work on a scalable, distributed implementation of over 200 million individual language models, each capturing a single user's dialect in a given language (multilingual users have several models). These models have a variety of practical applications, ranging from spam detection and speech recognition to dialectometrical methods on the social graph. Users should be able to view any content in their own language (even if it is spoken by only a small population) and to browse our site with an appropriately translated interface (automatically generated for locales with little crowd-sourced community effort).
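
The abstract does not say which model family or distance measure is used, so the following is only an illustrative sketch of the "one model per user per language" idea. It assumes unigram models interpolated with a shared per-language background model, and it uses Jensen-Shannon divergence as a stand-in dialect distance. The class name `UserLanguageModels`, the `user_weight` parameter, and the function `js_divergence` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter, defaultdict


class UserLanguageModels:
    """Hypothetical sketch: one unigram model per (user, language) pair."""

    def __init__(self, user_weight=0.7):
        # Interpolation weight on the user's own counts (assumed value).
        self.user_weight = user_weight
        self.user_counts = defaultdict(Counter)  # (user, lang) -> token counts
        self.background = defaultdict(Counter)   # lang -> pooled token counts

    def observe(self, user, lang, text):
        """Add one piece of text written by `user` in language `lang`."""
        tokens = text.lower().split()
        self.user_counts[(user, lang)].update(tokens)
        self.background[lang].update(tokens)

    def distribution(self, user, lang):
        """Return the user's smoothed unigram distribution over the
        language's pooled vocabulary (sums to 1 when the vocabulary
        is non-empty)."""
        bg = self.background[lang]
        bg_total = sum(bg.values())
        own = self.user_counts.get((user, lang), Counter())
        own_total = sum(own.values())
        # Users with no text in this language fall back to the background.
        w = self.user_weight if own_total else 0.0
        return {
            tok: w * (own[tok] / own_total if own_total else 0.0)
            + (1.0 - w) * (bg[tok] / bg_total)
            for tok in bg
        }


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions
    defined over the same set of keys."""
    total = 0.0
    for tok in p:
        m = 0.5 * (p[tok] + q[tok])
        for dist in (p, q):
            if dist[tok] > 0:
                total += 0.5 * dist[tok] * math.log2(dist[tok] / m)
    return total


models = UserLanguageModels()
models.observe("ann", "en", "y'all come back now")
models.observe("bob", "en", "you all come back now")
models.observe("cat", "en", "y'all come back now")

ann = models.distribution("ann", "en")
bob = models.distribution("bob", "en")
cat = models.distribution("cat", "en")
same = js_divergence(ann, cat)       # identical usage
different = js_divergence(ann, bob)  # differing usage
```

Under these assumptions, pairwise distances of this kind could serve as edge weights on a social graph for clustering, and a multilingual user would simply appear under several `(user, lang)` keys.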