Conquering Language: Using NLP on a Massive Scale to Build High Dimensional Language Models from the Web

  • Authors:
  • Gregory Grefenstette

  • Affiliations:
  • Commissariat à l'Energie Atomique, CEA LIST, SRCI, BP 6, 92265 Fontenay aux Roses Cedex, France

  • Venue:
  • CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2009

Abstract

Dictionaries contain only some of the information we need to know about a language. The growth of the Web, the maturation of linguistic processing tools, and the falling price of memory storage allow us to envision descriptions of languages that are much larger than was previously possible. We can conceive of building a complete model of a language from all the text available on the Web in that language. This article describes our current project to do just that.