In this paper, we describe a search tool for a very large set of n-grams. The tool supports queries with an arbitrary number of wildcards. A search takes a fraction of a second and returns the fillers of the wildcards. The system runs on a single Linux PC with a reasonable amount of memory (less than 4GB) and disk space (less than 400GB). It can be a useful tool for linguistic knowledge discovery and other NLP tasks.
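To make the idea of a wildcard query and its "fillers" concrete, here is a minimal in-memory sketch. It is not the system described in the paper, whose index structure and on-disk layout are not given here. The names `build_index`, `search`, and `WILDCARD`, the `*` wildcard syntax, and the toy n-grams with their counts are all assumptions made up for illustration.

```python
from collections import defaultdict

WILDCARD = "*"  # hypothetical wildcard symbol; the real tool's syntax may differ


def build_index(ngrams):
    """Map (n, position, token) to the ids of n-grams with that token at that position."""
    index = defaultdict(set)
    for ngram_id, (tokens, _count) in enumerate(ngrams):
        for pos, tok in enumerate(tokens):
            index[(len(tokens), pos, tok)].add(ngram_id)
    return index


def search(ngrams, index, query):
    """Return (fillers, count) pairs for n-grams matching the query, most frequent first."""
    n = len(query)
    fixed = [(pos, tok) for pos, tok in enumerate(query) if tok != WILDCARD]
    wild = [pos for pos, tok in enumerate(query) if tok == WILDCARD]
    if fixed:
        # Intersect the posting sets of every non-wildcard position.
        postings = [index.get((n, pos, tok), set()) for pos, tok in fixed]
        candidates = set.intersection(*postings)
    else:
        # Query is all wildcards: every n-gram of the right length matches.
        candidates = {i for i, (toks, _) in enumerate(ngrams) if len(toks) == n}
    results = [(tuple(ngrams[i][0][p] for p in wild), ngrams[i][1]) for i in candidates]
    # Sort by descending count, then alphabetically by fillers for stable output.
    return sorted(results, key=lambda r: (-r[1], r[0]))


# Toy data: (tokens, frequency). The counts are invented for this example.
NGRAMS = [
    (("such", "as", "apples"), 120),
    (("such", "as", "oranges"), 80),
    (("known", "as", "apples"), 5),
    (("such", "a", "pity"), 40),
]
INDEX = build_index(NGRAMS)

for fillers, count in search(NGRAMS, INDEX, ("such", "as", WILDCARD)):
    print(" ".join(fillers), count)
```

A positional index like this answers a query by intersecting small posting sets rather than scanning every n-gram. A collection that needs hundreds of gigabytes of disk would not fit in Python sets in memory, so treat this only as an illustration of the query semantics.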