An experiment on Web-assisted detection and correction of malapropisms is reported. A malapropos word destroys the meaning of the collocation it occurs in, while usually preserving its syntactic links with the other words. One hundred English malapropisms were gathered, each supplied with its correction candidates, i.e. word combinations in which one word is an editing variant of the corresponding word in the malapropism. Google occurrence and co-occurrence statistics were gathered for each malapropism and each correction candidate. Because the components of a collocation may be adjacent or separated by other words in a sentence, statistics were accumulated for the most probable distance between them. The raw Google statistics are then converted into numeric values of a specially defined Semantic Compatibility Index (SCI). Heuristic rules are proposed that signal a malapropism when its SCI value falls below a predetermined threshold and that retain a few of the highest SCI-ranked correction candidates. Within certain limitations, the experiment gave promising results.
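The detect-and-rank heuristic described above can be sketched roughly as follows. The abstract does not give the SCI formula, so this sketch substitutes a pointwise-mutual-information-style score as a stand-in; the function names, the `total` corpus-size parameter, the threshold, and the idea of passing counts as `(pair, word1, word2)` tuples are all assumptions made for illustration, not the authors' method.

```python
import math


def compatibility(n_pair, n_w1, n_w2, total):
    """PMI-style stand-in for the SCI (assumption: the real SCI formula
    is not stated in the abstract). Counts are occurrence statistics."""
    if n_pair == 0 or n_w1 == 0 or n_w2 == 0:
        return float("-inf")  # never seen together: least compatible
    return math.log2(n_pair * total / (n_w1 * n_w2))


def check_collocation(suspect_counts, candidate_counts, total, threshold, keep=3):
    """Flag a suspect word pair and rank its correction candidates.

    suspect_counts:   (n_pair, n_w1, n_w2) for the pair found in the text
    candidate_counts: {candidate_text: (n_pair, n_w1, n_w2)}
    Returns (is_malapropism, best_candidates).
    """
    score = compatibility(*suspect_counts, total)
    if score >= threshold:
        return False, []  # compatible enough: no malapropism signalled
    ranked = sorted(
        ((compatibility(*counts, total), text)
         for text, counts in candidate_counts.items()),
        reverse=True,
    )
    # Keep only the few top-ranked candidates that beat the suspect pair.
    best = [text for cand_score, text in ranked if cand_score > score][:keep]
    return True, best
```

Called with co-occurrence counts for a suspect pair and for its editing variants, `check_collocation` returns whether the pair falls below the threshold and, if so, the few highest-scoring candidates. In practice the threshold and the number of retained candidates would have to be tuned on labelled examples.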