Comparing corpora using frequency profiling
CompareCorpora '00 Proceedings of the Workshop on Comparing Corpora
In this work, a collocation is understood as a nonrandom combination of two or more lexical units that is typical either of a language as a whole (texts of any type) or of a specific type of text. A text is a structured sequence of units at different levels, and collocations, as complex substructures of a text, are an important object of study when investigating text analysis procedures. By using collections of different types as material, we study both general patterns and the properties of the collections analyzed. This paper focuses on digrams (two-unit combinations) extracted from a collection of news texts.
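The abstract does not say how the nonrandomness of a digram is measured. As a minimal sketch only, the following Python example counts adjacent word pairs in a small collection and scores them with pointwise mutual information (PMI), which is an assumed stand-in for whatever association measure the authors actually use. The function names `tokenize` and `digram_pmi`, the `min_count` threshold, and the toy news sentences are all hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters (a deliberately simple rule)."""
    return re.findall(r"[a-z]+", text.lower())


def digram_pmi(texts, min_count=2):
    """Return {(w1, w2): PMI} for adjacent word pairs seen at least min_count times.

    PMI is an assumption of this sketch, not a measure named in the source.
    """
    unigrams = Counter()
    digrams = Counter()
    for text in texts:
        tokens = tokenize(text)
        unigrams.update(tokens)
        # Pair each token with its successor; pairs never cross text boundaries.
        digrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_di = sum(digrams.values())
    scores = {}
    for (w1, w2), count in digrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_di
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI > 0 means the pair co-occurs more often than independence predicts.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Hypothetical toy "news" collection, for illustration only.
news = [
    "The prime minister met the press.",
    "The prime minister spoke today.",
    "The press met the prime minister.",
]
scores = digram_pmi(news)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

Running the same function on collections of different types and comparing the top-ranked pairs is one simple way to separate digrams typical of the language as a whole from those typical of a single text type. A real study would need a proper tokenizer and a much larger collection, since PMI is known to overrate low-frequency pairs.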