Zipf and Heaps Laws' Coefficients Depend on Language

Authors:
Alexander F. Gelbukh;Grigori Sidorov
Venue:
CICLing '01 Proceedings of the Second International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2001

Abstract

We observed that the coefficients of two important empirical statistical laws of language, Zipf's law and Heaps' law, differ between languages, as we illustrate with English and Russian examples. This may have both theoretical and practical implications. On the one hand, the reasons for the difference may shed light on the nature of language. On the other hand, these two laws are important in, for example, full-text database design, where they allow the index size to be predicted.
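
To make the notion of a "coefficient" concrete, the following is a minimal sketch of how the two exponents could be estimated from a tokenized text. It assumes the commonly stated forms of the laws, f(r) ≈ C / r^z for Zipf (frequency f of the word at rank r) and V(n) ≈ K * n^h for Heaps (vocabulary size V after n tokens), and fits a straight line in log-log space by ordinary least squares. The function names, the symbols z and h, and the fitting procedure are illustrative assumptions; the abstract does not say how the authors estimated their coefficients.

```python
import math
from collections import Counter


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept) of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zipf_exponent(tokens):
    """Estimate z in f(r) ~ C / r^z (hypothetical helper; needs >= 2 distinct words)."""
    # Word frequencies sorted from most to least frequent give rank 1, 2, 3, ...
    freqs = sorted(Counter(tokens).values(), reverse=True)
    log_ranks = [math.log(r) for r in range(1, len(freqs) + 1)]
    log_freqs = [math.log(f) for f in freqs]
    slope, _ = fit_line(log_ranks, log_freqs)
    return -slope  # the log-log slope is -z


def heaps_exponent(tokens):
    """Estimate h in V(n) ~ K * n^h (hypothetical helper; needs >= 2 tokens)."""
    seen = set()
    log_n, log_v = [], []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        log_n.append(math.log(n))          # text length so far
        log_v.append(math.log(len(seen)))  # distinct words so far
    slope, _ = fit_line(log_n, log_v)
    return slope
```

Calling zipf_exponent(tokens) and heaps_exponent(tokens) on tokenized texts of comparable size in two languages would give one rough way to compare their coefficients. An unweighted fit over all ranks is only a first approximation, since the many low-frequency words dominate the regression.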