International Journal of Digital Library Systems
This paper investigates the role of language in accessing information on the Internet. We combined data about website visitors, obtained through log-file analysis, with data about web-hosts and links obtained from a crawler. Results suggest that language may represent a double barrier. First, the number of native speakers determines the number of web-hosts, and hence the amount of information and the interconnectedness of information sources. Second, for access to information on a particular website, the languages offered are an even more important factor than network effects: non-native speakers and links from websites in other languages are always underrepresented. Our results are in line with Information Foraging Theory, the Revised Hierarchy Model, and network and market theories, and they emphasize the role of language on the Internet. Insight into these processes is helpful when website translation represents an important investment decision, or when aiming to diminish the digital divide.