In this paper we compare two types of corpus, focusing on the lexical ambiguity of each. The first corpus consists mainly of newspaper articles and literature excerpts, while the second belongs to the medical domain. To conduct the study, we use two different disambiguation tools. We first verify the performance of each system in its respective application domain. We then use these systems to assess and compare both the general ambiguity rate and the particularities of each domain. Quantitative results show that medical documents are lexically less ambiguous than unrestricted documents. Our conclusions highlight the importance of the application area in the design of NLP tools.