Document retrieval for question answering: a quantitative evaluation of text preprocessing
Proceedings of the ACM first Ph.D. workshop in CIKM
Priberam’s question answering system for portuguese
CLEF'05 Proceedings of the 6th international conference on Cross-Language Evalution Forum: accessing Multilingual Information Repositories
20th century esfinge (sphinx) solving the riddles at CLEF 2005
CLEF'05 Proceedings of the 6th international conference on Cross-Language Evalution Forum: accessing Multilingual Information Repositories
Hi-index | 0.00 |
A mixed corpus of Portuguese is one in which texts of different origins produce different spelling variants for the same word. A new norm, which will bring together the written texts produced both in Portugal and Brazil, giving then a more uniform orthography, has been effective since 2009, but what happens in the perspective of search, to corpora created before the norm came into practice, or within the transition period? Is the information they contain outdated and worthless? Do they need to be converted to the new norm? In the present work we analyse these questions.