This paper presents the 2006 participation of the PUCRS NLP-Group in the CLEF Monolingual Ad Hoc Task for Portuguese. We took part in the campaign using the TR+ Model, which is based on nominalization, binary lexical relations (BLRs), Boolean queries, and the evidence concept. Nominalization, our alternative strategy for lexical normalization, transforms a word (adjective, verb, or adverb) into a semantically corresponding noun. BLRs identify relationships between nominalized terms and capture phrasal cohesion mechanisms, such as those between subject and predicate, subject and object (direct or indirect), noun and adjective, or verb and adverb. In our strategy, an index unit (a descriptor) may be a single term or a BLR, and we adopt the evidence concept: the weight of a descriptor depends not only on its frequency of occurrence but also on the occurrence of phrasal cohesion mechanisms. We describe these features, which implement lexical normalization and term dependence in an information retrieval system based on linguistic resources.
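The ideas in the abstract (nominalization, BLRs as descriptors, and evidence-based weighting) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the TR+ Model: the tiny English lexicon `NOMINALIZATION`, the helper names `nominalize`, `make_blr`, and `descriptor_evidence`, the `blr_bonus` parameter, and above all the scoring rule, which simply adds a bonus to a term's frequency for each BLR it takes part in. The abstract states only that the weight depends on both frequency and phrasal cohesion, not how the two are combined.

```python
from collections import Counter

# Hypothetical nominalization lexicon (illustrative English entries only;
# the paper works on Portuguese and derives nouns with linguistic tools).
NOMINALIZATION = {
    "retrieve": "retrieval",
    "retrieves": "retrieval",
    "quickly": "speed",
    "fast": "speed",
}


def nominalize(word):
    """Map a word to a corresponding noun; unknown words pass through lowercased."""
    w = word.lower()
    return NOMINALIZATION.get(w, w)


def make_blr(head, modifier):
    """Model a BLR as an ordered pair of nominalized terms (an assumption)."""
    return (nominalize(head), nominalize(modifier))


def descriptor_evidence(terms, related_pairs, blr_bonus=1.0):
    """Toy evidence score, NOT the TR+ formula.

    A term's score is its frequency plus `blr_bonus` for every BLR occurrence
    it participates in; a BLR's score is its own frequency.
    """
    freq = Counter(nominalize(t) for t in terms)
    blrs = Counter(make_blr(h, m) for h, m in related_pairs)

    evidence = {}
    for term, f in freq.items():
        in_blrs = sum(n for pair, n in blrs.items() if term in pair)
        evidence[term] = f + blr_bonus * in_blrs
    for blr, n in blrs.items():
        evidence[blr] = float(n)
    return evidence


# "The system retrieves documents quickly": the word pairs below stand in for
# subject-verb, verb-object, and verb-adverb cohesion found by a parser.
terms = ["system", "retrieves", "documents", "quickly"]
pairs = [
    ("retrieves", "system"),
    ("retrieves", "documents"),
    ("retrieves", "quickly"),
]
scores = descriptor_evidence(terms, pairs)
```

In this toy run the nominalized term `retrieval` scores higher than `system`, `documents`, or `speed`, because it participates in all three relations. This mirrors the stated intuition that descriptors involved in phrasal cohesion carry more evidence than frequency alone would suggest.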