UA-ZSA: web page clustering on the basis of name disambiguation
SemEval '07 Proceedings of the 4th International Workshop on Semantic Evaluations
We present Dossier-GPLSI, a system for the automatic generation of press dossiers for organizations. News items are downloaded from online newspapers and automatically classified. We focus on the module that discriminates person names. Three approaches are analyzed and evaluated, each using a different kind of information: semantic information, domain information, and statistical evidence. We show that this module achieves very good performance and can be integrated into the Dossier-GPLSI system.