In the context of the ALVIS project, which aims to integrate linguistic information into topic-specific search engines, we develop an NLP architecture for linguistically annotating large collections of web documents. This context requires us to address the scalability of Natural Language Processing. The platform can be viewed as a framework that uses existing NLP tools. We focus on the efficiency of the platform by distributing linguistic processing across several machines. We carry out an experiment on 55,329 web documents focusing on biology. This 79-million-word collection of web documents was processed in 3 days on 16 computers.
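As a rough back-of-the-envelope reading of the reported experiment, the figures above can be turned into throughput rates. The sketch below is only illustrative: it assumes the load was spread evenly over the 16 machines and that the full 3 days were spent processing, neither of which the abstract states. The function and variable names are hypothetical and do not come from the platform.

```python
# Figures reported in the abstract; "79 million" and "3 days" are rounded values.
DOCUMENTS = 55_329
WORDS = 79_000_000
DAYS = 3
MACHINES = 16

SECONDS_PER_DAY = 24 * 60 * 60


def throughput(words, documents, days, machines):
    """Derive average rates, assuming even load and continuous processing."""
    seconds = days * SECONDS_PER_DAY
    return {
        "words_per_document": words / documents,
        "documents_per_day": documents / days,
        "words_per_second_total": words / seconds,
        "words_per_second_per_machine": words / (seconds * machines),
    }


rates = throughput(WORDS, DOCUMENTS, DAYS, MACHINES)
for name, value in rates.items():
    print(f"{name}: {value:,.1f}")
```

Under these assumptions, the average document is roughly 1,400 words long and each machine handles on the order of 20 words per second, which reflects the cost of running a full chain of linguistic annotation tools rather than simple tokenization.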