Because named entities (NEs) are constantly renewed, fresh ones must be acquired from recent text sources. We present a tool for acquiring and typing NEs from the Web that combines a harvester with three parallel shallow parsers, each dedicated to a specific structure (lists, enumerations, and anchors). The parsers combine lexical cues such as discourse markers with formatting instructions (HTML tags) to analyze enumerations and their associated initializers.
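To make the idea of combining discourse markers with HTML tags concrete, here is a minimal sketch of a list parser. It is not the tool described above. It assumes that the initializer is the last text seen before a `<ul>` or `<ol>`, that the marker list in `MARKERS` is adequate, and that lists are flat. The names `MARKERS`, `ListParser`, and `extract_enumerations` are hypothetical and chosen for illustration only.

```python
import re
from html.parser import HTMLParser

# Hypothetical discourse markers that often introduce an enumeration.
MARKERS = re.compile(
    r"\b(such as|including|for example|the following)\b", re.IGNORECASE
)


class ListParser(HTMLParser):
    """Collect each flat <ul>/<ol> list with the text that precedes it."""

    def __init__(self):
        super().__init__()
        self.last_text = ""   # most recent non-empty text outside a list
        self.current = None   # list being built, or None outside a list
        self.in_item = False  # True while inside an <li> of that list
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol") and self.current is None:
            # Assumption: the preceding text acts as the initializer.
            self.current = {"initializer": self.last_text, "items": []}
        elif tag == "li" and self.current is not None:
            self.in_item = True
            self.current["items"].append("")

    def handle_endtag(self, tag):
        if tag in ("ul", "ol") and self.current is not None:
            self.results.append(self.current)
            self.current = None
            self.in_item = False
        elif tag == "li":
            self.in_item = False

    def handle_data(self, data):
        text = " ".join(data.split())  # normalize whitespace
        if not text:
            return
        if self.in_item and self.current is not None:
            items = self.current["items"]
            items[-1] = (items[-1] + " " + text).strip()
        elif self.current is None:
            self.last_text = text


def extract_enumerations(html):
    """Return the lists whose initializer contains a discourse marker."""
    parser = ListParser()
    parser.feed(html)
    parser.close()
    return [e for e in parser.results if MARKERS.search(e["initializer"])]


if __name__ == "__main__":
    page = (
        "<p>Recent operating systems include the following:</p>"
        "<ul><li>Ubuntu</li><li>Fedora</li></ul>"
        "<p>Menu</p><ul><li>Home</li></ul>"
    )
    for enum in extract_enumerations(page):
        print(enum["initializer"], "->", enum["items"])
```

The marker filter is what separates a content list from a navigation menu in this sketch: the second list is dropped because its preceding text contains no marker. A fuller version would also need to handle nested lists and initializers that are split across inline tags.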