The ADL Gazetteer is a digital worldwide gazetteer developed in the Alexandria Digital Library (ADL) Project that contains millions of geographic names (placenames). The placenames are indexed with type terms from the ADL Feature Type Thesaurus (FTT), a hierarchical category scheme. This paper proposes a two-step method to enrich the category scheme automatically: first, discover frequent generic terms by detecting phrase boundaries with a mutual information-based method; second, correlate the generic terms with the relevant type terms by hierarchical clustering. The correlation pairs thus established can then be used to supplement the FTT with the generic terms found. Extensive experiments on millions of ADL Gazetteer placenames demonstrate the effectiveness of the proposed method. Beyond thesaurus enrichment, potential applications of this research include suggesting likely type terms when categorizing new placenames and helping users choose likely search terms.
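The first step can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes whitespace-tokenized names, uses pointwise mutual information (PMI) between adjacent words as the boundary score, and cuts a name wherever that score falls below a fixed threshold. The function names, the threshold, the minimum count, and the toy placenames are all hypothetical. The sketch only collects frequent segments; deciding which of them are generic terms, and correlating them with type terms by hierarchical clustering, is left out.

```python
import math
from collections import Counter


def pmi_table(names):
    """Pointwise mutual information (log base 2) for each adjacent word pair."""
    unigrams, bigrams = Counter(), Counter()
    for name in names:
        words = name.lower().split()
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    return {
        (a, b): math.log2(
            (count / n_bi) / ((unigrams[a] / n_uni) * (unigrams[b] / n_uni))
        )
        for (a, b), count in bigrams.items()
    }


def segment(name, pmi, threshold):
    """Cut a name between adjacent words whose PMI is below the threshold."""
    words = name.lower().split()
    if not words:
        return []
    segments, current = [], [words[0]]
    for a, b in zip(words, words[1:]):
        if pmi.get((a, b), float("-inf")) < threshold:
            # Weak association: treat this position as a phrase boundary.
            segments.append(" ".join(current))
            current = [b]
        else:
            current.append(b)
    segments.append(" ".join(current))
    return segments


def frequent_segments(names, threshold, min_count):
    """Count segments over all names and keep those seen at least min_count times."""
    pmi = pmi_table(names)
    counts = Counter(seg for name in names for seg in segment(name, pmi, threshold))
    return {seg: c for seg, c in counts.items() if c >= min_count}


# Toy data (hypothetical, not taken from the ADL Gazetteer).
toy_names = [
    "Santa Fe Lake", "Santa Fe River",
    "San Juan Lake", "San Juan River",
    "Clear Lake", "Green River",
]
result = frequent_segments(toy_names, threshold=2.5, min_count=2)
print(result)
```

On this toy input the strongly associated pairs ("santa fe", "san juan") stay together, while "lake" and "river" are split off as separate segments. With so few names, rare words paired only once with a generic term still score a high PMI, so "Clear Lake" is left unsplit; a corpus of millions of names would give far more reliable statistics.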