This article presents a two-step approach to term extraction. In the first step, term candidate extraction, a new delimiter-based approach identifies features of the delimiters of term candidates rather than those of the term candidates themselves. This delimiter-based method is much more stable and domain independent than previous approaches. In the second step, term verification, a link-analysis algorithm calculates the relevance between term candidates and the sentences from which they are extracted. All information is obtained from the working domain corpus, without the need for prior domain knowledge. The approach is not targeted at any specific domain and requires no extensive training when applied to new domains, which makes it especially useful for resource-limited domains. Evaluations on Chinese text in two different domains show significant improvements over existing techniques and also confirm the method's efficiency and its relatively domain-independent nature. The proposed method is also very effective at extracting new terms, so it can serve as an efficient tool for updating domain knowledge, especially for expanding lexicons. © 2010 Wiley Periodicals, Inc.
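The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how delimiters are learned or which link-analysis algorithm is used, so the code below makes two loud assumptions: the delimiter set is simply given as input, and verification is approximated by a HITS-style mutual reinforcement between sentences and the candidates they contain. The names `extract_candidates` and `rank_candidates` are hypothetical and are not taken from the article.

```python
import math


def extract_candidates(tokens, delimiters):
    """Step 1 (assumed form): treat the token runs between delimiters as term candidates."""
    candidates, current = [], []
    for tok in tokens:
        if tok in delimiters:
            # A delimiter closes the candidate collected so far.
            if current:
                candidates.append(" ".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        candidates.append(" ".join(current))
    return candidates


def rank_candidates(sentences, delimiters, iterations=20):
    """Step 2 (assumed form): HITS-style scoring on a sentence-candidate bipartite graph."""
    # links[i] holds the candidates extracted from sentence i.
    links = [set(extract_candidates(s, delimiters)) for s in sentences]
    term_score = {c: 1.0 for cands in links for c in cands}
    sent_score = [1.0] * len(sentences)

    for _ in range(iterations):
        # A candidate scores highly when it occurs in highly scored sentences.
        new_term = {c: 0.0 for c in term_score}
        for i, cands in enumerate(links):
            for c in cands:
                new_term[c] += sent_score[i]
        norm = math.sqrt(sum(v * v for v in new_term.values())) or 1.0
        term_score = {c: v / norm for c, v in new_term.items()}

        # A sentence scores highly when it contains highly scored candidates.
        new_sent = [sum(term_score[c] for c in cands) for cands in links]
        norm = math.sqrt(sum(v * v for v in new_sent)) or 1.0
        sent_score = [v / norm for v in new_sent]

    return sorted(term_score.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy English data for readability; the article evaluates on Chinese text.
    delimiters = {"the", "of", "a", "is"}
    sentences = [
        ["the", "term", "extraction", "of", "domain", "corpus"],
        ["a", "term", "extraction", "is", "useful"],
        ["the", "domain", "corpus", "is", "small"],
    ]
    for term, score in rank_candidates(sentences, delimiters):
        print(f"{score:.3f}  {term}")
```

In this toy run, candidates that recur across sentences end up with higher scores than candidates that appear only once, which is the kind of signal a verification step could threshold on. A real system would need the delimiter set to be derived from the corpus and a segmentation step for Chinese input, neither of which is shown here.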