This paper discusses a consistency in patterns of language use across domain-specific collections of text. We present a method for automatically identifying domain-specific keywords (specialist terms) by comparing language use in scientific, domain-specific text collections with language use in texts intended for a more general audience. The method supports the automatic production of collocational networks and of networks of concepts: thesauri, or so-called ontologies. It involves a novel combination of existing metrics from computational linguistics that enables the extraction, or learning, of such networks. The creation of ontologies or thesauri is informed by international (ISO) standards in terminology science, and the resulting resources can support a variety of work, including data-mining applications.
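To make the idea of comparing domain-specific and general language use concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not name the metrics it combines, so the sketch assumes a simple ratio of relative word frequencies, with add-one smoothing on the general-language counts. The function names (`tokenize`, `frequency_ratios`, `candidate_terms`) and the naive tokenizer are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic runs (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def frequency_ratios(domain_tokens, general_tokens):
    """Ratio of each word's relative frequency in the domain text to its
    relative frequency in the general text.

    Assumption: add-one smoothing on the general counts, so that words
    absent from the general text do not cause a division by zero.
    """
    domain_counts = Counter(domain_tokens)
    general_counts = Counter(general_tokens)
    n_domain = len(domain_tokens)
    n_general = len(general_tokens)

    ratios = {}
    for word, count in domain_counts.items():
        domain_rel = count / n_domain
        general_rel = (general_counts[word] + 1) / (n_general + 1)
        ratios[word] = domain_rel / general_rel
    return ratios


def candidate_terms(domain_text, general_text, top_n=5):
    """Return the top_n words most over-represented in the domain text,
    highest ratio first (ties broken alphabetically)."""
    ratios = frequency_ratios(tokenize(domain_text), tokenize(general_text))
    return sorted(ratios, key=lambda w: (-ratios[w], w))[:top_n]


if __name__ == "__main__":
    domain = "the ontology maps the thesaurus and the ontology"
    general = "the cat sat on the mat and the dog sat"
    print(candidate_terms(domain, general, top_n=3))
```

Words that are common in both collections (such as "the") score close to 1, while words that are frequent in the domain collection but rare or absent in the general one score higher and surface as candidate specialist terms. A realistic pipeline would need much larger corpora and would extend the comparison to multiword units and collocations.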