Term recognition identifies domain-relevant terms, which are essential for discovering domain concepts and for constructing the terminologies required by a wide range of natural language applications. Many techniques have been developed to quantify termhood based on term characteristics. Apparent shortcomings of existing techniques include the ad hoc combination of termhood evidence, mathematically unfounded derivation of scores, and implicit assumptions concerning term characteristics. We propose a probabilistic framework for formalising and combining qualitative evidence based on explicitly defined term characteristics to produce a new termhood measure. Our qualitative and quantitative evaluations demonstrate consistently better precision, recall and accuracy compared to three existing ad hoc measures.