Identifying synonymous concepts in preparation for technology mining

Authors:
Cherie Courseault Trumbach; Dinah Payne
Affiliations:
Department of Management, University of New Orleans, New Orleans, USA; Department of Management, University of New Orleans, New Orleans, USA
Venue:
Journal of Information Science
Year:
2007

Abstract

This research demonstrates the development of a 'concept-clumping algorithm' designed to improve the clustering of technical concepts. The algorithm first identifies a list of technically relevant noun phrases from a cleaned extracted list and then applies a rule-based procedure that identifies synonymous terms based on the words the terms share. An assessment found that the algorithm has a precision rate of 89-91%, moved technically important terms higher in the term frequency list, and improved the technical specificity of term clusters.
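
As a rough illustration of grouping terms by shared words, the sketch below merges noun phrases whose word sets overlap strongly and sums their frequencies, so a clumped concept can rise in the frequency list. It is not the authors' rule set, which the abstract does not spell out: the Jaccard overlap threshold `min_overlap`, the crude plural stripping in `normalize`, and the function names are all assumptions made for this example.

```python
from itertools import combinations


def normalize(word):
    """Lowercase a word and strip a trailing plural 's' (crude, assumed rule)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def word_set(term):
    """Return the set of normalized words in a multi-word term."""
    return frozenset(normalize(w) for w in term.split())


def clump_terms(term_counts, min_overlap=0.5):
    """Group terms that share enough words and sum their frequencies.

    term_counts maps each noun phrase to its frequency. Two terms are
    merged when the Jaccard overlap of their word sets is at least
    min_overlap (an assumed threshold). Returns a list of
    (label, total_count, members) tuples, most frequent first.
    """
    terms = list(term_counts)
    parent = {t: t for t in terms}  # union-find forest over terms

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    words = {t: word_set(t) for t in terms}
    for a, b in combinations(terms, 2):
        shared = len(words[a] & words[b])
        total = len(words[a] | words[b])
        if total and shared / total >= min_overlap:
            parent[find(a)] = find(b)

    groups = {}
    for t in terms:
        groups.setdefault(find(t), []).append(t)

    clumps = []
    for members in groups.values():
        # Label each clump with its most frequent member.
        label = max(members, key=lambda t: term_counts[t])
        total_count = sum(term_counts[t] for t in members)
        clumps.append((label, total_count, sorted(members)))
    return sorted(clumps, key=lambda c: c[1], reverse=True)


if __name__ == "__main__":
    counts = {
        "fuel cell": 10,
        "fuel cells": 4,
        "fuel cell system": 3,
        "solar cell": 5,
        "membrane": 2,
    }
    for label, total, members in clump_terms(counts):
        print(label, total, members)
```

With these made-up counts, the three "fuel cell" variants merge into one clump with a combined frequency of 17, while "solar cell" stays separate because it shares only one of three distinct words with "fuel cell". A real rule set would likely need domain-specific exceptions that a single overlap threshold cannot capture.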