Semantic similarity measure of Polish nouns based on linguistic features

Authors:
Maciej Piasecki; Bartosz Broda
Affiliations:
Institute of Applied Informatics, Wrocław University of Technology, Wrocław, Poland; Institute of Applied Informatics, Wrocław University of Technology, Wrocław, Poland
Venue:
BIS'07 Proceedings of the 10th international conference on Business information systems
Year:
2007

Citing 12

Experiment on linguistically-based term associations. Information Processing and Management: an International Journal
Foundations of statistical natural language processing
Conceptual Spaces: The Geometry of Thought
Automatic Detection of Thesaurus relations for Information Retrieval Applications. Foundations of Computer Science: Potential - Theory - Cognition, to Wilfried Brauer on the occasion of his sixtieth birthday
Automatic word sense discrimination. Computational Linguistics - Special issue on word sense disambiguation
Similarity-based estimation of word cooccurrence probabilities. ACL '94 Proceedings of the 32nd annual meeting on Association for Computational Linguistics
Geometry and Meaning
Automatic Discovery of Part-Whole Relations. Computational Linguistics
Intelligent Information Processing and Web Mining: Proceedings of the International IIS: IIPWM'06 Conference held in Ustron, Poland, June 19-22, 2006 (Advances in Soft Computing)
Espresso: leveraging generic patterns for automatically harvesting semantic relations. ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
New experiments in distributional representations of synonymy. CONLL '05 Proceedings of the Ninth Conference on Computational Natural Language Learning
Effective architecture of the Polish tagger. TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue

Cited 5

Rank-Based Transformation in Measuring Semantic Relatedness. Canadian AI '09 Proceedings of the 22nd Canadian Conference on Artificial Intelligence: Advances in Artificial Intelligence
Automatic selection of heterogeneous syntactic features in semantic similarity of Polish nouns. TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
Correction of medical handwriting OCR based on semantic similarity. IDEAL'07 Proceedings of the 8th international conference on Intelligent data engineering and automated learning
Towards semi-automatic extraction of lexical semantics relations for Polish. International Journal of Intelligent Information and Database Systems
WCCL: a morpho-syntactic feature toolkit. TSD'11 Proceedings of the 14th international conference on Text, speech and dialogue

Abstract

A word-to-word similarity function automatically extracted from a text corpus can be a very helpful tool in the automatic extraction of lexical semantic relations. Many approaches exist for English, but only a few for inflectional languages with almost free word order. This paper proposes a method for constructing a similarity function for Polish nouns. The method uses only simple language processing tools (e.g. it does not need a parser). Its core is the construction of a matrix of noun-adjective co-occurrences, built by applying morpho-syntactic constraints that test agreement between an adjective and a noun. Several methods of transforming the matrix and calculating the similarity function are presented. The achieved accuracy of 81.15% in the WordNet-based Synonymy Test (for 4 611 Polish nouns, using the current version of Polish WordNet) seems to be comparable with the best results reported for English (e.g. 75.8% [5]).
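
The general idea of a noun-adjective co-occurrence matrix can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `pairs` list, the function names, and the choice of cosine similarity over raw counts are all assumptions made for illustration. The abstract only states that several matrix transformations and similarity calculations were examined, without naming them. The sketch also assumes that the morpho-syntactic agreement check has already been applied, so every pair in the input is an adjective that agrees with its noun. The last function mimics one question of a synonymy test: given a noun and several candidates, pick the candidate with the highest similarity.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: (noun, adjective) pairs assumed to have already
# passed an agreement check (case, number, gender). Not from the paper.
pairs = [
    ("dom", "duży"), ("dom", "stary"), ("dom", "nowy"),
    ("budynek", "duży"), ("budynek", "stary"), ("budynek", "wysoki"),
    ("pies", "duży"), ("pies", "wierny"), ("pies", "głodny"),
]


def build_matrix(pairs):
    """Count how often each noun co-occurs with each agreeing adjective."""
    matrix = defaultdict(Counter)
    for noun, adjective in pairs:
        matrix[noun][adjective] += 1
    return matrix


def cosine(row_a, row_b):
    """Cosine similarity of two sparse rows (assumed measure, raw counts)."""
    shared = row_a.keys() & row_b.keys()
    dot = sum(row_a[k] * row_b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in row_a.values()))
    norm_b = sqrt(sum(v * v for v in row_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer_synonymy_question(matrix, question, candidates):
    """Pick the candidate noun most similar to the question noun."""
    return max(candidates, key=lambda c: cosine(matrix[question], matrix[c]))


matrix = build_matrix(pairs)
best = answer_synonymy_question(matrix, "dom", ["pies", "budynek"])
print(best)
```

In this toy data "dom" (house) shares two adjectives with "budynek" (building) and only one with "pies" (dog), so "budynek" is selected. A realistic setup would weight or transform the counts before comparing rows, which is the part the paper evaluates in several variants.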