Markov and fuzzy models for written language verification

Authors: Dat T. Tran; Tuan D. Pham
Affiliations: School of Information Sciences and Engineering, University of Canberra, Canberra, ACT, Australia; School of Information Technology, James Cook University, Townsville, QLD, Australia
Venue: FS'05 Proceedings of the 6th WSEAS international conference on Fuzzy systems
Year: 2005

Abstract

This paper presents a computational algorithm for machine classification of written languages. It combines a Markov chain-based method for building language models with a normalization method based on fuzzy set theory for verifying the language of a document. Each word in a document is represented as a Markov chain of alphabetical letters. The initial and transition probabilities are estimated from training data, and the resulting set of probabilities is referred to as the model of that language. Given an unknown text document and a claimed language identity, a similarity score based on fuzzy set theory is calculated and compared with a preset threshold. If the score meets the threshold, the identity claim is accepted; otherwise it is rejected. The proposed fuzzy normalization method is more effective for machine learning than the non-fuzzy normalization method, which has been widely used for speaker verification. Experimental results on verifying a set of seven closely related languages written in Roman script show that the proposed method is promising.
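
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the fuzzy similarity score, so the sketch uses a simple non-fuzzy normalization instead: the claimed model's average log-likelihood minus the best competing model's. The names `train_model`, `avg_log_likelihood`, and `verify`, the add-one smoothing, the restriction to the letters a-z, and the threshold of 0.0 are all assumptions made for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumption: unaccented letters only


def words_of(text):
    """Lowercase the text and split it into runs of a-z (other characters separate words)."""
    cleaned = "".join(c if c in ALPHABET else " " for c in text.lower())
    return cleaned.split()


def train_model(text, smoothing=1.0):
    """Estimate log initial and log transition probabilities over letters.

    Add-k smoothing is an assumption here; it keeps unseen letter pairs
    from receiving zero probability.
    """
    init = defaultdict(float)
    trans = defaultdict(lambda: defaultdict(float))
    for word in words_of(text):
        init[word[0]] += 1
        for a, b in zip(word, word[1:]):
            trans[a][b] += 1
    n = len(ALPHABET)
    init_total = sum(init.values()) + smoothing * n
    log_init = {c: math.log((init[c] + smoothing) / init_total) for c in ALPHABET}
    log_trans = {}
    for a in ALPHABET:
        row_total = sum(trans[a].values()) + smoothing * n
        log_trans[a] = {
            b: math.log((trans[a][b] + smoothing) / row_total) for b in ALPHABET
        }
    return log_init, log_trans


def avg_log_likelihood(text, model):
    """Average per-letter log-likelihood of the text, treating each word as a Markov chain."""
    log_init, log_trans = model
    total, count = 0.0, 0
    for word in words_of(text):
        total += log_init[word[0]]
        count += 1
        for a, b in zip(word, word[1:]):
            total += log_trans[a][b]
            count += 1
    return total / count if count else float("-inf")


def verify(text, claimed, models, threshold=0.0):
    """Accept the claim if the normalized score reaches the threshold.

    Non-fuzzy stand-in for the paper's score: claimed log-likelihood minus
    the best competing log-likelihood. Requires at least one other model.
    """
    claimed_ll = avg_log_likelihood(text, models[claimed])
    best_other = max(
        avg_log_likelihood(text, m) for name, m in models.items() if name != claimed
    )
    score = claimed_ll - best_other
    return score >= threshold, score
```

As a usage example, one could train a model per language with `models = {"english": train_model(english_text), "french": train_model(french_text)}` and then call `verify(document, "english", models)`. In practice the threshold would be tuned on held-out data rather than fixed at 0.0.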