Survey of the state of the art in human language technology
Automatic language identification
Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms
Fuzzy programming problem in the weakly structurable dynamic system and choice of decisions
WSEAS Transactions on Systems and Control
WSEAS Transactions on Information Science and Applications
Fuzzy covering problem based on the expert valuations
MMACTEE'09 Proceedings of the 11th WSEAS international conference on Mathematical methods and computational techniques in electrical engineering
This paper presents a computational algorithm for machine classification of written languages. It uses a Markov chain-based method to build language models and a normalization method based on fuzzy set theory to verify the language of a document. For a document in a given language, each word is represented as a Markov chain of alphabetical letters. The initial probabilities and transition probabilities are estimated from the training data, and the resulting set of probabilities is referred to as the model of that language. Given an unknown text document and a claimed language identity, a similarity score based on fuzzy set theory is calculated and compared with a preset threshold. If the score meets the threshold, the identity claim is accepted; otherwise it is rejected. The proposed fuzzy normalization method is more effective for machine learning than the non-fuzzy normalization method, which has been widely used for speaker verification. Experimental results on verifying a set of seven closely related languages written in the Roman alphabet show the promise of the proposed method.
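The pipeline described in the abstract (letter-level Markov models, a normalized score, a threshold test) can be illustrated with a minimal Python sketch. The abstract does not give the exact fuzzy normalization formula, so the score below is an assumption: it borrows the fuzzy c-means membership form, with the average negative log-likelihood under each model standing in for a distance. The function names, the add-alpha smoothing, the 26-letter alphabet, the fuzziness parameter `m`, and the threshold value are all hypothetical choices made for illustration, not the authors' settings.

```python
import math
from collections import defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz"  # assumption: unaccented Roman letters only


def words_of(text):
    """Lower-case the text and split it into words made of ALPHABET letters."""
    cleaned = "".join(c if c in ALPHABET else " " for c in text.lower())
    return cleaned.split()


def train_model(text, alpha=1.0):
    """Estimate initial and transition probabilities (add-alpha smoothing)."""
    init = defaultdict(float)
    trans = defaultdict(lambda: defaultdict(float))
    for w in words_of(text):
        init[w[0]] += 1                      # first letter of each word
        for a, b in zip(w, w[1:]):
            trans[a][b] += 1                 # letter-to-letter transition
    n = len(ALPHABET)
    init_total = sum(init.values())
    model_init = {c: (init[c] + alpha) / (init_total + alpha * n) for c in ALPHABET}
    model_trans = {}
    for a in ALPHABET:
        row_total = sum(trans[a].values())
        model_trans[a] = {
            b: (trans[a][b] + alpha) / (row_total + alpha * n) for b in ALPHABET
        }
    return model_init, model_trans


def avg_neg_log_likelihood(text, model):
    """Average negative log-probability per letter; lower means a better fit."""
    init, trans = model
    total, count = 0.0, 0
    for w in words_of(text):
        total -= math.log(init[w[0]])
        count += 1
        for a, b in zip(w, w[1:]):
            total -= math.log(trans[a][b])
            count += 1
    if count == 0:
        raise ValueError("text contains no usable letters")
    return total / count


def fuzzy_score(text, claimed, models, m=2.0):
    """Assumed fuzzy-membership score of the claimed language, in (0, 1]."""
    d = {name: avg_neg_log_likelihood(text, mdl) for name, mdl in models.items()}
    exponent = 1.0 / (m - 1.0)
    return 1.0 / sum((d[claimed] / d[other]) ** exponent for other in d)


def verify(text, claimed, models, threshold=0.5):
    """Accept the identity claim when the score reaches the preset threshold."""
    return fuzzy_score(text, claimed, models) >= threshold


# Toy training data, far too small for real use.
models = {
    "en": train_model("the quick brown fox jumps over the lazy dog and then "
                      "the other dog sees that this thing is with them"),
    "es": train_model("el rapido zorro marron salta sobre el perro perezoso "
                      "y luego el otro perro ve que esta cosa es con ellos"),
}
sample = "the other thing is that the dog sees them"
print(verify(sample, "en", models), verify(sample, "es", models))
```

Because the scores of all candidate languages sum to one in this formulation, a threshold of 0.5 with two models simply asks whether the claimed model fits better than the alternative. With more languages the threshold would need to be tuned on held-out data.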