Design and analysis of genetic algorithm based Chinese keyword extracting

Authors:
Kai Gao;Hua-Ping Zhang;Yun-Feng Xu;Guo-Jiang Gao;Yang-Jie Li
Affiliations:
School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang city, Hebei Province 050051, China (all authors)
Venue:
International Journal of Computer Applications in Technology
Year:
2013

Abstract

Analysing web data and extracting useful knowledge from it effectively is becoming increasingly important. Because weighted keywords can be regarded as condensed versions of documents, this paper presents a novel Chinese keyword extraction algorithm based on a genetic algorithm, combined with paragraph analysis, Chinese word segmentation, and the processing of synonyms and unlisted terms. Trained by the genetic algorithm and guided by keywords that experts extracted manually, the approach can produce optimised and useful results, especially in some domains. It can also be used to train the term weights within the lexicons. The experimental results and analysis show the feasibility of the approach.
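The abstract does not specify the chromosome encoding, the fitness function, or the term features, so the following is only a minimal sketch of the general idea: a genetic algorithm searches for feature weights such that the top-scoring candidate terms of each document agree with the keywords chosen by experts. The toy data, the three features (normalised frequency, title occurrence, first-paragraph occurrence), and the names `DOCS`, `top_k`, `fitness`, and `evolve` are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical toy corpus. Each candidate term carries three assumed features:
# (normalised frequency, appears in title, appears in first paragraph).
# "expert" holds the keywords a human annotator selected for that document.
DOCS = [
    {
        "terms": {
            "genetic": (0.9, 1.0, 1.0),
            "keyword": (0.8, 1.0, 0.0),
            "paper": (1.0, 0.0, 1.0),
            "result": (0.7, 0.0, 0.0),
        },
        "expert": {"genetic", "keyword"},
    },
    {
        "terms": {
            "segmentation": (0.6, 1.0, 1.0),
            "lexicon": (0.5, 1.0, 0.0),
            "method": (1.0, 0.0, 1.0),
            "show": (0.9, 0.0, 0.0),
        },
        "expert": {"segmentation", "lexicon"},
    },
]


def top_k(weights, terms, k):
    """Return the k terms with the highest weighted feature score."""
    ranked = sorted(
        terms,
        key=lambda t: sum(w * f for w, f in zip(weights, terms[t])),
        reverse=True,
    )
    return set(ranked[:k])


def fitness(weights, docs):
    """Mean fraction of expert keywords recovered among the top-k terms."""
    total = 0.0
    for doc in docs:
        k = len(doc["expert"])
        total += len(top_k(weights, doc["terms"], k) & doc["expert"]) / k
    return total / len(docs)


def evolve(docs, n_features=3, pop_size=20, generations=30,
           mutation_rate=0.2, seed=0):
    """Evolve a weight vector with elitism, one-point crossover and mutation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the better half of the population unchanged (elitism).
        pop.sort(key=lambda w: fitness(w, docs), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clamped so weights stay in [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: fitness(w, docs))


if __name__ == "__main__":
    best = evolve(DOCS)
    print("weights:", [round(w, 2) for w in best])
    print("fitness:", fitness(best, DOCS))
```

In this sketch the expert keywords enter only through the fitness function, which is one plausible reading of "the lead of the extracted terms results given by the experts"; a real system would compute the features from segmented Chinese text rather than from a hand-written table.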