Expert Systems with Applications: An International Journal
Numerous news articles about new computer-related products are available on the Internet. Information extraction and automatic text summarization are necessary to use these articles effectively. This paper shows that estimating four sentence types (HATSUBAI [sales], SHIYO [specifications], KOZO [structure], KINO [function]) is effective as preprocessing for information extraction and automatic text summarization. It also introduces a technique for estimating these sentence types with a decision tree. The attributes of this decision tree are not proper nouns or technical terms but verbal nouns and numeratives at the end of sentences, together with other general words. Because sub-setting (grouping) attribute values is important when building the decision tree, the sub-setting procedure of C4.5, a representative decision tree algorithm, was revised: the gain ratio criterion was changed, and the hill climbing method was replaced with a genetic algorithm. A decision tree was built from 1539 sentences used as learning data, and 299 sentences were then classified by the tree as test data. With unrevised C4.5, 81 of the 299 test sentences (about 27.1%) were estimated incorrectly; after the sub-setting was revised, this number decreased to 70 (about 23.4%).
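As a rough illustration of what sub-setting with a genetic algorithm might look like, the sketch below searches for a grouping of attribute values that maximizes the standard C4.5 gain ratio. It is not the method of the paper: the abstract does not say how the gain ratio criterion was changed, so the unmodified criterion is used here, and the function names (`gain_ratio`, `ga_subsetting`), the GA settings, and the toy sentence-final words are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(values, labels, grouping):
    """Standard (unrevised) C4.5 gain ratio for a split in which
    `grouping` maps each attribute value to a branch id."""
    branches = {}
    for v, y in zip(values, labels):
        branches.setdefault(grouping[v], []).append(y)
    total = len(labels)
    remainder = sum(len(b) / total * entropy(b) for b in branches.values())
    split_info = -sum(len(b) / total * math.log2(len(b) / total)
                      for b in branches.values())
    if split_info == 0:  # every sample falls in one branch: no real split
        return 0.0
    return (entropy(labels) - remainder) / split_info

def ga_subsetting(values, labels, max_groups=3, pop_size=20,
                  generations=30, seed=0):
    """Hypothetical GA: a chromosome assigns a branch id to each distinct
    attribute value; fitness is the gain ratio of the resulting split."""
    rng = random.Random(seed)
    distinct = sorted(set(values))

    def fitness(chrom):
        return gain_ratio(values, labels, dict(zip(distinct, chrom)))

    pop = [[rng.randrange(max_groups) for _ in distinct] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]        # truncation selection, keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(distinct)) if len(distinct) > 1 else 0
            child = a[:cut] + b[cut:]         # one-point crossover
            if rng.random() < 0.2:            # occasional point mutation
                child[rng.randrange(len(child))] = rng.randrange(max_groups)
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return dict(zip(distinct, best)), fitness(best)

# Toy data (invented): a sentence-final word per sentence and its sentence type.
words = ["release", "launch", "release", "launch", "equip", "mount", "equip", "mount"]
types = ["HATSUBAI"] * 4 + ["KOZO"] * 4

grouping, score = ga_subsetting(words, types)
print(grouping, round(score, 3))
```

In this toy data, putting "release" and "launch" in one branch and "equip" and "mount" in another separates the two sentence types perfectly, while splitting every word into its own branch is penalized by a larger split information term. That trade-off is why the way attribute values are grouped affects the quality of the resulting tree.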