Feature space design is a critical part of machine learning. It is especially difficult in text classification, where an arbitrary number of features of varying complexity can be extracted from documents as a preprocessing step. Researchers have consistently struggled to balance the expressiveness of features against the size of the resulting feature space, because data sparsity worsens as the feature space grows. Drawing on past successes of genetic programming on similar problems outside text classification, we propose and implement a technique that constructs complex features from simpler ones and adds them to a combined feature space, which can then be used by more sophisticated machine learning classifiers. Applying this technique to a sentiment analysis problem, we show an encouraging improvement in classification accuracy with only a small, constant increase in feature space size. We also show that the features we generate carry far more predictive power than any of the simple features they contain.
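To make the idea of constructing complex features from simpler ones concrete, the following is a minimal sketch and not the implementation described here. It assumes that simple features are word-presence indicators, that complex features are Boolean expression trees over them (the `And`, `Or`, and `Not` node types are hypothetical choices), and that a candidate feature is scored by how often its value matches the document label. The documents, labels, and the `fitness` function are invented for illustration; a genetic programming search would generate and recombine many such trees, which is omitted.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical tree representation: leaves test for a simple feature,
# internal nodes combine sub-features with Boolean operators.
@dataclass(frozen=True)
class Leaf:
    name: str


@dataclass(frozen=True)
class Not:
    child: "Node"


@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"


@dataclass(frozen=True)
class Or:
    left: "Node"
    right: "Node"


Node = Union[Leaf, Not, And, Or]


def evaluate(node, doc):
    """Return the Boolean value of a feature tree on one document (a set of words)."""
    if isinstance(node, Leaf):
        return node.name in doc
    if isinstance(node, Not):
        return not evaluate(node.child, doc)
    if isinstance(node, And):
        return evaluate(node.left, doc) and evaluate(node.right, doc)
    if isinstance(node, Or):
        return evaluate(node.left, doc) or evaluate(node.right, doc)
    raise TypeError(f"unknown node type: {type(node).__name__}")


def fitness(node, docs, labels):
    """Assumed stand-in fitness: fraction of documents where the feature equals the label."""
    hits = sum(evaluate(node, d) == y for d, y in zip(docs, labels))
    return hits / len(docs)


def extend_feature_space(base_features, constructed, doc):
    """Feature vector for one document: simple features followed by constructed ones."""
    row = [name in doc for name in base_features]
    row += [evaluate(node, doc) for node in constructed]
    return row


# Invented toy data: True marks a positive review.
docs = [
    frozenset({"good", "plot"}),
    frozenset({"not", "good"}),
    frozenset({"great"}),
    frozenset({"not", "great", "bad"}),
    frozenset({"boring"}),
]
labels = [True, False, True, False, False]

# One constructed feature: (good OR great) AND NOT not
combined = And(Or(Leaf("good"), Leaf("great")), Not(Leaf("not")))

base = ["good", "great", "not"]
combined_score = fitness(combined, docs, labels)
simple_scores = {name: fitness(Leaf(name), docs, labels) for name in base}
first_row = extend_feature_space(base, [combined], docs[0])
```

On this toy data the constructed feature matches every label, while each simple feature it contains matches at most three of the five, and the combined feature space grows by exactly one column per constructed feature. Real data would not separate this cleanly.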