Suffix arrays: a new method for on-line string searches
SIAM Journal on Computing
BoosTexter: A Boosting-based Systemfor Text Categorization
Machine Learning - Special issue on information retrieval
Linear Programming Boosting via Column Generation
Machine Learning
Discovering Best Variable-Length-Don't-Care Patterns
DS '02 Proceedings of the 5th International Conference on Discovery Science
Linear-Time Longest-Common-Prefix Computation in Suffix Arrays and Its Applications
CPM '01 Proceedings of the 12th Annual Symposium on Combinatorial Pattern Matching
Color Set Size Problem with Application to String Matching
CPM '92 Proceedings of the Third Annual Symposium on Combinatorial Pattern Matching
A practical algorithm to find the best subsequence patterns
Theoretical Computer Science
Text classification using string kernels
The Journal of Machine Learning Research
Replacing suffix trees with enhanced suffix arrays
Journal of Discrete Algorithms - SPIRE 2002
Convex Optimization
An O(N^2) Algorithm for Discovering Optimal Boolean Pattern Pairs
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Fast and space efficient string kernels using suffix arrays
ICML '06 Proceedings of the 23rd international conference on Machine learning
A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Fast logistic regression for text categorization with variable-length n-grams
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
ALT '08 Proceedings of the 19th international conference on Algorithmic Learning Theory
Linear pattern matching algorithms
SWAT '73 Proceedings of the 14th Annual Symposium on Switching and Automata Theory (swat 1973)
Linear Programming Boosting by Column and Row Generation
DS '09 Proceedings of the 12th International Conference on Discovery Science
Linear-time construction of suffix arrays
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Space efficient linear time construction of suffix arrays
CPM'03 Proceedings of the 14th annual conference on Combinatorial pattern matching
Simple linear work suffix array construction
ICALP'03 Proceedings of the 30th international conference on Automata, languages and programming
Hi-index | 0.00 |
In this paper, we consider finding a small set of substring patterns which classifies the given documents well. We formulate the problem as 1 norm soft margin optimization problem where each dimension corresponds to a substring pattern. Then we solve this problem by using LPBoost and an optimal substring discovery algorithm. Since the problem is a linear program, the resulting solution is likely to be sparse, which is useful for feature selection. We evaluate the proposed method for real data such as movie reviews.