We present our winning solution to Dedicated Task 1 of the Nokia Mobile Data Challenge (MDC). The task is to infer the semantic category of a place from the smartphone sensing data collected at that place. We approach it in a standard supervised learning setting: we extract discriminative features from the sensor data and train state-of-the-art classifiers (SVM, Logistic Regression, and the Decision Tree family) to build classification models. We find that feature engineering, that is, constructing features using human heuristics, is very effective for this task. In particular, we propose a novel feature engineering technique, Conditional Feature (CF), a general framework for domain-specific feature construction. In total we generated 2,796,200 features, and for our final five submissions we used feature selection to choose between 100 and 2,000 of them. One of our key findings is that features conditioned on fine-grained time intervals, e.g. every 30 min, are the most effective. Our best 10-fold cross-validation accuracy on the training set is 75.1%, obtained with Gradient Boosted Trees; the second best is 74.6%, obtained with L1-regularized Logistic Regression. We also briefly report our experience of using the F# language for large-scale (~70 GB of raw text data) conditional feature construction.
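The abstract does not spell out how a Conditional Feature is computed, so the following is only an illustrative sketch of one plausible reading: count how often a sensor value is observed at a place, conditioned on the 30-minute time-of-day interval in which it was recorded. The function names (`time_bin`, `conditional_features`), the feature-name format, and the sample observations are all hypothetical and are not taken from the paper or from the MDC data.

```python
from collections import Counter
from datetime import datetime


def time_bin(ts, minutes=30):
    """Index of the time-of-day interval containing ts (0..47 for 30 min bins)."""
    return (ts.hour * 60 + ts.minute) // minutes


def conditional_features(observations, minutes=30):
    """Fraction of a place's observations per (value, time bin) pair.

    observations: iterable of (timestamp, value) pairs recorded at one place.
    This is an assumed formulation, not the paper's definition.
    """
    counts = Counter()
    total = 0
    for ts, value in observations:
        counts[f"{value}@bin={time_bin(ts, minutes)}"] += 1
        total += 1
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


# Hypothetical observations for a single place (made-up values).
obs = [
    (datetime(2012, 1, 2, 8, 40), "charging"),
    (datetime(2012, 1, 3, 8, 55), "charging"),
    (datetime(2012, 1, 3, 23, 10), "silent"),
    (datetime(2012, 1, 4, 8, 31), "silent"),
]
features = conditional_features(obs)
```

Under this reading, every distinct combination of value and interval becomes its own feature, which suggests how the feature count can grow into the millions and why a feature selection step is needed before training.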