Coded data sets are commonly used as compact representations of real-world processes. Such data sets have been studied in research fields ranging from association mining, data warehousing, knowledge discovery, and collaborative filtering to machine learning. However, previous studies on coded data sets have introduced methods for the analysis of rather small data sets. This study proposes applying information retrieval to enable high-performance analysis of data volumes that scale beyond traditional approaches. Part of this PhD study focuses on a new type of kernel projection function that can be used to find similarities in sparse discrete data spaces. The study presents experimental results showing how information retrieval indexes scale and outperform two common relational data schemas on a leading commercial DBMS for market basket analysis.
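
To make the idea of using an information retrieval index for market basket analysis concrete, the following is a minimal sketch and not the method evaluated in the study. It assumes baskets are sets of item codes, builds an inverted index from item code to basket ids, and ranks baskets by cosine similarity over binary item vectors. The function names (`build_inverted_index`, `similar_baskets`) and the sample data are hypothetical, and plain cosine similarity stands in for the kernel projection functions, which the abstract does not specify.

```python
from collections import defaultdict
from math import sqrt


def build_inverted_index(baskets):
    """Map each item code to the set of basket ids that contain it."""
    index = defaultdict(set)
    for basket_id, items in baskets.items():
        for item in set(items):
            index[item].add(basket_id)
    return index


def similar_baskets(query_items, index, basket_sizes):
    """Rank baskets by cosine similarity to a query set of item codes.

    Baskets are treated as binary vectors, so the dot product is the
    number of shared items and each norm is sqrt(number of items).
    Only baskets sharing at least one item with the query are visited.
    """
    query = set(query_items)
    overlap = defaultdict(int)
    for item in query:
        # Posting list lookup: baskets that contain this item.
        for basket_id in index.get(item, ()):
            overlap[basket_id] += 1
    scores = {
        basket_id: shared / sqrt(len(query) * basket_sizes[basket_id])
        for basket_id, shared in overlap.items()
    }
    # Highest score first; ties broken by basket id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data for illustration only.
baskets = {
    "b1": ["milk", "bread", "butter"],
    "b2": ["milk", "bread"],
    "b3": ["beer", "chips"],
}
index = build_inverted_index(baskets)
basket_sizes = {b: len(set(items)) for b, items in baskets.items()}
ranking = similar_baskets(["milk", "bread"], index, basket_sizes)
```

Because the lookup touches only the posting lists of the query items, baskets with no item in common (here `b3`) are never scored, which is one plausible reason an index of this kind could scale better on sparse data than a full relational scan.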