The use of set-valued objects is becoming increasingly commonplace in modern application domains such as multimedia, genetics, and the stock market. Recent research on set indexing has focused mainly on containment joins and data mining, without considering basic set operations on set-valued attributes. In this paper, we propose a novel indexing scheme for processing superset, subset, and equality queries on set-valued attributes. The proposed index structure is a hybrid of an itemset-transaction set tree over "frequent items" and an inverted list over "infrequent items", and it takes advantage of developments in itemset research in data mining. In this hybrid scheme, the expectation is that basic set operations on frequent, low-cardinality sets will yield superior retrieval performance, while the high cost of constructing and maintaining an itemset tree for infrequent, large itemsets is avoided. We demonstrate, through extensive experiments, that the proposed method performs as expected and yields superior overall performance compared to the state-of-the-art indexing scheme for set-valued attributes, i.e., inverted lists.
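To make the three query types concrete, the following is a minimal Python sketch of the inverted-list baseline that the abstract compares against. It is not the proposed hybrid index, whose details are not given here. The class and method names are hypothetical, and the query direction is an assumption: a superset query returns the records whose set contains every query item, and a subset query returns the records whose set is contained in the query set. The sketch also assumes each record id is added only once.

```python
from collections import defaultdict


class InvertedSetIndex:
    """Illustrative inverted-list index over set-valued records (hypothetical names)."""

    def __init__(self):
        self.postings = defaultdict(set)  # item -> ids of records containing it
        self.sizes = {}                   # record id -> cardinality of its set

    def add(self, rid, items):
        # Assumes rid has not been added before.
        items = set(items)
        self.sizes[rid] = len(items)
        for item in items:
            self.postings[item].add(rid)

    def superset_query(self, query):
        """Records whose set contains every item of the query."""
        query = set(query)
        if not query:
            return set(self.sizes)  # every set contains the empty set
        # Intersect posting lists, shortest first, and stop early when empty.
        lists = sorted((self.postings.get(i, set()) for i in query), key=len)
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
            if not result:
                break
        return result

    def equality_query(self, query):
        """Records whose set equals the query set."""
        query = set(query)
        candidates = self.superset_query(query)
        return {rid for rid in candidates if self.sizes[rid] == len(query)}

    def subset_query(self, query):
        """Records whose set is contained in the query set."""
        hits = defaultdict(int)
        for item in set(query):
            for rid in self.postings.get(item, ()):
                hits[rid] += 1
        # A record qualifies when all of its items were hit by the query.
        result = {rid for rid, n in hits.items() if n == self.sizes[rid]}
        # Empty records are subsets of any query and never appear in a posting list.
        result |= {rid for rid, size in self.sizes.items() if size == 0}
        return result
```

In this sketch a superset query touches only the posting lists of the query items, while a subset query must count hits per record and compare them with the stored cardinality, which illustrates why the query types can differ in cost on a plain inverted list.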