A hybrid index structure for set-valued attributes using itemset tree and inverted list

Authors:
Shahriyar Hossain;Hasan Jamil
Affiliations:
Department of Computer Science, Wayne State University, Michigan;Department of Computer Science, Wayne State University, Michigan
Venue:
DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part I
Year:
2010

Abstract

Set-valued objects are becoming increasingly commonplace in modern application domains such as multimedia, genetics, and the stock market. Recent research on set indexing has focused mainly on containment joins and data mining, without considering basic set operations on set-valued attributes. In this paper, we propose a novel indexing scheme for processing superset, subset, and equality queries on set-valued attributes. The proposed index structure is a hybrid of an itemset-transaction set tree of "frequent items" and an inverted list of "infrequent items", and it takes advantage of developments in itemset research in data mining. The expectation for this hybrid scheme is that basic set operations on frequent, low-cardinality sets will yield superior retrieval performance, while the high cost of constructing and maintaining an itemset tree for infrequent, large itemsets is avoided. We demonstrate, through extensive experiments, that the proposed method performs as expected and yields superior overall performance compared to the state-of-the-art indexing scheme for set-valued attributes, i.e., inverted lists.
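To make the frequent/infrequent split in the abstract concrete, the following is a minimal Python sketch, not the authors' data structure. It assumes that an item is "frequent" when it occurs in at least `min_support` records, it replaces the itemset-transaction set tree with a flat dictionary from each record's frequent itemset to the record ids sharing it, and it keeps ordinary inverted lists for the infrequent items. The class name `HybridSetIndex`, the parameter `min_support`, and the method names are hypothetical. It also assumes that a superset query returns records containing every query item and that a subset query returns records fully contained in the query set.

```python
from collections import Counter, defaultdict


class HybridSetIndex:
    """Illustrative hybrid index; names and layout are assumptions, not the paper's."""

    def __init__(self, records, min_support):
        self.records = [frozenset(r) for r in records]
        counts = Counter(item for r in self.records for item in r)
        # Assumption: "frequent" means support >= min_support.
        self.frequent = {i for i, c in counts.items() if c >= min_support}
        # Flat stand-in for the itemset tree: frequent itemset -> record ids.
        self.groups = defaultdict(set)
        # Inverted lists for infrequent items: item -> record ids.
        self.inverted = defaultdict(set)
        # Number of infrequent items per record, used by subset queries.
        self.n_infrequent = []
        for rid, r in enumerate(self.records):
            self.groups[r & self.frequent].add(rid)
            rare = r - self.frequent
            self.n_infrequent.append(len(rare))
            for item in rare:
                self.inverted[item].add(rid)

    def superset(self, query):
        """Ids of records that contain every item of the query."""
        q = frozenset(query)
        qf, qi = q & self.frequent, q - self.frequent
        result = set()
        for itemset, rids in self.groups.items():
            if qf <= itemset:
                result |= rids
        for item in qi:
            result &= self.inverted.get(item, set())
        return result

    def subset(self, query):
        """Ids of records whose items all appear in the query."""
        q = frozenset(query)
        qf, qi = q & self.frequent, q - self.frequent
        # Count how many infrequent items of each record the query covers.
        hits = Counter()
        for item in qi:
            for rid in self.inverted.get(item, ()):
                hits[rid] += 1
        result = set()
        for itemset, rids in self.groups.items():
            if itemset <= qf:
                result |= {r for r in rids if hits[r] == self.n_infrequent[r]}
        return result

    def equality(self, query):
        """Ids of records whose item set equals the query."""
        q = frozenset(query)
        return {r for r in self.superset(q) if len(self.records[r]) == len(q)}
```

In this sketch a query touches the inverted lists only for its infrequent items, so queries made up of frequent items are answered from the grouped itemsets alone. A real itemset tree would additionally share prefixes between itemsets instead of scanning every group.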