Static index pruning for information retrieval systems
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Algorithmics and applications of tree and graph searching
Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Mining Molecular Fragments: Finding Relevant Substructures of Molecules
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
Efficient Mining of Frequent Subgraphs in the Presence of Isomorphism
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
CloseGraph: mining closed frequent graph patterns
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Graph indexing: a frequent structure-based approach
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
A quickstart in frequent structure mining can make a difference
Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining Generalized Substructures from a Set of Labeled Graphs
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Improving Web search efficiency via a locality based static pruning method
WWW '05 Proceedings of the 14th international conference on World Wide Web
Substructure similarity search in graph databases
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
IEEE Transactions on Pattern Analysis and Machine Intelligence
A document-centric approach to static index pruning in text retrieval systems
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Multi-task text segmentation and alignment based on weighted mutual information
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
XML structural delta mining: issues and challenges
Data & Knowledge Engineering - Special issue: ER 2003
Extraction and search of chemical formulae in text documents on the web
Proceedings of the 16th international conference on World Wide Web
Fg-index: towards verification-free query processing on graph databases
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Topic segmentation with shared topic detection and alignment of multiple documents
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
A platform based on the multi-dimensional data modal for analysis of bio-molecular structures
VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Mining, indexing, and searching for textual chemical molecule information on the web
Proceedings of the 17th international conference on World Wide Web
A quantitative comparison of the subgraph miners mofa, gspan, FFSM, and gaston
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases
Using and learning semantics in frequent subgraph mining
WebKDD'05 Proceedings of the 7th international conference on Knowledge Discovery on the Web: advances in Web Mining and Web Usage Analysis
Identifying, Indexing, and Ranking Chemical Formulae and Chemical Names in Digital Documents
ACM Transactions on Information Systems (TOIS)
Ranking structural parameters for social networks
DASFAA'12 Proceedings of the 17th international conference on Database Systems for Advanced Applications
Lindex: a lattice-based index for graph databases
The VLDB Journal — The International Journal on Very Large Data Bases
Mining and indexing graphs for supergraph search
Proceedings of the VLDB Endowment
Hi-index | 0.00 |
In order to enable scalable querying of graph databases, intelligent selection of subgraphs to index is essential. An improved index can reduce response times for graph queries significantly. For a given subgraph query, graph candidates that may contain the subgraph are retrieved using the graph index and subgraph isomorphism tests are performed to prune out unsatisfied graphs. However, since the space of all possible subgraphs of the whole set of graphs is prohibitively large, feature selection is required to identify a good subset of subgraph features for indexing. Thus, one of the key issues is: given the set of all possible subgraphs of the graph set, which subset of features is the optimal such that the algorithm retrieves the smallest set of candidate graphs and reduces the number of subgraph isomorphism tests? We introduce a graph search method for subgraph queries based on subgraph frequencies. Then, we propose several novel feature selection criteria, Max-Precision, Max-Irredundant-Information, and Max-Information-Min-Redundancy, based on mutual information. Finally we show theoretically and empirically that our proposed methods retrieve a smaller candidate set than previous methods. For example, using the same number of features, our method improve the precision for the query candidate set by 4%-13% in comparison to previous methods. As a result the response time of subgraph queries also is improved correspondingly.