Many applications in structure matching need to search for graphs that are similar to a query graph, i.e., to answer similarity graph queries. Prior work, especially in chemoinformatics, has used the maximum common edge subgraph (MCEG) to compute graph similarity, but this approach is prohibitively slow for real-time queries. In this work, we propose an algorithm that extracts and indexes subgraph features from a graph dataset. It computes graph similarity with a linear graph kernel whose feature weights are learned offline from a training set generated using MCEG. We show empirically that our learning-to-rank approach achieves higher normalized discounted cumulative gain (NDCG) than existing optimal methods based on MCEG, and that it runs orders of magnitude faster than these methods.
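As a rough illustration of the scoring and evaluation steps described above, the following Python sketch ranks indexed graphs by a weighted linear kernel over subgraph feature counts and evaluates a ranking with NDCG. It is not the paper's implementation: the feature names, weights, and graph identifiers are hypothetical, the weights are fixed by hand rather than learned from MCEG-generated training data, and the NDCG variant (linear gain, log2 discount) is an assumption, since the abstract does not say which variant is used.

```python
import math
from collections import Counter


def linear_kernel(features_a, features_b, weights):
    """Weighted dot product of two sparse subgraph-feature count vectors."""
    # Iterate over the smaller vector; the product is symmetric.
    if len(features_a) > len(features_b):
        features_a, features_b = features_b, features_a
    return sum(
        weights.get(feature, 0.0) * count * features_b[feature]
        for feature, count in features_a.items()
        if feature in features_b
    )


def rank(query, index, weights):
    """Return graph ids sorted by descending kernel score (ties broken by id)."""
    scores = {gid: linear_kernel(query, feats, weights) for gid, feats in index.items()}
    return sorted(scores, key=lambda gid: (-scores[gid], gid))


def ndcg(ranked_relevances, k=None):
    """NDCG with linear gain and log2 discount (one common variant, assumed here)."""
    rels = ranked_relevances[:k] if k else ranked_relevances
    dcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(rels))
    ideal = sorted(ranked_relevances, reverse=True)[: len(rels)]
    idcg = sum(rel / math.log2(pos + 2) for pos, rel in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical feature index: graph id -> counts of extracted subgraph features.
index = {
    "g1": Counter({"C-C": 2, "C-O": 1}),
    "g2": Counter({"C-C": 1, "C=O": 1}),
    "g3": Counter({"N-H": 3}),
}
# Hypothetical weights; in the described approach these would be learned offline.
weights = {"C-C": 0.5, "C-O": 1.0, "C=O": 2.0, "N-H": 0.1}
query = Counter({"C-C": 1, "C-O": 1})

ranking = rank(query, index, weights)
```

Because each score is a sparse dot product over precomputed feature counts, query-time cost depends on the number of shared features rather than on a subgraph search, which is consistent with the speed advantage the abstract reports over MCEG-based similarity.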