Wikipedia infoboxes are an example of a seemingly structured, yet extraordinarily heterogeneous dataset, in which any given record has only a tiny fraction of all possible fields. Such data cannot be queried using traditional means without a massive a priori integration effort, since even for a simple request the result values span many record types and fields. On the other hand, solutions based on keyword search are too imprecise to capture the user's intent. To address these limitations, we propose a system, referred to herein as WikiAnalytics, that uses a novel search paradigm to derive tables of precise and complete results from Wikipedia infobox records. The user starts with a keyword search query that finds a superset of the result records, and then browses clusters of records, deciding which are and are not relevant. WikiAnalytics uses three categories of clustering features, based respectively on the record types, fields, and values that matched the query keywords. Since the system cannot predict which combination of features will be important to the user, it efficiently generates all possible clusters of records by all sets of features. We utilize a novel data structure, the universal navigational lattice (UNL), that compactly encodes all possible clusters. WikiAnalytics provides a dynamic and intuitive interface that lets the user explore the UNL and construct homogeneous structured tables, which can be further queried and aggregated using conventional tools.
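To make the idea of "all possible clusters of records by all sets of features" concrete, the following is a minimal sketch and not the UNL construction itself, whose details the abstract does not give. It assumes each matched record is described by a set of feature labels (record type, matched field, matched value). The record identifiers, the feature names, and the function `clusters_by_feature_sets` are all hypothetical. The sketch enumerates feature subsets naively, so it is exponential in the number of features per record and only illustrates what a cluster is, not how it can be computed efficiently.

```python
from itertools import combinations

# Hypothetical toy records: each maps to the set of features it exhibits.
# The labels ("type:...", "field:...", "value:...") are invented for illustration.
RECORDS = {
    "r1": {"type:City", "field:population"},
    "r2": {"type:City", "field:population", "value:Austria"},
    "r3": {"type:Country", "field:population", "value:Austria"},
}


def clusters_by_feature_sets(records):
    """Return {feature set: records having all of those features}.

    Feature sets that select exactly the same records are merged into
    their union, so every distinct cluster is listed once, under the
    largest feature set that describes it.
    """
    by_cluster = {}
    for features in records.values():
        # Only subsets of some record's features can select any record.
        for size in range(1, len(features) + 1):
            for combo in combinations(sorted(features), size):
                fs = frozenset(combo)
                cluster = frozenset(
                    rid for rid, feats in records.items() if fs <= feats
                )
                # Two feature sets with the same cluster can be unioned
                # without changing which records they select.
                by_cluster[cluster] = by_cluster.get(cluster, frozenset()) | fs
    return {fs: cluster for cluster, fs in by_cluster.items()}


if __name__ == "__main__":
    for fs, cluster in clusters_by_feature_sets(RECORDS).items():
        print(sorted(fs), "->", sorted(cluster))
```

Ordering the resulting feature sets by inclusion yields a lattice-like structure that a user could navigate from broad clusters to narrow ones; whether the UNL is organized this way is an assumption of this sketch.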