A node indexing scheme for web entity retrieval

Authors:
Renaud Delbru;Nickolai Toupikov;Michele Catasta;Giovanni Tummarello
Affiliations:
Digital Enterprise Research Institute, National University of Ireland, Galway, Galway, Ireland;Digital Enterprise Research Institute, National University of Ireland, Galway, Galway, Ireland;School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland;Digital Enterprise Research Institute, National University of Ireland, Galway, Galway, Ireland
Venue:
ESWC'10 Proceedings of the 7th International Conference on The Semantic Web: Research and Applications - Volume Part II
Year:
2010

Abstract

Motivated in part by the partial support now offered by major search engines, hundreds of millions of documents embedding semi-structured data in RDF, RDFa and Microformats are being published on the web. This scenario calls for novel information search systems that provide effective means of retrieving relevant semi-structured information. In this paper, we present an “entity retrieval system” designed to provide entity search capabilities over datasets as large as the entire Web of Data. Our system supports full-text search, semi-structural queries and top-k query results while maintaining a concise index and efficient incremental updates. We advocate the use of a node indexing scheme and show that it offers a good compromise between query expressiveness, query processing time and update complexity in comparison to three other indexing techniques. We then demonstrate how such a system can effectively answer queries over 10 billion triples on a single commodity machine.