Search engine support for software applications

Authors:
Jamie Callan
Affiliations:
Carnegie Mellon University, Pittsburgh, PA, USA
Venue:
CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
Year:
2010

Abstract

Question-answering, computer-assisted language learning, text mining, and other software applications that use a full-text search engine to find information in a large text corpus are becoming common. A software application may use metadata and text annotations to reduce the mismatch between the concept-based representations convenient for inference and the word-based representations typically used for text retrieval. Software applications may also be able to specify detailed requirements that retrieved passages must satisfy. This use of text search is very different from the ad-hoc, interactive search that information retrieval research typically studies. Search engine developers are beginning to respond by extending indexing and retrieval models developed for structured (e.g., XML) documents to support multiple representations of document content, text annotations, metadata, and relationships. These new requirements force developers to reconsider basic assumptions about index data structures and ranked retrieval models. How best to use these new capabilities is an open problem. Straightforward transformation of a detailed information need into a complex structured query can produce a query that is effective for exact-match retrieval but difficult for the retrieval model to use effectively for best-match retrieval. Bag-of-words retrieval is often disparaged, but its advantage is that it is robust: it works well even when desired documents do not exactly meet expectations. This talk discusses some of the problems encountered when extending a search engine to support structured documents with derived annotations and queries posed by other software applications.
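
The contrast the abstract draws between exact-match structured queries and robust bag-of-words retrieval can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the talk: the `Passage` type, the annotation labels, the toy passages, and the overlap-count scoring, which stands in for a real ranked retrieval model.

```python
from dataclasses import dataclass, field


@dataclass
class Passage:
    """A toy passage: its words plus hypothetical annotation labels."""
    pid: str
    words: list
    annotations: set = field(default_factory=set)


def exact_match(passages, required_words, required_annotations):
    """Return ids of passages that satisfy every constraint (Boolean AND)."""
    return [
        p.pid
        for p in passages
        if set(required_words) <= set(p.words)
        and set(required_annotations) <= p.annotations
    ]


def bag_of_words_rank(passages, query_words):
    """Rank passages by how many distinct query words they contain."""
    scored = [(len(set(query_words) & set(p.words)), p.pid) for p in passages]
    # Highest overlap first, ties broken by id; drop passages with no overlap.
    ordered = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [pid for score, pid in ordered if score > 0]


# Hypothetical corpus: no passage meets all of the detailed requirements.
passages = [
    Passage("p1", ["callan", "teaches", "at", "carnegie", "mellon"],
            {"PERSON", "ORGANIZATION"}),
    Passage("p2", ["callan", "works", "in", "pittsburgh"],
            {"PERSON", "LOCATION"}),
    Passage("p3", ["search", "engines", "index", "text"]),
]

query_words = ["callan", "carnegie", "pittsburgh"]

# The strict structured query returns nothing, because each passage
# misses at least one required word.
strict = exact_match(passages, query_words, ["PERSON", "ORGANIZATION"])

# The bag-of-words ranking still surfaces the two near misses.
ranked = bag_of_words_rank(passages, query_words)
```

In this toy setting `strict` is empty while `ranked` is `["p1", "p2"]`, which is the robustness the abstract attributes to bag-of-words retrieval: passages that do not exactly meet expectations are still returned.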