QuASM: a system for question answering using semi-structured data

Authors:
David Pinto;Michael Branstein;Ryan Coleman;W. Bruce Croft;Matthew King;Wei Li;Xing Wei
Affiliations:
University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA
Venue:
Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
Year:
2002

Abstract

This paper describes QuASM (pronounced "chasm"), a system for question answering using semi-structured metadata. Question answering systems aim to improve search performance by giving users specific answers, rather than having them scan retrieved documents for those answers. Our goal is to answer factual questions by exploiting the structure inherent in documents found on the World Wide Web (WWW). Based on this structure, documents are indexed into smaller units and associated with metadata. Transforming table cells into smaller units associated with metadata is an important part of this task. In addition, we report on work to improve question classification using language models. The system was developed on documents retrieved from a crawl of www.fedstats.gov.
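
To make the idea of turning table cells into smaller units with metadata concrete, here is a minimal Python sketch. It is not the QuASM implementation: the `CellUnit` class, the `table_to_units` function, the metadata keys, and the sample table are all hypothetical, and the sketch assumes a simple table whose first column holds row labels.

```python
from dataclasses import dataclass, field


@dataclass
class CellUnit:
    """One table cell stored as its own retrievable unit (hypothetical structure)."""
    value: str
    metadata: dict = field(default_factory=dict)


def table_to_units(caption, column_headers, rows):
    """Turn each data cell into a unit tagged with caption, row label, and column header.

    Assumes the first entry of every row is the row label and that
    column_headers lines up with the remaining entries of the row.
    """
    units = []
    for row in rows:
        row_label, cells = row[0], row[1:]
        for header, cell in zip(column_headers, cells):
            units.append(
                CellUnit(
                    value=cell,
                    metadata={"caption": caption, "row": row_label, "column": header},
                )
            )
    return units


# Made-up table, for illustration only.
units = table_to_units(
    caption="Example counts by region",
    column_headers=["Year 1", "Year 2"],
    rows=[["Region A", "100", "120"], ["Region B", "200", "210"]],
)

for unit in units:
    print(unit.value, unit.metadata)
```

Each cell becomes a small unit that carries its own context, so a question mentioning "Region A" and "Year 2" could in principle be matched against a single cell's metadata instead of the whole document.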