FAQ mining via list detection

Authors:
Yu-Sheng Lai;Kuao-Ann Fung;Chung-Hsien Wu
Affiliations:
National Cheng Kung University, Taiwan, R.O.C.;National Cheng Kung University, Taiwan, R.O.C.;National Cheng Kung University, Taiwan, R.O.C.
Venue:
MultiSumQA '02: Proceedings of the 2002 Conference on Multilingual Summarization and Question Answering - Volume 19
Year:
2002

Abstract

This paper presents an approach to FAQ mining based on a list detection algorithm. List detection is important for data collection because lists are widely used to present data and information on the Web. By analyzing how FAQs are rendered on the Web, we found that FAQs are always fully or partially presented in a list-like form. There are two ways to author a list on the Web. One is to use dedicated tags, e.g. the <LI> tag in HTML; lists authored this way can be detected easily by parsing those tags. The other is to use other tags in place of the dedicated ones. Unfortunately, many lists are authored in the second way. We therefore present a list detection algorithm that is independent of Web languages. By combining the algorithm with some domain knowledge, we detect and collect FAQs from the Web. The mining task achieved a recall of 72.54% and a precision of 80.16%.
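To make the two authoring styles concrete, here is a minimal Python sketch. It is not the algorithm from the paper, which the abstract does not specify. It assumes that a list written without dedicated tags shows up as the same short run of start tags repeated back to back. The names `start_tags`, `has_explicit_list`, and `longest_repeated_pattern`, and the thresholds `max_len` and `min_repeats`, are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser


class TagSequence(HTMLParser):
    """Collect the names of start tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        self.tags.append(tag)


def start_tags(html):
    """Return the list of start-tag names found in an HTML string."""
    parser = TagSequence()
    parser.feed(html)
    parser.close()
    return parser.tags


def has_explicit_list(html):
    """First authoring style: the page uses a dedicated list tag (<li>)."""
    return "li" in start_tags(html)


def longest_repeated_pattern(tags, max_len=4, min_repeats=3):
    """Second authoring style (illustrative heuristic only).

    Look for a tag pattern of up to max_len tags that repeats
    consecutively at least min_repeats times, and return the
    (pattern, repeats) pair covering the most tags.
    """
    best = ([], 0)
    for size in range(1, max_len + 1):
        for start in range(len(tags) - size + 1):
            pattern = tags[start:start + size]
            repeats = 1
            pos = start + size
            # Count back-to-back repetitions of the pattern.
            while tags[pos:pos + size] == pattern:
                repeats += 1
                pos += size
            covered = repeats * size
            if repeats >= min_repeats and covered > best[1] * len(best[0]):
                best = (pattern, repeats)
    return best


# An FAQ-like page written without any list tags.
faq_html = (
    "<p><b>Q1</b><br>A1</p>"
    "<p><b>Q2</b><br>A2</p>"
    "<p><b>Q3</b><br>A3</p>"
)
pattern, repeats = longest_repeated_pattern(start_tags(faq_html))
```

On this sample page `has_explicit_list` finds nothing, while the repetition heuristic still recovers the three question/answer items. A real detector would also need the domain knowledge mentioned in the abstract to tell FAQ lists apart from menus and other repeated page structure.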