Test collections model use cases in ways that facilitate evaluation of information retrieval systems. This paper describes the use of search-guided relevance assessment to create a test collection for retrieval of spontaneous conversational speech. Approximately 10,000 thematically coherent segments were manually identified in 625 hours of oral history interviews with 246 individuals. Automatic speech recognition results, manually prepared summaries, controlled vocabulary indexing, and name authority control are available for every segment. Those features were leveraged by a team of four relevance assessors to identify topically relevant segments for 28 topics developed from actual user requests. Search-guided assessment yielded sufficient inter-annotator agreement to support formative evaluation during system development. Baseline results for ranked retrieval are presented to illustrate use of the collection.
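
To make the notion of inter-annotator agreement concrete, the following is a minimal sketch of a chance-corrected agreement computation for two assessors. The abstract does not say which agreement measure was used; Cohen's kappa is assumed here as one common choice. The function name `cohen_kappa` and the judgment lists are hypothetical and are not data from the collection.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two assessors who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each assessor's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both assessors used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical judgments (1 = relevant, 0 = not relevant) on ten segments.
assessor_1 = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
assessor_2 = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohen_kappa(assessor_1, assessor_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the assessors agree on 8 of 10 segments, but because both mark most segments as not relevant, part of that agreement is expected by chance, so kappa is lower than the raw agreement rate.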