Cross-language spoken document retrieval using HMM-based retrieval model with multi-scale fusion

Authors:
Wai-Kit Lo;Helen Meng;P. C. Ching
Affiliations:
The Chinese University of Hong Kong;The Chinese University of Hong Kong;The Chinese University of Hong Kong
Venue:
ACM Transactions on Asian Language Information Processing (TALIP)
Year:
2003

Abstract

Cross-language spoken document retrieval (CL-SDR) is the automatic retrieval of relevant information from a collection of spoken documents in a language different from that of the queries. With CL-SDR, information sources in other languages become searchable, which significantly increases the number of information sources available. The HMM-based retrieval model is a probabilistic formulation of the retrieval problem, and its probabilistic nature makes it straightforward to extend. Specifically, we incorporate a translation component so that the model can perform cross-language information retrieval (CLIR), and we further extend this HMM-based CLIR model to retrieval at subword scales. In this work, the extended HMM-based retrieval model is applied to an English-Mandarin CL-SDR task: searching a Mandarin spoken document collection with English queries at word and subword scales. Retrieval results obtained at these indexing scales are then fused for multi-scale CL-SDR. Experimental results demonstrate that fusing the word and subword scales improves CL-SDR retrieval performance.
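
The abstract does not give the scoring formula, so the following is only a minimal sketch of how such a pipeline might be wired together, not the authors' formulation. It assumes a common two-state mixture form of HMM retrieval, in which each query term is generated either by the document model or by a background model, with translation probabilities folded into both. It further assumes that fusion is a weighted sum of min-max normalized scores. Every name here (`hmm_log_score`, `fuse`, `a_doc`, `w_word`), the toy translation tables, and the use of single characters as subword units are hypothetical choices made for illustration.

```python
import math
from collections import Counter

def hmm_log_score(query, doc, bg_counts, bg_size, trans, a_doc=0.7):
    """Log-likelihood of an English query given a document of Chinese units.

    Assumed form: P(q|D) = a_doc * sum_c P(q|c) P(c|D)
                         + (1 - a_doc) * sum_c P(q|c) P(c|background).
    """
    doc_counts = Counter(doc)
    doc_len = max(len(doc), 1)  # guard against empty documents
    total = 0.0
    for q in query:
        p_doc = p_bg = 0.0
        for unit, p_t in trans.get(q, {}).items():
            p_doc += p_t * doc_counts[unit] / doc_len
            p_bg += p_t * bg_counts[unit] / bg_size
        p = a_doc * p_doc + (1.0 - a_doc) * p_bg
        total += math.log(max(p, 1e-12))  # floor avoids log(0)
    return total

def score_collection(query, docs, trans):
    """Score every document, using the whole collection as the background model."""
    bg = Counter(u for doc in docs.values() for u in doc)
    size = sum(bg.values())
    return {d: hmm_log_score(query, doc, bg, size, trans) for d, doc in docs.items()}

def min_max(scores):
    """Rescale scores to [0, 1] so that different scales are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {d: 0.0 for d in scores}
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

def fuse(word_scores, subword_scores, w_word=0.5):
    """Assumed fusion rule: weighted sum of normalized word and subword scores."""
    nw, ns = min_max(word_scores), min_max(subword_scores)
    return {d: w_word * nw[d] + (1.0 - w_word) * ns[d] for d in word_scores}

# Toy data (hypothetical): two documents indexed at word and character scales.
word_docs = {"d1": ["香港", "天氣"], "d2": ["股票", "市場"]}
char_docs = {d: [ch for w in ws for ch in w] for d, ws in word_docs.items()}
word_trans = {"weather": {"天氣": 1.0}}
char_trans = {"weather": {"天": 0.5, "氣": 0.5}}

query = ["weather"]
word_scores = score_collection(query, word_docs, word_trans)
char_scores = score_collection(query, char_docs, char_trans)
fused = fuse(word_scores, char_scores)
best = max(fused, key=fused.get)
```

In this toy setup both scales prefer `d1`, so fusion changes nothing. Fusion would only matter when the word-scale and subword-scale rankings disagree, for example when a recognition error corrupts a word but leaves some of its subword units intact.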