Cross-lingual C*ST*RD: English access to Hindi information

Authors:
Anton Leuski;Chin-Yew Lin;Liang Zhou;Ulrich Germann;Franz Josef Och;Eduard Hovy
Affiliations:
Information Sciences Institute, University of Southern California;Information Sciences Institute, University of Southern California;Information Sciences Institute, University of Southern California;Information Sciences Institute, University of Southern California;Information Sciences Institute, University of Southern California;Information Sciences Institute, University of Southern California
Venue:
ACM Transactions on Asian Language Information Processing (TALIP)
Year:
2003

Citing 30

Recent trends in hierarchic document clustering: a critical review
Information Processing and Management: an International Journal

Automatic text processing

Scatter/Gather: a cluster-based approach to browsing large document collections
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval

Graph drawing by force-directed placement
Software—Practice & Experience

Constant interaction-time scatter/gather browsing of very large document collections
SIGIR '93 Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval

Optimization of relevance feedback weights
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval

A case for interaction: a study of interactive information retrieval behavior and effectiveness
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems

Reexamining the cluster hypothesis: scatter/gather on retrieval results
SIGIR '96 Proceedings of the 19th annual international ACM SIGIR conference on Research and development in information retrieval

The effects of query structure and dictionary setups in dictionary-based cross-language information retrieval
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval

Summarizing text documents: sentence selection and evaluation metrics
Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval

New Methods in Automatic Extracting
Journal of the ACM (JACM)

Relevance and reinforcement in interactive browsing
Proceedings of the ninth international conference on Information and knowledge management

Evaluating combinations of ranked lists and visualizations of inter-document similarity
Information Processing and Management: an International Journal - Special issue on interactivity at the text retrieval conference (TREC)

Evaluating document clustering for interactive information retrieval
Proceedings of the tenth international conference on Information and knowledge management

Information Retrieval

Interactive information organization: techniques and evaluation

Accurate methods for the statistics of surprise and coincidence
Computational Linguistics - Special issue on using large corpora: I

The mathematics of statistical machine translation: parameter estimation
Computational Linguistics - Special issue on using large corpora: II

Nymble: a high-performance learning name-finder
ANLC '97 Proceedings of the fifth conference on Applied natural language processing

Identifying topics by position
ANLC '97 Proceedings of the fifth conference on Applied natural language processing

The automated acquisition of topic signatures for text summarization
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1

Interactive Information Retrieval Using Clustering and Spatial Proximity
User Modeling and User-Adapted Interaction

Discriminative training and maximum entropy models for statistical machine translation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

From single to multi-document summarization: a prototype system and its evaluation
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics

Greedy decoding for statistical machine translation in almost linear time
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

Automatic evaluation of summaries using N-gram co-occurrence statistics
NAACL '03 Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1

Minimum error rate training in statistical machine translation
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1

iNeATS: interactive multi-document summarization
ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 2

Building a statistical machine translation system from scratch: how much bang for the buck can we expect?
DMMT '01 Proceedings of the workshop on Data-driven methods in machine translation - Volume 14

Generation of word graphs in statistical machine translation
EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10

Cited 3

Cross-language document summarization based on machine translation quality prediction
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Using bilingual information for cross-language document summarization
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1

Summarizing the differences in multilingual news
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval

Abstract

We present C*ST*RD, a cross-language information delivery system that supports cross-language information retrieval, information space visualization and navigation, machine translation, and text summarization of single documents and clusters of documents. C*ST*RD was assembled and trained within one month, in the context of DARPA's Surprise Language Exercise, which selected as its source a heretofore unstudied language, Hindi. Given the brief time, we could not create deep Hindi capabilities for all the modules; instead, we experimented with combining shallow Hindi capabilities, or even English-only modules, into one integrated system. Various possible configurations, with different tradeoffs in processing speed and ease of use, enable the rapid deployment of C*ST*RD to new languages under various conditions.
