This paper describes a system and set of algorithms for automatically inducing stand-alone monolingual part-of-speech taggers, base noun-phrase bracketers, named-entity taggers and morphological analyzers for an arbitrary foreign language. Case studies include French, Chinese, Czech and Spanish.

Existing text analysis tools for English are applied to bilingual text corpora, and their output is projected onto the second language via statistically derived word alignments. Simple direct annotation projection is quite noisy, however, even with optimal alignments. This paper therefore presents noise-robust tagger, bracketer and lemmatizer training procedures capable of accurate system bootstrapping from noisy and incomplete initial projections.

Applied to French, the induced stand-alone part-of-speech tagger achieves 96% core part-of-speech (POS) tag accuracy, and the corresponding induced noun-phrase bracketer exceeds 91% F-measure. The induced morphological analyzer achieves over 99% lemmatization accuracy on the complete French verbal system.

These results are particularly noteworthy in that they required no hand-annotated training data in the given language and virtually no language-specific knowledge or resources beyond raw text. Performance also significantly exceeds that obtained by direct annotation projection.
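To make the idea of direct annotation projection concrete, here is a minimal sketch in Python. It covers only the naive baseline that the abstract calls noisy, not the noise-robust training procedures the paper presents. The function name `project_tags`, the representation of an alignment as `(source_index, target_index)` pairs, the majority-vote rule for many-to-one links, and the tag labels in the example are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple


def project_tags(
    src_tags: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[Optional[str]]:
    """Naively project source-side tags onto target tokens.

    Hypothetical helper: each alignment link (s, t) lets source token s
    vote its tag onto target token t. A target token takes the most
    frequent tag among its votes; unaligned target tokens get None.
    """
    votes = [Counter() for _ in range(tgt_len)]
    for s, t in alignment:
        # Skip links that point outside either sentence.
        if 0 <= s < len(src_tags) and 0 <= t < tgt_len:
            votes[t][src_tags[s]] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


# English "the red house" aligned to French "la maison rouge".
english_tags = ["DT", "JJ", "NN"]
links = [(0, 0), (1, 2), (2, 1)]
print(project_tags(english_tags, links, tgt_len=3))
```

Target tokens with no alignment link are left as `None`, and a wrong link yields a wrong tag. These are the incomplete and noisy initial projections that a subsequent noise-robust training step would need to tolerate.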