Scatter/Gather: a cluster-based approach to browsing large document collections
SIGIR '92 Proceedings of the 15th annual international ACM SIGIR conference on Research and development in information retrieval
An Introduction to Variational Methods for Graphical Models
Machine Learning
Learning Approaches for Detecting and Tracking News Events
IEEE Intelligent Systems
Explorations within topic tracking and detection
Topic detection and tracking
Retrieval and novelty detection at the sentence level
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Implementing Agglomerative Hierarchic Clustering Algorithms for Use in Document Retrieval
Implementing Agglomerative Hierarchic Clustering Algorithms for Use in Document Retrieval
The Journal of Machine Learning Research
Geographical information recognition and visualization in texts written in various languages
Proceedings of the 2004 ACM symposium on Applied computing
Text classification and named entities for new event detection
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Language-specific models in multilingual topic tracking
Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
A probabilistic model for retrospective news event detection
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
BiTAM: bilingual topic AdMixture models for word alignment
COLING-ACL '06 Proceedings of the COLING/ACL on Main conference poster sessions
New event detection based on indexing-tree and named entity
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Cross-language information retrieval using PARAFAC2
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Bilingual topic aspect classification with a few training examples
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Mining multilingual topics from wikipedia
Proceedings of the 18th international conference on World wide web
Probabilistic latent semantic analysis
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
A cocktail approach to the VideoCLEF'09 linking task
CLEF'09 Proceedings of the 10th international conference on Cross-language evaluation forum: multimedia experiments
Identifying word translations from comparable corpora using latent topic models
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Knowledge transfer across multilingual corpora via latent topics
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part I
Semantic relations in bilingual lexicons
ACM Transactions on Speech and Language Processing (TSLP)
Combining wikipedia-based concept models for cross-language retrieval
IRFC'10 Proceedings of the First international Information Retrieval Facility conference on Adbances in Multidisciplinary Retrieval
Cross-language information retrieval with latent topic models trained on a comparable corpus
AIRS'11 Proceedings of the 7th Asia conference on Information Retrieval Technology
Detecting highly confident word translations from comparable corpora without any prior knowledge
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
ECIR'13 Proceedings of the 35th European conference on Advances in Information Retrieval
Proceedings of the 2013 international workshop on Mining unstructured big data using natural language processing
Hi-index | 0.00 |
We have studied the problem of linking event information across different languages without the use of translation systems or dictionaries. The linking is based on interlingua information obtained through probabilistic topic models trained on comparable corpora written in two languages (in our case English and Dutch). The achieve this, we expand the Latent Dirichlet Allocation model to process documents in two languages. We demonstrate the validity of the learned interlingual topics in a document clustering task, where the evaluation is performed on Google News.