Semantic Annotation of City Transportation Information Dialogues Using CRF Method

Authors:
Agnieszka Mykowiecka;Jakub Waszczuk
Affiliations:
Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland 01-237 and Polish-Japanese Institute of Information Technology, Warsaw, Poland 02-008;Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland 01-237
Venue:
TSD '09 Proceedings of the 12th International Conference on Text, Speech and Dialogue
Year:
2009

Abstract

This article presents the results of an experiment in automatic concept annotation of transliterated spontaneous human-human dialogues in the city transportation domain. The data source was a corpus of dialogues collected at a Warsaw call center and annotated with about 200 concept types. The machine learning technique used is linear-chain Conditional Random Fields (CRF), a sequence labeling approach. A model based on word lemmas in a window of length 5 recognized concepts with an F-measure of 0.85.
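
The lemma-window model mentioned in the abstract can be illustrated with a minimal sketch of feature extraction for a linear-chain CRF tagger. This is not the authors' implementation. It assumes the window of length 5 is centered on the current token (two lemmas to the left, two to the right), which the abstract does not specify. The function names, the feature keys, the padding symbol, and the English sample utterance are all hypothetical.

```python
from typing import Dict, List

# Hypothetical padding symbol for window positions outside the utterance.
PAD = "<PAD>"


def window_features(lemmas: List[str], i: int, size: int = 5) -> Dict[str, str]:
    """Collect lemma features for position i from a centered window.

    Assumption: the window is symmetric around the current token,
    so size=5 covers offsets -2..+2.
    """
    half = size // 2
    feats: Dict[str, str] = {}
    for offset in range(-half, half + 1):
        j = i + offset
        # Use the padding symbol when the window runs past either end.
        feats[f"lemma[{offset:+d}]"] = lemmas[j] if 0 <= j < len(lemmas) else PAD
    return feats


def utterance_features(lemmas: List[str], size: int = 5) -> List[Dict[str, str]]:
    """Build one feature dictionary per token of a lemmatized utterance."""
    return [window_features(lemmas, i, size) for i in range(len(lemmas))]


# Hypothetical lemmatized utterance (an English stand-in for Polish lemmas).
lemmas = ["bus", "to", "central", "station"]
features = utterance_features(lemmas)
```

Each resulting dictionary would be paired with a concept label for its token, and the sequence of dictionaries passed to a CRF trainer of choice. A wider or asymmetric window only requires changing the offset range.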