This paper describes our work on usage-pattern analysis and the development of a latent semantic analysis framework for interpreting multimodal user input consisting of speech and pen gestures. We designed and collected a multimodal corpus of navigational inquiries. Each modality carries semantics related to the domain-specific task goal, and each inquiry is manually annotated with a task goal based on these semantics. Multimodal input usually has a simpler syntactic structure than unimodal input, and the order of semantic constituents differs between the two. We therefore propose to derive the latent semantics of multimodal inputs using latent semantic modeling (LSM). To achieve this, we parse the recognized Chinese spoken input for spoken locative references (SLRs), which are then aligned with their corresponding pen gesture(s). We characterize each cross-modal integration pattern as a 3-tuple multimodal term consisting of the SLR, the pen gesture type, and their temporal relation. The resulting inquiry-by-multimodal-term matrix is decomposed using singular value decomposition (SVD) to derive the latent semantics automatically. Task goal inference based on these latent semantics achieves 99% accuracy on a disjoint test set.
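The pipeline the abstract describes — counting 3-tuple multimodal terms per inquiry, decomposing the inquiry-by-term matrix with SVD, and inferring a task goal in the latent space — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the matrix values, the task-goal labels, and the nearest-neighbour inference rule are all illustrative assumptions.

```python
import numpy as np

# Hypothetical inquiry-by-multimodal-term count matrix. Rows are inquiries;
# columns are 3-tuple multimodal terms (SLR, pen gesture type, temporal
# relation). Counts and labels below are made up for illustration.
X = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 1],
    [0, 0, 2, 1],
    [0, 1, 1, 2],
], dtype=float)
goals = ["route", "route", "poi", "poi"]  # task-goal label per training inquiry

# Truncated SVD: X ≈ U_k @ diag(s_k) @ Vt_k, keeping k latent dimensions.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]

def to_latent(q):
    """Fold a new inquiry's term-count vector into the latent space."""
    return np.asarray(q, dtype=float) @ Vt_k.T / s_k

# Training inquiries already live in the latent space as the rows of U_k.
train_latent = U_k

def infer_goal(q):
    """Infer a task goal by cosine similarity to training inquiries."""
    q_hat = to_latent(q)
    sims = train_latent @ q_hat / (
        np.linalg.norm(train_latent, axis=1) * np.linalg.norm(q_hat) + 1e-12)
    return goals[int(np.argmax(sims))]

print(infer_goal([2, 1, 0, 0]))  # matches the first "route" inquiry -> "route"
```

Folding an inquiry into the latent space before comparison is the standard LSA trick: inquiries that share no surface terms can still land close together if their terms co-occur across the corpus.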