Cognitive canonicalization of natural language queries using semantic strata

Authors:
Suman Deb Roy;Wenjun Zeng
Affiliations:
University of Missouri;University of Missouri
Venue:
ACM Transactions on Speech and Language Processing (TSLP)
Year:
2014

Abstract

Natural language search relies strongly on perceiving the semantics of a query sentence. Semantics is captured by the relationships among the query words, represented as a network (graph). Such a network of words can be fed into larger ontologies, like DBpedia or Google Knowledge Graph, where it appears as a subgraph, hence the name subnetwork (subnet). A subnet is thus a canonical form for interfacing a natural language query to a graph database and is an integral step for graph-based searching. In this article, we present a novel standalone NLP technique that leverages the cognitive psychology notion of semantic strata for semantic subnetwork extraction from natural language queries. The cognitive model describes some of the fundamental structures, called semantic strata, that human cognition employs to construct semantic information in the brain. We propose a computational model based on conditional random fields to capture the cognitive abstraction provided by semantic strata, facilitating cognitive canonicalization of the query. Our experiments, conducted on approximately 5000 queries, suggest that the cognitive canonicals based on semantic strata are capable of significantly improving parsing and role labeling performance beyond pure lexical approaches, such as part-of-speech-based techniques. We also find that cognitively canonicalized subnets are more semantically coherent than syntax trees when explored in graph ontologies like DBpedia, and that they improve the ranking of retrieved documents.
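To make the idea of a subnet appearing as a subgraph of a larger ontology concrete, the following is a minimal sketch and not the method of the article. It assumes the query has already been canonicalized into a small set of labeled edges, and it matches those edges against a toy graph. The triples, the relation names, the `?film` variable convention, and the helper `match_subnet` are all hypothetical and chosen only for illustration; they are not taken from the article or from DBpedia.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical edge format: (subject, relation, object).
Triple = Tuple[str, str, str]


def _unify(term: str, value: str, binding: Dict[str, str]) -> bool:
    """Match one subnet term against one graph value.

    Terms starting with '?' are variables (an assumed convention);
    every other term must equal the graph value exactly.
    """
    if term.startswith("?"):
        if term in binding:
            return binding[term] == value
        binding[term] = value
        return True
    return term == value


def match_subnet(subnet: List[Triple], graph: Set[Triple]) -> List[Dict[str, str]]:
    """Return every variable binding under which all subnet edges occur in graph."""
    bindings: List[Dict[str, str]] = [{}]
    for pattern in subnet:
        extended: List[Dict[str, str]] = []
        for binding in bindings:
            for fact in graph:
                candidate = dict(binding)  # copy so a failed match leaves no trace
                if all(_unify(p, f, candidate) for p, f in zip(pattern, fact)):
                    extended.append(candidate)
        bindings = extended
    return bindings


# Toy stand-in for a graph ontology (invented data).
toy_graph: Set[Triple] = {
    ("Inception", "type", "Film"),
    ("Interstellar", "type", "Film"),
    ("Titanic", "type", "Film"),
    ("Inception", "directedBy", "Christopher_Nolan"),
    ("Interstellar", "directedBy", "Christopher_Nolan"),
    ("Titanic", "directedBy", "James_Cameron"),
}

# Hypothetical subnet for a query such as "films directed by Christopher Nolan".
query_subnet: List[Triple] = [
    ("?film", "type", "Film"),
    ("?film", "directedBy", "Christopher_Nolan"),
]

films = sorted(b["?film"] for b in match_subnet(query_subnet, toy_graph))
print(films)
```

The sketch covers only the interfacing step, that is, locating a subnet inside a larger graph. Producing the subnet from raw query text, which the article approaches with conditional random fields over semantic strata, is outside its scope.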