Machine learning of syntactic parse trees for search and classification of text

Authors:
Boris Galitsky
Affiliations:
eBay Inc., 2145 Hamilton Avenue, San Jose, CA, USA
Venue:
Engineering Applications of Artificial Intelligence
Year:
2013

Abstract

We build an open-source toolkit that implements deterministic learning to support search and text classification tasks. We extend the mechanism of logical generalization to syntactic parse trees and attempt to detect weak semantic signals in them. The generalization of syntactic parse trees, used as a syntactic similarity measure, is defined as the set of maximum common sub-trees and is performed at the level of paragraphs, sentences, phrases, and individual words. We analyze the semantic features of this similarity measure and compare them with the semantics of traditional anti-unification of terms. Nearest-neighbor machine learning is then applied to relate a sentence to a semantic class. By using a syntactic parse tree-based similarity measure instead of bag-of-words and keyword-frequency approaches, we expect to detect a weak semantic signal that would otherwise be unobservable. The proposed approach is evaluated in four distinct domains where a lack of semantic information makes the classification of sentences rather difficult. We describe a toolkit that is part of the Apache Software Foundation project OpenNLP, designed to aid search engineers in tasks requiring text relevance assessment.
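
To make the idea of generalization as a similarity measure concrete, the following is a minimal sketch at the word and phrase level only. It is not the toolkit's implementation and does not use the OpenNLP API. It assumes phrases arrive already tagged as (part-of-speech, lemma) pairs, approximates the maximum common sub-structure of two phrases by an order-preserving alignment, and uses hypothetical weights (2 for a shared lemma, 1 for a shared part of speech only). The function names and the class labels in the usage note are illustrative assumptions.

```python
from typing import List, Optional, Sequence, Tuple

# A token is a (part-of-speech tag, lemma) pair. This format is an assumption
# of the sketch; a real pipeline would obtain it from a tagger or parser.
Token = Tuple[str, str]
WILDCARD = "*"


def generalize_word(a: Token, b: Token) -> Optional[Token]:
    """Least general description covering both tokens.

    Tokens with different POS tags have no common generalization (None).
    Same POS but different lemmas generalize to (POS, "*").
    """
    if a[0] != b[0]:
        return None
    return (a[0], a[1] if a[1] == b[1] else WILDCARD)


def score(tok: Token) -> int:
    """Hypothetical weights: a preserved lemma counts more than a bare POS."""
    return 2 if tok[1] != WILDCARD else 1


def generalize_phrase(p: Sequence[Token], q: Sequence[Token]) -> List[Token]:
    """Highest-scoring order-preserving common generalization of two phrases.

    Dynamic programming in the style of longest common subsequence, where a
    "match" is any pair of tokens that has a non-empty generalization.
    """
    n, m = len(p), len(q)
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            best[i][j] = max(best[i + 1][j], best[i][j + 1])
            g = generalize_word(p[i], q[j])
            if g is not None:
                best[i][j] = max(best[i][j], score(g) + best[i + 1][j + 1])

    # Trace back through the table to recover one optimal generalization.
    out: List[Token] = []
    i = j = 0
    while i < n and j < m:
        g = generalize_word(p[i], q[j])
        if g is not None and best[i][j] == score(g) + best[i + 1][j + 1]:
            out.append(g)
            i, j = i + 1, j + 1
        elif best[i][j] == best[i + 1][j]:
            i += 1
        else:
            j += 1
    return out


def similarity(p: Sequence[Token], q: Sequence[Token]) -> int:
    """Similarity of two phrases = total score of their generalization."""
    return sum(score(t) for t in generalize_phrase(p, q))


def classify(query: Sequence[Token],
             labeled: Sequence[Tuple[Sequence[Token], str]]) -> str:
    """1-nearest-neighbor: return the label of the most similar example."""
    return max(labeled, key=lambda ex: similarity(query, ex[0]))[1]
```

For example, generalizing "camera with digital zoom" against "camera for digital photography" under this sketch keeps the shared lemmas and replaces the differing ones with wildcards, giving `[("NN", "camera"), ("IN", "*"), ("JJ", "digital"), ("NN", "*")]`. A bag-of-words comparison would see only the two shared words, whereas the generalization also records that both phrases share the noun, preposition, adjective, noun pattern. Full parse-tree generalization, as described in the abstract, would compare sub-trees rather than flat token sequences.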