Semantic relation extraction from legislative text using generalized syntactic dependencies and support vector machines

Authors:
Guido Boella;Luigi Di Caro;Livio Robaldo
Affiliations:
Department of Computer Science, University of Turin, Turin, Italy;Department of Computer Science, University of Turin, Turin, Italy;Department of Computer Science, University of Turin, Turin, Italy
Venue:
RuleML'13 Proceedings of the 7th international conference on Theory, Practice, and Applications of Rules on the Web
Year:
2013



Abstract

In this paper we present a technique for automatically extracting semantic knowledge from legislative text. Instead of pattern matching over lexico-syntactic patterns, we propose a technique that uses syntactic dependencies between terms, extracted with a syntactic parser. The idea is that syntactic information is more robust than pattern matching when sentences are long and complex. Relying on a manually annotated legislative corpus, we transform the syntax surrounding each piece of semantic information into an abstract textual representation; these representations are then used to train a classification model with a standard Support Vector Machine system. In this work we initially focus on three semantic tags and achieve very high accuracy on two of them, which demonstrates both the validity and the limits of the approach.
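
The generalization step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dependency triples, the relation labels, the `TARGET` placeholder, and the function names `generalize` and `to_bag` are all assumptions made for illustration. In practice the triples would come from a syntactic parser, and the resulting feature strings would be vectorized and passed to an SVM classifier.

```python
from collections import Counter

# Hypothetical dependency triples (relation, head, dependent) for the sentence
# "The supplier must notify the authority." A real pipeline would obtain
# these from a syntactic parser; they are hard-coded here for illustration.
TRIPLES = [
    ("det", "supplier", "The"),
    ("nsubj", "notify", "supplier"),
    ("aux", "notify", "must"),
    ("dobj", "notify", "authority"),
    ("det", "authority", "the"),
]


def generalize(triples, target):
    """Turn the dependencies that touch `target` into abstract feature strings.

    The target term is replaced by the placeholder TARGET (an assumed
    convention), so that the same feature can be shared by different terms.
    All other words are lowercased and kept as they are.
    """
    features = []
    for relation, head, dependent in triples:
        if target not in (head, dependent):
            continue  # this dependency is not part of the term's context
        head_tok = "TARGET" if head == target else head.lower()
        dep_tok = "TARGET" if dependent == target else dependent.lower()
        features.append(f"{relation}({head_tok},{dep_tok})")
    return features


def to_bag(features):
    """Count feature strings, giving a sparse bag-of-features representation."""
    return Counter(features)


supplier_features = generalize(TRIPLES, "supplier")
authority_features = generalize(TRIPLES, "authority")

print(supplier_features)
print(authority_features)
```

In this sketch the two terms share the feature `det(TARGET,the)` but differ in how they attach to the verb (`nsubj` versus `dobj`), which is the kind of syntactic signal a classifier could use to tell semantic tags apart.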