To avoid ambiguity and to ensure, as far as possible, a strict interpretation of the law, legal texts usually define the specific lexical terms used in their discourse by means of normative rules. With an often large number of rules in effect in a given domain, extracting these definitions manually would be a costly undertaking. This paper presents an approach to this problem based on a variation of an automated natural language processing technique for Brazilian Portuguese texts. For the sake of generality, the proposed solution was developed to address the broader problem of building a glossary from domain-specific texts that contain definitions. The solution was applied to a corpus of texts from the telecommunications regulation domain, and the results are reported. The usual natural language processing pipeline was followed: preprocessing, segmentation, and part-of-speech tagging. A set of feature extraction functions is specified and used, together with reference glossary information on whether or not a text fragment is a definition, to train an SVM classifier. Finally, the definitions are extracted from the texts and evaluated against a test corpus, which also contains the reference glossary annotations on definitions. The results are then discussed in light of other definition extraction techniques.
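The segmentation and feature extraction steps described above can be illustrated with a minimal sketch. It is not the paper's implementation: the cue phrases, the four features, and the function names `segment`, `extract_features`, and `to_libsvm_line` are all hypothetical, and part-of-speech tagging is left out for brevity. The sketch only shows how sentences labeled as definitions (+1) or non-definitions (-1) could be turned into the sparse `label index:value` text format that SVM libraries such as LIBSVM read as training data.

```python
import re

# Hypothetical cue phrases that may introduce a definition in Portuguese
# regulatory text; the paper's actual feature set is not given here.
DEFINITION_CUES = ("é", "significa", "considera-se", "entende-se por", "define-se")


def segment(text):
    """Naive sentence segmentation: split on whitespace that follows '.' or ';'."""
    parts = re.split(r"(?<=[.;])\s+", text.strip())
    return [p for p in parts if p]


def extract_features(sentence):
    """Map a sentence to a fixed-length list of numeric features (all assumed)."""
    lowered = sentence.lower()
    tokens = re.findall(r"\w+(?:-\w+)*", lowered)
    # A cue counts only when it is not embedded inside a longer word.
    has_cue = any(
        re.search(r"(?<!\w)" + re.escape(cue) + r"(?!\w)", lowered)
        for cue in DEFINITION_CUES
    )
    has_colon_or_dash = ":" in sentence or " - " in sentence
    starts_capitalized = sentence[:1].isupper()
    return [
        float(has_cue),
        float(has_colon_or_dash),
        float(starts_capitalized),
        float(len(tokens)),
    ]


def to_libsvm_line(label, features):
    """Serialize one example as '<label> <index>:<value> ...' (1-based, zeros omitted)."""
    pairs = [f"{i}:{v:g}" for i, v in enumerate(features, start=1) if v != 0.0]
    return " ".join([f"{label:+d}"] + pairs)


if __name__ == "__main__":
    text = "Estação é o conjunto de equipamentos. A taxa será paga anualmente."
    labels = [1, -1]  # reference glossary annotation: definition or not
    for sentence, label in zip(segment(text), labels):
        print(to_libsvm_line(label, extract_features(sentence)))
```

In a fuller pipeline, the part-of-speech tags of each sentence would presumably contribute further features, and the resulting lines would be split into training and test sets before fitting the classifier.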