Building a discourse parser for informal mathematical discourse in the context of a controlled natural language

Authors:
Raúl Ernesto Gutiérrez de Piñerez Reyes; Juan Francisco Díaz Frias
Affiliations:
EISC, Universidad del Valle, Colombia; EISC, Universidad del Valle, Colombia
Venue:
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Year:
2013

Abstract

The lack of specific data sets makes discourse parsing for Informal Mathematical Discourse (IMD) difficult. In this paper, we propose a data-driven approach to identify arguments and connectives in an IMD structure within the context of Controlled Natural Language (CNL). Our approach performs low-level discourse parsing under the Penn Discourse TreeBank (PDTB) guidelines. Three classifiers have been trained: one that identifies Arg2, another that locates the relative position of Arg1, and a third that identifies the arguments (Arg1 and Arg2) of each connective. These classifiers are instances of Support Vector Machines (SVMs), trained on our own Mathematical TreeBank. Finally, our approach defines an end-to-end discourse parser for IMD, whose results will be used to classify informal deductive proofs via low-level discourse processing in IMD.
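
To make the three-stage structure described in the abstract concrete, the following is a minimal sketch of the control flow only. It is not the authors' implementation: each stage is a hand-written stand-in rule where the paper uses a trained SVM, and the connective list, the "SS"/"PS" position labels, the function names, and the example text are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical connective inventory; the paper's actual list is not given here.
CONNECTIVES = ("because", "therefore", "hence", "then")


@dataclass
class Relation:
    connective: str
    arg1: str
    arg2: str
    arg1_position: str  # assumed labels: "SS" = same sentence, "PS" = previous sentence


def find_connective(tokens: List[str]) -> Optional[int]:
    """Return the index of the first known connective in the sentence, or None."""
    for i, tok in enumerate(tokens):
        if tok.lower() in CONNECTIVES:
            return i
    return None


def identify_arg2(tokens: List[str], conn_idx: int) -> str:
    """Stand-in for the first classifier: take the text after the connective as Arg2."""
    return " ".join(tokens[conn_idx + 1:])


def locate_arg1(conn_idx: int) -> str:
    """Stand-in for the second classifier: a sentence-initial connective
    is assumed to have its Arg1 in the previous sentence."""
    return "PS" if conn_idx == 0 else "SS"


def identify_arg1(sentences: List[List[str]], sent_idx: int,
                  conn_idx: int, position: str) -> str:
    """Stand-in for the third classifier: pick the Arg1 span given its position."""
    if position == "SS":
        return " ".join(sentences[sent_idx][:conn_idx])
    if sent_idx == 0:
        return ""  # no previous sentence available
    return " ".join(sentences[sent_idx - 1])


def parse(text: str) -> List[Relation]:
    """Run the three stages over every sentence that contains a connective."""
    # Naive sentence split on "."; real mathematical text needs a proper tokenizer.
    sentences = [s.split() for s in text.split(".") if s.strip()]
    relations = []
    for i, tokens in enumerate(sentences):
        conn_idx = find_connective(tokens)
        if conn_idx is None:
            continue
        arg2 = identify_arg2(tokens, conn_idx)
        position = locate_arg1(conn_idx)
        arg1 = identify_arg1(sentences, i, conn_idx, position)
        relations.append(Relation(tokens[conn_idx], arg1, arg2, position))
    return relations


if __name__ == "__main__":
    example = "Let n be even. Then n squared is even. n is even because n equals 2k."
    for rel in parse(example):
        print(rel)
```

In a data-driven version along the lines the abstract describes, each stand-in function would be replaced by a classifier trained on annotated treebank data, while the order of the stages in `parse` could stay the same.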