How many dots are really needed for head-driven chart parsing?

Authors:
Pavel Smrž;Vladimír Kadlec
Affiliations:
Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic;Faculty of Informatics, Masaryk University, Brno, Czech Republic
Venue:
SOFSEM'06 Proceedings of the 32nd conference on Current Trends in Theory and Practice of Computer Science
Year:
2006

Abstract

This paper presents an improved form of head-driven chart parser suited to large context-free grammars. The basic method, HDddm (Head-Driven dependent dot move), is introduced first. Two variants then improve on it; both rest on the same idea of reducing the number of chart edges by modifying the form of items (dotted rules). The first variant “unifies” items that share the analyzed part of the relevant rule, so only one dot is needed to mark the position before and after the covered part. The second applies the inverse strategy: it “eliminates” the parts that have not been covered yet, so no dot is needed. All the discussed alternatives are described in the form of parsing schemata. We also briefly mention a technique, based on a special trie-like data structure originally developed for Scrabble, that minimizes the extra information needed by the algorithms. We demonstrate the advantages of the described methods by a significant decrease in the number of chart edges. Results are given for the standard set of testing grammars (and their respective inputs) as well as for a large and highly ambiguous Czech grammar.
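
The edge-reduction idea can be illustrated with a small sketch. It is not the paper's HDddm schema or either of its variants: the names `Rule`, `TwoDotItem`, and `merged_key`, the toy grammar, and the input span are all assumptions made for illustration. It only shows why keying items on the covered part of a rule, rather than on the full dotted rule, can collapse several chart edges into one. The extra bookkeeping a real parser would need (for which the abstract mentions a trie-like structure) is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A context-free rule with a marked head (hypothetical representation)."""
    lhs: str
    rhs: tuple
    head: int  # index of the head symbol within rhs


@dataclass(frozen=True)
class TwoDotItem:
    """A head-driven item: two dots delimit the covered part rhs[left:right]."""
    rule: Rule
    left: int    # position of the left dot in rhs
    right: int   # position of the right dot in rhs
    start: int   # start of the covered input span
    end: int     # end of the covered input span


def covered(item):
    """Return the right-hand-side symbols lying between the two dots."""
    return item.rule.rhs[item.left:item.right]


def merged_key(item):
    """Assumed merged form: only the covered symbols and the input span."""
    return (covered(item), item.start, item.end)


# Toy grammar: three VP rules that all contain the sequence "V NP".
rules = [
    Rule("VP", ("ADV", "V", "NP"), head=1),
    Rule("VP", ("V", "NP", "PP"), head=0),
    Rule("VP", ("V", "NP"), head=0),
]

# Assume each rule has recognized "V NP" over the same input span 2..5.
items = []
for rule in rules:
    i = rule.rhs.index("V")
    items.append(TwoDotItem(rule, i, i + 2, 2, 5))

two_dot_count = len(set(items))                       # one edge per rule
merged_count = len({merged_key(it) for it in items})  # edges after merging

print(two_dot_count, merged_count)
```

In this toy case the three rule-specific items share the covered part `("V", "NP")` and the same span, so they fall under a single merged key. How much a real chart shrinks depends on the grammar and the input.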