Some Remarks on Vector Representations of Legal Documents

Authors:
E. Schweighofer; A. Rauber; D. Merkl
Venue:
DEXA '00 Proceedings of the 11th International Workshop on Database and Expert Systems Applications
Year:
2000

Cited by:
Automatic text representation, classification and labeling in European law (Proceedings of the 8th international conference on Artificial intelligence and law)
Bilingual legal document retrieval and management using XML (Software: Practice & Experience)

Abstract

Vector representation of legal documents remains the best basis for computing classification clusters and labelling their contents. This paper addresses the diversity of legal documents, which makes vector representation a difficult task. Extensive experiments with three text corpora of about 580 documents in three languages have shown that binary or weighted vector representation may not be sufficient. Even quite successful approaches to similarity computation have problems identifying the best context for classification. The LabelSOM method can be seen as a very efficient tool for verifying similarity because the common elements are explicitly identified. Finally, some proposals for the "best" vector representation are discussed: weighted vectors, feature vectors, and hierarchies of vectors that use XML information to identify similar contexts.
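
To make the contrast between binary and weighted vector representations concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not name a weighting scheme, so plain tf-idf is assumed here; the three toy documents, the whitespace tokenizer, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus; not taken from the paper's experiments.
docs = {
    "treaty": "the member states shall ratify the treaty",
    "ruling": "the court held that the treaty applies",
    "notice": "the office publishes a notice",
}

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

tokens = {name: tokenize(text) for name, text in docs.items()}
vocab = sorted({word for words in tokens.values() for word in words})

def binary_vector(words):
    # 1.0 if the term occurs in the document, else 0.0.
    present = set(words)
    return [1.0 if term in present else 0.0 for term in vocab]

def idf(term):
    # Inverse document frequency; a term found in every document gets 0.
    df = sum(1 for words in tokens.values() if term in words)
    return math.log(len(tokens) / df)

def tfidf_vector(words):
    # Assumed weighting: raw term frequency times idf.
    counts = Counter(words)
    return [counts[term] * idf(term) for term in vocab]

def cosine(u, v):
    # Cosine similarity; defined as 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

binary = {name: binary_vector(words) for name, words in tokens.items()}
weighted = {name: tfidf_vector(words) for name, words in tokens.items()}

for a, b in [("treaty", "ruling"), ("treaty", "notice")]:
    print(a, b,
          round(cosine(binary[a], binary[b]), 3),
          round(cosine(weighted[a], weighted[b]), 3))
```

In this toy setting the binary vectors report some similarity between "treaty" and "notice" only because both contain the word "the", while the tf-idf weighting gives that shared term zero weight. Neither representation captures the context in which a term is used, which is the limitation the abstract points to.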