Syntactic Extraction Approach to Processing Local Document Collections

Authors:
Jolanta Mizera-Pietraszko
Affiliations:
Department of Information Systems, Institute of Informatics, Wroclaw University of Technology, Wroclaw, Poland 50-370
Venue:
FQAS '09 Proceedings of the 8th International Conference on Flexible Query Answering Systems
Year:
2009

Abstract

Techniques for processing databases, such as free-text searching or proximity search, are among the key factors that influence the efficiency of query answering. Since most users prefer to query systems in natural language, formulating a correct answer from the content of electronic documents remains a real challenge. Processing queries in a multilingual environment usually impedes system responsiveness even further. This paper proposes an approach to overcoming these obstacles by implementing syntactic information extraction. Evaluation methodologies commonly used by TREC, NTCIR, SIGIR, and others are studied in order to suggest that system performance is determined not only by the system architecture itself, the translation model, or the document format, but also by other factors. The shallow syntactic information extraction technique used appears to be a robust feature of the system described. In this light, it is possible to achieve comparable results when processing monolingual and cross-lingual collections.