Bayan: an Arabic text database management system

Authors:
Roger King; Ali Morfeq
Affiliations:
Department of Computer Science, University of Colorado, Boulder, Colorado; Department of Computer Science, University of Colorado, Boulder, Colorado
Venue:
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Year:
1990

Abstract

Most existing database systems lack features for the convenient manipulation of text, and they are even harder to use when the text is in a language that is not based on the Roman alphabet. Arabic is a very good example of this case. Many projects have attempted to use conventional database systems for Arabic data manipulation (including text data), but because Arabic differs from English in many ways, these projects have met with limited success. The Bayan project takes a different approach. Instead of simply trying to adapt an existing environment to Arabic, it takes the properties of the Arabic language as the starting point and designs everything to meet the needs of Arabic, thus avoiding the shortcomings of other projects. Bayan is a text database management system designed to overcome the shortcomings of conventional database management systems in manipulating text data. Its data model is based on an object-oriented approach, which makes the system easier to extend for future use. The database was designed with the properties of Arabic text in mind: it supports the way Arabic words are derived, classified, and constructed. Furthermore, linguistic algorithms for word generation and morphological decomposition of words were designed, leading to a formalization of the rules of Arabic writing and sentence construction. A user interface was designed on top of this environment: a new representation of the Arabic characters was designed, a complete Arabic keyboard layout was created, and a window-based Arabic user interface was built.
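
The abstract gives no details of the word generation and morphological decomposition algorithms, so the following is only a minimal sketch of the general root-and-pattern idea behind Arabic word derivation, not Bayan's actual method. It assumes unvocalized text, sound three-consonant roots, and a tiny hand-picked pattern table; the names `PATTERNS`, `generate`, and `decompose` are hypothetical and do not come from the paper.

```python
# Illustrative sketch only: root-and-pattern derivation for Arabic words.
# Assumptions (not from the paper): no diacritics, triliteral sound roots,
# and a small hypothetical pattern table.

# Traditional placeholders: these three letters stand for the first,
# second, and third consonant of the root.
PLACEHOLDERS = "فعل"

# Hypothetical pattern table: every other letter in a pattern is a literal affix.
PATTERNS = {
    "active_participle": "فاعل",
    "passive_participle": "مفعول",
    "place_noun": "مفعل",
}


def generate(root, pattern):
    """Fill the placeholder slots of a pattern with the root consonants."""
    if len(root) != 3:
        raise ValueError("this sketch handles three-consonant roots only")
    return "".join(
        root[PLACEHOLDERS.index(ch)] if ch in PLACEHOLDERS else ch
        for ch in pattern
    )


def decompose(word, patterns=PATTERNS):
    """Return every (root, pattern_name) pair whose pattern matches the word."""
    candidates = []
    for name, pattern in patterns.items():
        if len(word) != len(pattern):
            continue
        slots = {}
        matched = True
        for w, p in zip(word, pattern):
            if p in PLACEHOLDERS:
                slots[p] = w          # letter in a root slot
            elif w != p:
                matched = False       # literal affix letter does not match
                break
        if matched and len(slots) == 3:
            root = "".join(slots[c] for c in PLACEHOLDERS)
            candidates.append((root, name))
    return candidates


if __name__ == "__main__":
    root = "كتب"  # the consonants k-t-b, associated with writing
    for name, pattern in PATTERNS.items():
        print(name, generate(root, pattern))
    print(decompose(generate(root, PATTERNS["passive_participle"])))
```

A realistic system would also have to handle short-vowel marks, weak and doubled roots, prefixes and suffixes, and ambiguous analyses, none of which this sketch attempts.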