Bayan: a text database management system which supports a full representation of the Arabic language
Data Engineering - Non-English interfaces to databases
Most existing databases lack features for the convenient manipulation of text, and they are harder still to use when the text is in a language that is not written in the Roman alphabet. Arabic is a very good example. Many projects have attempted to use conventional database systems to manipulate Arabic data, including text, but because Arabic differs from English in many ways, these projects have met with limited success. The Bayan project took a different approach. Instead of adapting an existing environment to Arabic, it took the properties of the Arabic language as the starting point and designed everything to meet the needs of Arabic, thus avoiding the shortcomings of other projects. Bayan is a text database management system designed to overcome the limitations of conventional database management systems in manipulating text data. Its data model is object-oriented, which makes the system easier to extend for future use. We designed the database with the properties of Arabic text in mind, so that it supports the way Arabic words are derived, classified, and constructed. We also designed linguistic algorithms for word generation and for the morphological decomposition of words, which led to a formalization of the rules of Arabic writing and sentence construction. A user interface was built on top of this environment: a new representation of Arabic characters, a complete Arabic keyboard layout, and a window-based Arabic user interface.
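To make the idea of word generation and morphological decomposition concrete, the following is a minimal sketch of root-and-pattern derivation, the general mechanism by which many Arabic words are formed from a three-consonant root. It is not Bayan's algorithm, which the abstract does not describe. The function names `generate` and `decompose`, the digit-slot pattern notation, and the simplified Latin transliteration (no long vowels or diacritics) are all assumptions made for illustration.

```python
from typing import Optional, Tuple

Root = Tuple[str, str, str]

# Hypothetical pattern notation: digits 1, 2, 3 mark the slots for the
# three root consonants; every other character is a fixed pattern letter.
PATTERNS = {
    "active participle": "1a2i3",    # e.g. k-t-b -> katib (writer)
    "passive participle": "ma12u3",  # e.g. k-t-b -> maktub (written)
    "noun of place": "ma12a3",       # e.g. k-t-b -> maktab (office)
}


def generate(root: Root, pattern: str) -> str:
    """Fill the numbered slots of a pattern with the root consonants."""
    if len(root) != 3:
        raise ValueError("this sketch handles three-consonant roots only")
    return "".join(
        root[int(ch) - 1] if ch in ("1", "2", "3") else ch
        for ch in pattern
    )


def decompose(word: str, pattern: str) -> Optional[Root]:
    """Recover the root if the word fits the pattern, else return None."""
    if len(word) != len(pattern):
        return None
    slots = {}
    for w, p in zip(word, pattern):
        if p in ("1", "2", "3"):
            slots[p] = w      # a root consonant fills this slot
        elif w != p:
            return None       # a fixed pattern letter does not match
    if set(slots) != {"1", "2", "3"}:
        return None           # the pattern did not define all three slots
    return (slots["1"], slots["2"], slots["3"])


if __name__ == "__main__":
    ktb: Root = ("k", "t", "b")
    for name, pattern in PATTERNS.items():
        print(name, generate(ktb, pattern))
    print(decompose("maktub", PATTERNS["passive participle"]))
```

A real system would work on Arabic script, handle weak and doubled root letters, and resolve words that fit more than one pattern; this sketch ignores all of that and only shows that generation and decomposition are inverse operations over the same pattern table.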