Scientific data have a dual structure. Raw data are predominantly ordered multi-dimensional arrays or sequences, while metadata and derived data are best represented as unordered relations. Scientific data processing requires complex operations over both arrays and relations. These operations cannot be expressed using only standard linear algebra operators (for arrays) or standard relational algebra operators (for relations). Existing scientific data processing systems are designed for a single data model and handle complex processing at the application level. EXTASCID is a complete and extensible system for scientific data processing. It supports both array and relational data natively. Complex processing is handled by a metaoperator that can execute any user code. As a result, EXTASCID can process full scientific workflows inside the system, with minimal data movement and application code. We illustrate the overall process on a real dataset and workflow from astronomy: starting with a set of sky images, the goal is to identify and classify transient astrophysical objects.
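To make the idea of a metaoperator that runs arbitrary user code more concrete, the following is a minimal sketch in Python. It is not EXTASCID's actual interface: the names `BrightPixelCount`, `run_metaoperator`, `accumulate`, `merge`, and `result`, as well as the accumulate-then-merge structure, are assumptions chosen for illustration. The user code counts pixels above a brightness threshold in array chunks, a stand-in for a first step in finding candidate objects in sky images.

```python
class BrightPixelCount:
    """Hypothetical user code: count pixels above a brightness threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0

    def accumulate(self, chunk):
        # chunk is a 2-D list of pixel values (one array chunk)
        self.count += sum(1 for row in chunk for v in row if v > self.threshold)

    def merge(self, other):
        # combine a partial state computed on another partition
        self.count += other.count

    def result(self):
        return self.count


def run_metaoperator(make_state, partitions):
    """Run user code on each partition independently, then merge partial states.

    This driver is an illustrative assumption, not the system's executor.
    """
    states = []
    for part in partitions:
        state = make_state()          # fresh user state per partition
        for chunk in part:
            state.accumulate(chunk)
        states.append(state)

    final = make_state()
    for s in states:
        final.merge(s)
    return final.result()


# Two partitions of small 2x2 chunks (made-up pixel values).
partitions = [
    [[[1, 9], [3, 8]]],
    [[[7, 2], [0, 6]], [[9, 9], [1, 1]]],
]
total = run_metaoperator(lambda: BrightPixelCount(5), partitions)
print(total)
```

Because each partition keeps its own state and states are only combined at the end, the same user code could in principle run over array chunks or over partitions of a relation without moving the data out of the system, which is the property the abstract emphasizes.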