Prototyping Bubba, A Highly Parallel Database System
IEEE Transactions on Knowledge and Data Engineering
The Gamma Database Machine Project
IEEE Transactions on Knowledge and Data Engineering
Fat-Btree: An Update-Conscious Parallel Directory Structure
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Postgres-R(SI): Combining Replica Control with Concurrency Control Based on Snapshot Isolation
ICDE '05 Proceedings of the 21st International Conference on Data Engineering
MapReduce: simplified data processing on large clusters
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Pig latin: a not-so-foreign language for data processing
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
SCOPE: easy and efficient parallel processing of massive data sets
Proceedings of the VLDB Endowment
A comparison of approaches to large-scale data analysis
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Small-file access in parallel file systems
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
MapReduce and parallel DBMSs: friends or foes?
Communications of the ACM - Amir Pnueli: Ahead of His Time
Hadoop high availability through metadata replication
Proceedings of the first international workshop on Cloud data management
Biodoop: Bioinformatics on Hadoop
ICPPW '09 Proceedings of the 2009 International Conference on Parallel Processing Workshops
Hive: a warehousing solution over a map-reduce framework
Proceedings of the VLDB Endowment
HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads
Proceedings of the VLDB Endowment
An optimized approach for storing and accessing small files on cloud storage
Journal of Network and Computer Applications
Hi-index | 0.00 |
Hadoop has been widely used in various clusters to build scalable and high performance distributed file systems. However, Hadoop distributed file system (HDFS) is designed for large file management. In case of small files applications, those metadata requests will flood the network and consume most of the memory in Namenode thus sharply hinders its performance. Therefore, many web applications do not benefit from clusters with centered metanode, like Hadoop. In this paper, we compare our Fat-Btree based data access method, which excludes center node in clusters, with Hadoop. We show their different performance in different file I/O applications.