Comparing Hadoop and Fat-Btree based access method for small file I/O applications

Authors:
Min Luo;Haruo Yokota
Affiliations:
Department of Computer Science, Tokyo Institute of Technology, Tokyo, Japan;Global Scientific Information and Computing Center, Tokyo Institute of Technology, Tokyo, Japan
Venue:
WAIM'10 Proceedings of the 11th international conference on Web-age information management
Year:
2010

Abstract

Hadoop has been widely used in clusters to build scalable, high-performance distributed file systems. However, the Hadoop Distributed File System (HDFS) is designed to manage large files. In small-file applications, metadata requests flood the network and consume most of the NameNode's memory, which sharply degrades its performance. As a result, many web applications do not benefit from clusters that rely on a centralized metadata node, as Hadoop does. In this paper, we compare Hadoop with our Fat-Btree based data access method, which requires no central node in the cluster, and report how the two perform across different file I/O applications.
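
The metadata pressure described in the abstract can be illustrated with a rough count of the namespace entries a central metadata node would have to track. The sketch below is not from the paper: the function name, the 128 MiB block size, and the simplification of one entry per file plus one per block are all assumptions chosen for illustration.

```python
def namespace_objects(total_bytes, file_size, block_size=128 * 1024**2):
    """Estimate namespace entries for a data set split into equal-sized files.

    Assumption (not from the paper): the metadata node keeps one entry per
    file and one entry per block. block_size is an illustrative default.
    """
    # Ceiling division with integers, so partial files and blocks are counted.
    num_files = (total_bytes + file_size - 1) // file_size
    blocks_per_file = max(1, (file_size + block_size - 1) // block_size)
    return num_files * (1 + blocks_per_file)


one_tib = 1024**4
small = namespace_objects(one_tib, 64 * 1024)   # 1 TiB stored as 64 KiB files
large = namespace_objects(one_tib, 1024**3)     # 1 TiB stored as 1 GiB files

print(small)  # 33554432
print(large)  # 9216
```

Under these assumptions, the same volume of data produces several thousand times more metadata entries when stored as small files, which is the kind of load a single metadata node must absorb.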