Research and implement of real-time data loading system IMIL

  • Authors:
  • Han WeiHong;Jia Yan;Yang ShuQiang

  • Affiliations:
  • Computer School, National University of Defense Technology, Changsha, China;Computer School, National University of Defense Technology, Changsha, China;Computer School, National University of Defense Technology, Changsha, China

  • Venue:
  • WISE'06 Proceedings of the 7th international conference on Web Information Systems
  • Year:
  • 2006

Abstract

With the rapid development of the Internet and communication technology, massive amounts of data have accumulated in many web-based applications, such as deep web applications and web search engines. Growing data volumes pose enormous challenges to data-loading techniques. This paper presents IMIL (Internet Monitoring Information Loader), a real-time data loading system used in RT-IMIS (Real-time Internet Monitoring Information System), which monitors real-time Internet traffic, manages network security, and collects massive amounts of real-time Internet information. IMIL consists of an extensible, fault-tolerant hardware architecture, an efficient bulk data loading algorithm based on SQL*Loader and the exchange partition mechanism, optimized parallelism, and guidelines for system tuning. Performance studies show the positive effects of these techniques: the loading speed of each cluster increases from 220 million records per day to 1.2 billion records per day, and the top loading speed reaches 6TB of data when 10 clusters run in parallel. This framework offers a promising approach for loading other large and complex databases.
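
The abstract names SQL*Loader and the exchange partition mechanism but does not show how IMIL combines them. The following is a minimal sketch of the general load-then-exchange pattern, not the authors' implementation. It only builds the command line and the SQL text and does not connect to a database. The file names, table names, and partition name are hypothetical, and credentials are deliberately omitted from the `sqlldr` call. The helper `records_per_second` converts the per-day figures quoted above into per-second rates, assuming the load is spread evenly over 24 hours.

```python
from typing import List

SECONDS_PER_DAY = 24 * 60 * 60


def records_per_second(records_per_day: int) -> float:
    """Average rate, assuming records arrive evenly over a full day."""
    return records_per_day / SECONDS_PER_DAY


def sqlldr_command(control_file: str, log_file: str) -> List[str]:
    """Build a SQL*Loader invocation that loads into a staging table.

    direct=true requests a direct path load. Credentials (userid=...)
    are omitted here on purpose; the file names are hypothetical.
    """
    return [
        "sqlldr",
        "control=" + control_file,
        "log=" + log_file,
        "direct=true",
    ]


def exchange_partition_sql(target: str, partition: str, staging: str) -> str:
    """Build the Oracle statement that swaps a loaded staging table
    into one partition of the target table.

    Table and partition names are hypothetical placeholders.
    """
    return (
        f"ALTER TABLE {target} EXCHANGE PARTITION {partition} "
        f"WITH TABLE {staging} INCLUDING INDEXES WITHOUT VALIDATION"
    )


if __name__ == "__main__":
    before = records_per_second(220_000_000)
    after = records_per_second(1_200_000_000)
    print(f"per-cluster rate before: {before:.0f} records/s")
    print(f"per-cluster rate after:  {after:.0f} records/s")
    print(f"speed-up: {after / before:.2f}x")
    print(" ".join(sqlldr_command("flow_stage.ctl", "flow_stage.log")))
    print(exchange_partition_sql("flow_log", "p_20060101", "flow_stage"))
```

Under the even-load assumption, the quoted per-cluster figures correspond to roughly 2,500 and 13,900 records per second, a speed-up of about 5.5 times. The appeal of the pattern is that the bulk load runs against a separate staging table, while the exchange step is a metadata operation on the partitioned target rather than a second copy of the data.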