A hierarchical data replication method in scientific data grid
Weizhong Lv1,2, Yuanchun Zhou1, Kaichao Wu1, Baoping Yan1
1Computer Network Information Center, Chinese Academy of Sciences, Beijing, China
2Graduate University of Chinese Academy of Sciences, Beijing, China
Email: lvweizhong@sdb.cnic.cn
The Scientific Data Grid (SDG) of the Chinese Academy of Sciences (CAS),
which is based on the scientific database hosted and supported by different
institutes, is a fundamental infrastructure for many data-intensive natural
science research projects. The scientific database is a major scientific
and technological information resource with a total data volume of hundreds of
terabytes, and it consists of geographically distributed, heterogeneous,
multidisciplinary data resources. In the grid and distributed computing
environment, data replication is an effective way to improve data
accessibility and access efficiency, because data-intensive applications
produce large numbers of datasets and demand both reliability and performance. A data replication
method called the Hierarchical Replication Model (HRM) is proposed in this paper.
HRM selects grid nodes as replica holders based on data access frequencies,
network topology, and link bandwidth, and it groups the network into a
three-level hierarchy to improve data accessibility. This paper presents the
framework of HRM and describes the simulation scheme in detail. Simulation
results show that the method speeds up data transfer and increases data access
efficiency, which verifies its effectiveness.