Monday, October 25, 2010

NFSP NETWORK FILE SYSTEM PARALLELIZED

INTRODUCTION:


A recent trend in parallel computing is the rise of the “poor man's” supercomputer: a cluster of PCs connected through a dedicated, sometimes high-performance, network. Clusters of this kind are now called Beowulf clusters. Over the years, much work has gone into taking full advantage of this architecture in fields such as scheduling, load balancing, remote login, programming environments and runtimes.

A consequence of the growing popularity of Beowulf clusters has been their growing size (in number of nodes). Yet the hard drives on these nodes are used only for the system and temporary files, which wastes a lot of space, as much as several terabytes on large clusters. Systems that could help reclaim this otherwise unused space are few and far between.

Much work has also been done on file systems. However, these solutions do not fully take into account the characteristics of a cluster dedicated to heavy computation. These characteristics are:
• high availability
• local network
• secure environment
• large disk space on every node and
• use of standard, as opposed to highly specialized, software.

It is not yet possible to use a subset of nodes as a distributed file server without introducing a new protocol, which often requires a kernel modification or, at worst, a complete reinstallation of all the clients.
The industrial storage solutions are mainly dedicated NFS servers that ship with several high-performance disks “merged” by means of hardware RAID. This solution works, but it is expensive and does not make use of the disk space available on the nodes of the cluster. It must also be connected to a host computer, which is often a performance bottleneck. Moreover, if the host computer crashes, the RAID becomes unavailable.
Therefore the new system, NFSP (Network File System Parallelized), aims to provide a solution that
• enables the use of the disk space of all, or a subset of, the nodes of a cluster,
• gives a unique and unified view of this disk space,
• offers performance good enough to saturate the bandwidth of the network,
• requires no modifications on the client side.

Because of the last point, the Network File System (NFS) protocol was chosen for the client side. The server is essentially split into two entities: one that stores the file system structure (also known as metadata) and another that stores the actual data. This approach combines the flexibility and wide availability of NFS with good performance.
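The split between the metadata entity and the data entity can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NFSP implementation: the class names MetaServer and StorageNode, the STRIPE_SIZE constant and the round-robin placement of stripes are all hypothetical and do not come from the report.

```python
from dataclasses import dataclass, field

STRIPE_SIZE = 4  # bytes per stripe; deliberately tiny, hypothetical value


@dataclass
class StorageNode:
    """Holds raw data blocks only (the 'real data' entity)."""
    blocks: dict = field(default_factory=dict)  # (file_id, stripe index) -> bytes

    def put(self, file_id, index, chunk):
        self.blocks[(file_id, index)] = chunk

    def get(self, file_id, index):
        return self.blocks[(file_id, index)]


@dataclass
class MetaServer:
    """Holds the file system structure only (the metadata entity)."""
    nodes: list
    files: dict = field(default_factory=dict)  # name -> (file_id, size in bytes)
    next_id: int = 0

    def node_for(self, index):
        # Round-robin placement of stripes: an assumption of this sketch.
        return self.nodes[index % len(self.nodes)]

    def write(self, name, data):
        file_id = self.next_id
        self.next_id += 1
        # Only the name, id and size are kept here; the bytes go to the nodes.
        self.files[name] = (file_id, len(data))
        for index, start in enumerate(range(0, len(data), STRIPE_SIZE)):
            chunk = data[start:start + STRIPE_SIZE]
            self.node_for(index).put(file_id, index, chunk)

    def read(self, name):
        file_id, size = self.files[name]
        stripes = (size + STRIPE_SIZE - 1) // STRIPE_SIZE  # ceiling division
        return b"".join(self.node_for(i).get(file_id, i) for i in range(stripes))


if __name__ == "__main__":
    nodes = [StorageNode() for _ in range(3)]
    meta = MetaServer(nodes)
    meta.write("a.txt", b"hello cluster")
    print(meta.read("a.txt"))
```

In this sketch the client only ever talks to the MetaServer object, which mirrors the idea of a single unified view of the disk space, while the bytes themselves are spread over the storage nodes.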

The rest of this report is organized as follows. Chapter two presents existing work on distributed file systems and discusses it with respect to a cluster environment. Chapter three describes the traditional NFS server. Chapter four explains the NFSP proposal. Chapter five covers some technical issues that have to be dealt with to implement NFSP. Chapter six gives the installation details along with the first results obtained when a prototype was implemented in a French institute. Chapter seven gives the conclusion, which also covers work that could further improve this concept.


For more information, visit:
http://www.enjineer.com/forum
