Elements of a Cluster

A Beowulf cluster comprises numerous components of both hardware and software. Unlike pure closed-box turnkey mainframes, servers, and workstations, the user or hosting organization has considerable choice in the system architecture of a cluster, whether it is to be assembled on site from parts or provided by a systems integrator or vendor. A Beowulf cluster system can be viewed as being made up of four major components, two hardware and two software. The two hardware components are the compute nodes that perform the work and the network that interconnects the node to form a single system. The two software components are the collection of tools used to develop user parallel application programs and the software environment for managing the parallel resources of the Beowulf cluster. The specification of a Beowulf cluster reflects user choices in each of these domains and determines the balance of cost, capacity, performance, and usability of the system.

The hardware node is the principal building block of the physical cluster system. After all, it is the hardware node that is being clustered. The node incorporates the resources that provide both the capability and capacity of the system. Each node has one or more microprocessors that provide the computing power of the node combined on the node's motherboard with the DRAM main memory and the I/O interfaces. In addition the node will usually include one or more hard disk drives for persistent storage and local data buffering although some clusters employ nodes that are diskless to reduce both cost and power consumption as well as increase reliability.

The network provides the means for exchanging data among the cluster nodes and coordinating their operation through global synchronization mechanisms. The subcomponents of the network are the network interface controllers (NIC), the network channels or links, and the network switches. Each node contains at least one NIC that performs a series of complex operations to move data between the external network links and the user memory, conducting one or more transformations on the data in the process. The channel links are usually passive, consisting of a single wire, multiple parallel cables, or optical fibers. The switches interconnect a number of channels and route messages between them. Networks may be characterized by their topology, their bisection and per channel bandwidth, and the latency for message transfer.

The software tools for developing applications depend on the underlying programming model to be used. Fortunately, within the Beowulf cluster community, there has been a convergence of a single dominant model: communicating sequential processes, more commonly referred to as message passing. The message-passing model implements concurrent tasks or processes on each node to do the work of the application. Messages are passed between these logical tasks to share data and to synchronize their operations. The tasks themselves are written in a common language such as Fortran or C++. A library of communicating services is called by these tasks to accomplish data transfers with tasks being performed on other nodes. While many different message-passing languages and implementation libraries have been developed over the past two decades, two have emerged as dominant: PVM and MPI (with multiple library implementations available for MPI).

The software environment for the management of resources gives system administrators the necessary tools for supervising the overall use of the machine and gives users the capability to schedule and share the resources to get their work done. Several schedulers are available and discussed in this book. For coarse-grained job stream scheduling, the popular Condor scheduler is available. PBS and the Maui scheduler handle task scheduling for interactive concurrent elements. For lightweight process management, the new Scyld Bproc scheduler will provide efficient operation. PBS also provides many of the mechanisms needed to handle user accounts. For managing parallel files, there is PVFS, the Parallel Virtual File System.

Was this article helpful?

0 0
The Ultimate Computer Repair Guide

The Ultimate Computer Repair Guide

Read how to maintain and repair any desktop and laptop computer. This Ebook has articles with photos and videos that show detailed step by step pc repair and maintenance procedures. There are many links to online videos that explain how you can build, maintain, speed up, clean, and repair your computer yourself. Put the money that you were going to pay the PC Tech in your own pocket.

Get My Free Ebook

Post a comment