
3. Building the System

As I have said, I've built two clusters so far. The NCSA cluster was a NetType I, ArchType I, SoftType I cluster with 8 nodes and 12 processors, which I built with the help of Kristopher Wuollett. The Illigal cluster is a NetType I, ArchType III, SoftType I/III cluster. Unfortunately, the NCSA reinstalled the cluster (with NT, ugh.. Is that place ever going down the tubes :() and deleted all our scripts after I left, so the best I can do is talk about what we did there. I think I like the Illigal setup better anyway. It's cheaper and feels cleaner (plus synchronization is automatic).

Some assumptions: I based both my clusters on Red Hat Linux 6.2. Red Hat takes a lot of flak from people, but out of all the distributions I've seen, they actually have the cleanest setup. In addition, you will find it very helpful to use a secondary package manager, such as encap or GNU Stow. These programs let you package third-party programs (such as scripts you write) into their own directory under /usr/local/encap, and then maintain the symbolic links into /usr/local. This really comes in handy when you do maintenance.
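
To make the symlink idea concrete, here is a minimal hand-rolled sketch of what such a package manager does. It is not encap or GNU Stow itself; the function name link_package and the package name used in the usage comment are made up for illustration, and the real tools also handle conflicts and removal, which this sketch does not.

```shell
#!/bin/sh
# link_package PREFIX PKG   (hypothetical helper, not part of encap or Stow)
# Symlink every file under PREFIX/encap/PKG into the matching place
# under PREFIX. PREFIX must be an absolute path so the links resolve.
link_package() {
    prefix=$1
    pkg=$2
    ( cd "$prefix/encap/$pkg" && find . -type f ) | while read -r f; do
        f=${f#./}                                  # strip the leading "./"
        mkdir -p "$prefix/$(dirname "$f")"         # create bin/, etc/, ...
        ln -sf "$prefix/encap/$pkg/$f" "$prefix/$f"
    done
}

# Usage (assumed package name):
#   link_package /usr/local cluster-scripts-1.0
```

Because the package's files stay in their own directory, removing or upgrading a package only means replacing that directory and refreshing the links.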

3.1 Building the NetType

This is pretty straightforward. The only method that needs any explaining is the NetType I. Read the IP MASQing HOWTO, and have a look at my scripts. Essentially, I determine the kernel version in ipmasq.init and run the appropriate scripts for either kernel 2.4 or 2.2.
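
The kernel check could look roughly like the sketch below. This is not the actual ipmasq.init; the function name masq_tool_for is made up for illustration. It relies only on the fact that 2.2 kernels do masquerading with ipchains and 2.4 kernels with iptables.

```shell
#!/bin/sh
# masq_tool_for RELEASE   (hypothetical helper)
# Map a kernel release string, as printed by "uname -r", to the
# firewalling tool that kernel series uses for masquerading.
masq_tool_for() {
    case "$1" in
        2.2.*) echo ipchains ;;
        2.4.*) echo iptables ;;
        *)     echo unknown; return 1 ;;   # not handled by this sketch
    esac
}

# Usage:
#   tool=$(masq_tool_for "$(uname -r)") || exit 1
```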

The ipmasq.init script is run only on the world node. The eth0 interface is configured as normal with the external IP, and the scripts then use IP aliasing to put the world node on both the external and internal (10.x.x.x) network segments. The subnodes are given IPs on the internal network. The method for doing this, however, depends on the ArchType.
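
As a rough sketch of what the world node setup amounts to, the function below prints (rather than runs) the commands involved. The function name world_node_cmds, the alias name eth0:0, and the 255.0.0.0 netmask are my assumptions for illustration and are not taken from the original scripts; check them against the IP MASQing HOWTO before using anything like this.

```shell
#!/bin/sh
# world_node_cmds INTERNAL_IP TOOL   (hypothetical helper)
# Print the commands that alias eth0 onto the internal network and turn
# on masquerading. TOOL is "iptables" (2.4 kernels) or "ipchains" (2.2).
world_node_cmds() {
    internal_ip=$1
    tool=$2
    case "$tool" in
        iptables)
            rules="iptables -t nat -A POSTROUTING -s 10.0.0.0/8 -o eth0 -j MASQUERADE" ;;
        ipchains)
            rules="ipchains -P forward DENY
ipchains -A forward -s 10.0.0.0/8 -j MASQ" ;;
        *)
            echo "unknown tool: $tool" >&2
            return 1 ;;
    esac
    # Second address on the same card (IP aliasing); netmask is assumed.
    echo "ifconfig eth0:0 $internal_ip netmask 255.0.0.0 up"
    # Let the kernel forward packets between the two segments.
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
    echo "$rules"
}

# Usage: review the output, then pipe it to sh as root.
#   world_node_cmds 10.0.0.1 iptables
```

Printing the commands first makes it easy to inspect them before touching a live network configuration.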

3.2 Building the ArchType

3.3 Building the SoftType

I have only set up SoftType I clusters. If you have any experience with MOSIX, Condor, MPI, or PVM, please mail the completed sections (in SGML if possible) to me.

3.4 Other Notes

In addition to the architectures described above, you may find it handy to install some method of password synchronization among all the nodes, such as Kerberos or NIS. I installed YP/NIS on the Illigal cluster. One stumbling block I came across was that I had to use 10.0.0.1 as the address of the domain server in the yp.conf files of the subnodes, since for some reason it wouldn't use /etc/hosts to resolve the node hostname.
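
For reference, the workaround comes down to one line in /etc/yp.conf on each subnode. The sketch below just prints that line; the function name yp_conf_line and the NIS domain name "cluster" in the usage comment are placeholders, so substitute your own domain.

```shell
#!/bin/sh
# yp_conf_line NISDOMAIN SERVER   (hypothetical helper)
# Print a yp.conf entry that binds NISDOMAIN to SERVER. Giving SERVER
# as an IP address sidesteps the hostname lookup problem.
yp_conf_line() {
    echo "domain $1 server $2"
}

# Usage ("cluster" is an assumed domain name):
#   yp_conf_line cluster 10.0.0.1 >> /etc/yp.conf
```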

Another method is to mount the root node's /etc directory on each node as well, and then simply symlink its passwd, shadow, and group files into /etc. This might be a bad idea though, because tools might check for symlinked versions of those files for security reasons.
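
If you want to try the symlink approach anyway, it might look like the sketch below. The function name link_auth_files and the mount point /mnt/root-etc in the usage comment are assumptions for illustration. Note that it replaces the local files, so back them up first.

```shell
#!/bin/sh
# link_auth_files SHARED_ETC LOCAL_ETC   (hypothetical helper)
# Replace passwd, shadow and group in LOCAL_ETC with symlinks to the
# copies in SHARED_ETC (the root node's /etc, mounted on this node).
link_auth_files() {
    shared=$1
    local_etc=$2
    for f in passwd shadow group; do
        # Refuse to link to a file that is not there (e.g. mount missing).
        [ -e "$shared/$f" ] || { echo "missing $shared/$f" >&2; return 1; }
        ln -sf "$shared/$f" "$local_etc/$f"
    done
}

# Usage (assumed mount point):
#   link_auth_files /mnt/root-etc /etc
```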

