Seeing as I've had a bit of experience with Beowulf clusters, I've decided to take it upon myself to write some documentation about the clusters I've designed, and about clusters in general. (Not to mention I'm getting paid to do so. :)
I've had the privilege of designing two Beowulf clusters: one for the NCSA Automated Learning Group and the other for the Illinois Genetic Algorithms Lab.
So what the hell is a Beowulf cluster, then? Well, a Beowulf cluster is just a group of loosely synchronized machines dedicated to a single purpose. So essentially, you can take a bunch of Linux machines, throw them all on a hub, and call it a Beowulf cluster, and you would be correct.
There are several sources of information on Beowulf on the web and even in print, so many that the typical user might be overwhelmed. Moreover, for various reasons, some documents seem to make Beowulf out to be more complicated or wordy than it actually is (printed media is usually the most guilty of this, as it has physical pages to fill). A nice place to start is probably the Beowulf HOWTO, or right here. I've tried to be as factual and concise as possible, but keep in mind that this document is currently geared toward the use and maintenance of an actual cluster I have built, rather than toward cluster design. I will add more design material on my own time.
The rest of the documentation is written in SGML.