LinuxFocus article number 363
http://linuxfocus.org

by Majid Hameed
<hameed.majid(at)gmail.com>

About the author:
Majid Hameed is an undergraduate student in the Department of Computer Science at the University of Karachi in Sindh, Pakistan. His main interests are artificial intelligence, operating systems, networking, programming, and computer graphics. Hameed describes himself as a Linux enthusiast who has been using Linux as an operating system for the past three-and-one-half years, including Red Hat 9, 8, 7.3, and 7.2, Slackware Linux 10 and 9.1, Slax, Mandrake Move 2, Knoppix 3.4, Vector Linux 4.3, and more.

ParallelKnoppix


Abstract:

ParallelKnoppix is a live CD based on Knoppix, which is itself a live CD based on the Debian Linux distribution. ParallelKnoppix lets us create a Linux cluster equipped with parallel programming tools and libraries such as MPI in a couple of minutes. It saves much of the time that is usually spent configuring the computing environment. Because it is a live CD, ParallelKnoppix does not disturb the existing environment. The only change is a directory created on the master node, which you can delete after rebooting if you want.


 

Introduction

"ParallelKnoppix is a re-master of Knoppix that allows setting up a cluster of machines for parallel processing using the LAM-MPI and/or MPICH implementations of MPI. Getting the cluster up and running takes less than 15 minutes, if the machines have PXE network cards." --> from http://pareto.uab.es/mcreel/ParallelKnoppix/  

Background

Clustering is one of the cheapest ways to achieve parallelism, and clustering is one of the strengths of Linux. Universities and other organizations approximate supercomputing by connecting PCs through Ethernet cards under Linux. Linux is widely adopted by the scientific community for research work because it comes with a number of scientific tools such as LAM, MPI, PVM and many more. This makes Linux well suited for parallel computing. The problem is that scientists and programmers must first do a lot of pre-configuration of the Linux environment, which makes their task slow and complex. The configuration problem becomes even worse if the existing environment is not Linux based (for example, Windows).

Linux gurus have addressed this problem by developing live CDs. A researcher can now choose a live CD and do some parallel programming without the lengthy configuration; the cluster is ready within a few minutes (7 to 8 minutes).

One of the live CDs for parallel programming is ParallelKnoppix.

Other live CDs for parallel computing include BCCD and ClusterKnoppix.

Description

Just like its predecessor, Knoppix, ParallelKnoppix detects all hardware and peripherals automatically. I have tested it on an Intel D865GBF board (a Pentium 4 board) and an Intel 810C board (a Pentium III board); ParallelKnoppix configured all the hardware automatically, and nothing needed to be done by hand. The computers in a ParallelKnoppix cluster share a common directory, which is created on the master node and exported through NFS (Network File System). The master node is booted from the CD, and the slaves are booted over the network from the DHCP server running on the master node. The slaves need a PXE-enabled BIOS and PXE-compliant NICs.

Every service needed for LAM/MPI is configured automatically (LAM/MPI is an implementation of MPI, the Message Passing Interface specification used for parallel computing). This includes DHCP, NFS and SSH (passwordless logins), so you are ready to experiment with MPI programs and some other parallel applications.

The ParallelKnoppix setup is not very secure: the live CD passwords for both the regular user and the superuser (root) are publicly known, so anyone with some knowledge of ParallelKnoppix can get access to a ParallelKnoppix cluster. The ease of setup is obtained by compromising some security; there is a trade-off between ease of use and security.

What is PXE boot?

PXE stands for Preboot Execution Environment. It is a technology for booting a PC remotely over a network. PXE must be supported by the system BIOS, and the network interface card needs to be PXE compliant.

What to do if your NIC is not PXE compliant?

You have to boot from an Etherboot image instead, for example from a CD burned with the image. ROM-o-matic.net dynamically generates Etherboot ROM images: http://rom-o-matic.net/

Downloading ParallelKnoppix

ISO file download

HTTP exact link

http://pareto.uab.es/mcreel/ParallelKnoppix/parallelknoppix.iso

FTP exact link

ftp://volcano.uab.es/pub/parallelknoppix.iso

MD5SUM download

http://pareto.uab.es/mcreel/ParallelKnoppix/parallelknoppix-2004-12-16.iso.md5

Check the home page http://pareto.uab.es/mcreel/ParallelKnoppix/ if the above links have expired.

After downloading the ISO image, check its MD5 checksum to ensure that your download was successful. Do this by running the md5sum program from a shell prompt against the ISO image and comparing the value returned against the one in the md5 file (the download link is given above). The following illustrates the correct syntax for the md5sum command.
md5sum "isofilename"
In the above command, replace "isofilename" with the correct file name.
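
The comparison can also be scripted. The following is a minimal sketch, not part of ParallelKnoppix: the function name verify_iso is hypothetical, and it assumes that the first field of the first line of the downloaded md5 file is the checksum, which is the format md5sum itself writes. Only the hash is compared, so it does not matter if the file name recorded in the md5 file differs from the name of the downloaded ISO.

```shell
#!/bin/sh
# Compare the MD5 checksum of an ISO image with the hash stored in a .md5 file.
# verify_iso is an illustrative name, not a ParallelKnoppix tool.
# Usage: verify_iso <iso-file> <md5-file>
verify_iso() {
    iso="$1"
    md5file="$2"
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$iso" | awk '{print $1}')
    # Assumption: the hash is the first field on the first line of the .md5 file.
    expected=$(awk 'NR==1 {print $1}' "$md5file")
    if [ "$actual" = "$expected" ]; then
        echo "OK: checksum matches"
        return 0
    else
        echo "MISMATCH: expected $expected, got $actual"
        return 1
    fi
}
```

With the files from the links above it would be called as: verify_iso parallelknoppix.iso parallelknoppix-2004-12-16.iso.md5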

If for some reason you are not using Linux, you can use md5summer, an MD5 sum generator for Windows, available at the link below.

http://www.md5summer.com/

Note: writing the ISOs to CD requires a program such as cdrecord.  

How does it work?

There is a nice tutorial full of step-by-step screenshots of the configuration process. The links to the tutorial are below.

ParallelKnoppix tutorial, HTML version

http://pareto.uab.es/mcreel/ParallelKnoppix/Tutorial/Tutorial.html

ParallelKnoppix tutorial, PDF version

http://pareto.uab.es/wp/2004/62604.pdf

If you export your CD-ROM to the nodes, the cluster should easily accommodate 50 nodes; more than 50 nodes have not been tested. I have tested only 5 nodes myself.

What to do if multiple DHCP servers are running?

"If using this at a university (like I do), you're likely to encounter the existence of an official DHCP server, and possibly a PXE server. When you try to boot the nodes using the terminal server, the nodes will often boot from the pre-existing PXE server, and they will often get their IP addresses from the official server, not the DHCP server running on the computer that was booted from the ParallelKnoppix CD. The solution I have so far is to physically disconnect the computers to be used as nodes from the pre-existing PXE and/or DHCP servers, or else to get help from the administrators to temporarily disable those servers. If anyone knows a more elegant solution, I'd like to hear about it. I think it involves messing around with miniroot.gz, and using rom-o-matic to create the PXE boot ROM. Too horrible for further contemplation..., at least for me." --> from http://pareto.uab.es/mcreel/ParallelKnoppix/  

How it works (summary)

The ParallelKnoppix live CD is used to boot a master node. On the booted master node, a script is executed that sets up a DHCP server, shares a common working directory with all nodes using NFS, and generates the public keys that SSH needs for the passwordless logins required by LAM. Once the DHCP server on the master node is running, the slave nodes are booted using PXE. After they have booted successfully, the sample directory of programs is copied to the NFS-shared common directory, and parallel programs are executed in parallel on multiple PCs.
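
The passwordless-login step can be pictured roughly as follows. This is a minimal sketch, not the actual ParallelKnoppix setup script: the function name setup_passwordless_ssh and its directory argument are hypothetical. It assumes the OpenSSH ssh-keygen tool, and it assumes that every node sees the same user account files, which the source does not confirm.

```shell
#!/bin/sh
# Sketch of passwordless SSH setup for one user account.
# NOT the real ParallelKnoppix script; names here are illustrative.
# Usage: setup_passwordless_ssh <ssh-directory>   (normally "$HOME/.ssh")
setup_passwordless_ssh() {
    sshdir="$1"
    mkdir -p "$sshdir"
    chmod 700 "$sshdir"
    # Generate an RSA key pair with an empty passphrase unless one already exists.
    if [ ! -f "$sshdir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$sshdir/id_rsa"
    fi
    # Authorize the public key for logins to this account, avoiding duplicates.
    pub=$(cat "$sshdir/id_rsa.pub")
    touch "$sshdir/authorized_keys"
    grep -qxF -- "$pub" "$sshdir/authorized_keys" || echo "$pub" >> "$sshdir/authorized_keys"
    chmod 600 "$sshdir/authorized_keys"
}
```

Under the stated assumption, calling setup_passwordless_ssh "$HOME/.ssh" once on the master would let that user log in to any node without a password, which is what LAM needs to start processes on the slaves.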

My experience

I am an undergraduate student of computer science, and I was given a project to solve a mathematical problem using MPI in the parallel computing lab. I chose ParallelKnoppix as an alternative way to demonstrate my MPI program in a Linux environment. The master node is booted from the ParallelKnoppix CD. At some point during booting it asks for the screen resolution; just enter "6", because it is the maximum resolution mode supported. Once my master node had booted, I ran the Setup ParallelKnoppix script via K > ParallelKnoppix > Setup ParallelKnoppix (see the tutorial above). After the script had set up the DHCP server, I turned on my slave nodes and let them boot using PXE. All the nodes then booted successfully.

I copied my program to the "parallel_knoppix_working" directory and then ran my MPI program in parallel from a terminal. That's it.

For compilation I use

mpicc myprogram.c -o myprogram.bin

For execution I use

mpirun C myprogram.bin
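
For readers who do not have an MPI program at hand, the following sketch writes a minimal test program that can be compiled and run with the two commands above. The file name hello_mpi.c and the function name write_hello_mpi are hypothetical and not part of ParallelKnoppix; the C program uses only standard MPI calls (MPI_Init, MPI_Comm_rank, MPI_Comm_size, MPI_Finalize).

```shell
#!/bin/sh
# Write a minimal MPI "hello" program into the given directory.
# write_hello_mpi and hello_mpi.c are illustrative names.
# Usage: write_hello_mpi <directory>
write_hello_mpi() {
    dir="$1"
    cat > "$dir/hello_mpi.c" <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* id of this process */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */
    printf("Hello from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
}
```

After calling write_hello_mpi on the shared working directory, the program would be compiled with "mpicc hello_mpi.c -o hello_mpi.bin" and started with "mpirun C hello_mpi.bin". Under LAM/MPI, the C argument asks mpirun to start one process on each available CPU, so each process should print its own rank.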
 

Conclusion

"The ParallelKnoppix CD provides a very simple and rapid means of setting up a cluster of heterogeneous PCs of the IA-32 architecture. It is not intended to provide a stable cluster for multiple users, rather is a tool for rapid creation of a cluster for individual use. The CD itself is personalizable, and the configuration and working files can be re-used over time, so it can provide a long-term solution for an individual user." From ParallelKnoppix Tutorial By Michael Creel  




© Majid Hameed
"some rights reserved" see linuxfocus.org/license/
Translation information:
en --> -- : Majid Hameed <hameed.majid(at)gmail.com>

2005-01-14, generated by lfparser_pdf version 2.51