ParallelKnoppix
ArticleCategory: [Choose a category, translators: do not translate
this, see list below for available categories]
Applications
AuthorImage:[Here we need a little image from you]
TranslationInfo:[Author + translation history. mailto: or
http://homepage]
original in en Majid Hameed
AboutTheAuthor:[A small biography about the author]
Majid Hameed is an undergraduate student in the Department of Computer
Science at the University of Karachi in Sindh, Pakistan. His main
interests are artificial intelligence, operating systems, networking,
programming, and computer graphics. Hameed describes himself as a
Linux enthusiast who has been using Linux for the past three and a half
years, on distributions including Red Hat 9, 8, 7.3, and
7.2, Slackware Linux 10 and 9.1, Slax, Mandrake Move 2, Knoppix 3.4,
Vector Linux 4.3, and more.
Abstract:[Here you write a little summary]
ParallelKnoppix is a live CD based on Knoppix, which is itself a live CD
based on the Debian Linux distribution. ParallelKnoppix lets us create a
Linux cluster equipped with parallel programming tools and libraries such as
MPI in a couple of minutes. It saves much of the time that we would
otherwise spend configuring the computing environment. Because it is a
live CD, ParallelKnoppix does not disturb the existing environment. A
directory is created only on the master node, and it can be deleted after
reboot if you want.
ArticleIllustration:[One image that will end up at the top of the
article]
ArticleBody:[The main part of the article]
Introduction
"ParallelKnoppix is a re-master of Knoppix that allows setting up a
cluster of machines for parallel processing using the LAM-MPI and/or MPICH
implementations of MPI. Getting the cluster up and running takes less than
15 minutes, if the machines have PXE network cards." --> from
http://pareto.uab.es/mcreel/ParallelKnoppix/
Background
Clustering is one of the cheapest ways to achieve parallelism, and
clustering is one of the strengths of Linux. Universities and
organizations mimic supercomputing by connecting PCs through Ethernet
cards under Linux. Linux is widely adopted by the scientific community
for research work because it comes with a number of scientific
tools such as LAM, MPI, PVM and many more, so Linux is well suited for
parallel computing. The problem is that scientists and programmers have
to do a lot of pre-configuration of the Linux environment, which makes
their task slow and complex. The configuration problem becomes even
worse if the existing environment is not Linux based (for example, a
Windows environment).
Linux gurus have solved this problem by developing live CDs. A
researcher can now choose a live CD to do some parallel programming without
a lengthy configuration, and the cluster is ready within a few
minutes (7 to 8).
One of the live CDs for parallel programming is ParallelKnoppix.
Some other live CDs for parallel computing are BCCD and ClusterKnoppix.
Description
Just like its predecessor, Knoppix, ParallelKnoppix detects all
the hardware and peripherals automatically. I have tested it on an Intel
D865GBF board (Pentium 4) and an Intel 810C board (Pentium III), and
ParallelKnoppix configured all the hardware automatically; nothing else
needed to be done. The computers that are configured using ParallelKnoppix
share a common directory, which is created on the master node and exported
via NFS (Network File System).
The master node is booted from the CD and the slaves are booted over the network
(a DHCP server runs on the master node). The slaves need a PXE-enabled BIOS
and PXE-compliant NICs.
Every service needed for LAM/MPI, such as DHCP, NFS, and SSH (with
passwordless logins), is configured automatically. (LAM/MPI is an
implementation of MPI, the Message Passing Interface specification used
for parallel computing.) After that you are ready to experiment with
MPI programs plus some other parallel applications.
The setup of ParallelKnoppix is not very secure: the live CD passwords
for both the regular user and the superuser (root) are publicly known, so
anyone who has some knowledge of ParallelKnoppix can get access to a
ParallelKnoppix cluster. The ease of setup is obtained by compromising some
security, as there is a trade-off between ease of use and security.
What is PXE boot?
PXE stands for Preboot Execution Environment. PXE is a
technology that is used to boot a PC remotely through a network. PXE must
be supported by the system BIOS, and the network interface card needs to
be PXE compliant.
What to do if your NIC is not PXE compliant?
You have to use Etherboot images, for example by burning a CD from the
images. ROM-o-matic.net dynamically generates Etherboot ROM images.
http://rom-o-matic.net/
Downloading ParallelKnoppix
ISO file download
HTTP exact link
http://pareto.uab.es/mcreel/ParallelKnoppix/parallelknoppix.iso
FTP exact link
ftp://volcano.uab.es/pub/parallelknoppix.iso
MD5SUM download
http://pareto.uab.es/mcreel/ParallelKnoppix/parallelknoppix-2004-12-16.iso.md5
Check the home page http://pareto.uab.es/mcreel/ParallelKnoppix/ if the
above links have expired.
After downloading the ISO image, check its MD5 checksum to ensure that
your download was successful. Do this by running the
md5sum program from a shell prompt against your ISO image and comparing
the value returned against the md5 file (the download link is above). The
following illustrates the correct syntax for the md5sum command.
md5sum "isofilename"
In the above command, replace "isofilename" with the correct file name.
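The comparison can also be scripted. The following is a minimal sketch,
not part of ParallelKnoppix: verify_iso is a hypothetical helper name, and
it assumes the .md5 file has the usual md5sum layout with the digest as the
first field. It compares digests directly instead of using "md5sum -c"
because the file name recorded in the .md5 file may differ from the name of
the ISO you saved.

```shell
#!/bin/sh
# verify_iso ISO MD5FILE
# Hypothetical helper: prints OK if the MD5 digest of ISO matches the
# first field of MD5FILE, otherwise prints MISMATCH and returns 1.
verify_iso() {
    iso="$1"
    md5file="$2"
    # Expected digest: first field of the first line of the .md5 file.
    expected=$(awk '{print $1; exit}' "$md5file")
    # Actual digest: first field of md5sum's output for the ISO.
    actual=$(md5sum "$iso" | awk '{print $1}')
    if [ "$expected" = "$actual" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        return 1
    fi
}
```

For example: verify_iso parallelknoppix.iso parallelknoppix-2004-12-16.iso.md5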
If for some reason you are not using Linux, use md5summer, a
Windows MD5 sum generator, available at the link below.
http://www.md5summer.com/
Note: writing the ISO to CD requires a program such as cdrecord.
How does it work?
There is a nice tutorial full of step-by-step screenshots of the
configuration process. The links to the tutorial are below.
Parallel Knoppix tutorial html version
http://pareto.uab.es/mcreel/ParallelKnoppix/Tutorial/Tutorial.html
Parallel Knoppix tutorial pdf version
http://pareto.uab.es/wp/2004/62604.pdf
If you export the CD-ROM to the nodes, the cluster will easily accommodate
50 nodes; more than 50 nodes have not been tested. I tested only 5
nodes myself.
What to do if multiple DHCP servers are running?
"If using this at a university (like I do), you're likely to
encounter the existence of an official DHCP server, and possibly a PXE
server. When you try to boot the nodes using the terminal server, the
nodes will often boot from the pre-existing PXE server, and they will
often get their IP addresses from the official server, not the DHCP server
running on the computer that was booted from the ParallelKnoppix CD. The
solution I have so far is to physically disconnect the computers to be
used as nodes from the pre-existing PXE and/or DHCP servers, or else to
get help from the administrators to temporarily disable those servers. If
anyone knows a more elegant solution, I'd like to hear about it. I think
it involves messing around with miniroot.gz, and using rom-o-matic to
create the PXE boot ROM. Too horrible for further contemplation..., at
least for me." --> from http://pareto.uab.es/mcreel/ParallelKnoppix/
How it works (summary)
The ParallelKnoppix live CD is used to boot a master node. On the booted master
node a script is executed which sets up a DHCP server, shares a common
working directory with all nodes using NFS, and generates public keys so
that SSH works without passwords, as LAM requires. Once the DHCP server on
the master node is running, the slave nodes are booted using PXE
boot. After they have booted successfully, the sample directory of programs is
copied to the NFS-shared common directory and parallel programs are
executed in parallel on multiple PCs.
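To illustrate the key generation step, here is a minimal sketch of how
passwordless SSH logins are commonly prepared with OpenSSH. It is not taken
from the ParallelKnoppix setup script, whose details may differ. The
function name setup_keys is hypothetical, and the sketch assumes that
ssh-keygen is installed and that the key directory is visible to every
node (for example over NFS).

```shell
#!/bin/sh
# setup_keys DIR
# Hypothetical helper: creates an RSA key pair with an empty passphrase in
# DIR (unless one exists) and authorizes its public key for logins.
# Assumption: DIR is the .ssh directory seen by every node.
setup_keys() {
    sshdir="$1"
    mkdir -p "$sshdir" && chmod 700 "$sshdir"
    # -N "" sets an empty passphrase, -f names the private key file.
    if [ ! -f "$sshdir/id_rsa" ]; then
        ssh-keygen -q -t rsa -N "" -f "$sshdir/id_rsa" || return 1
    fi
    pub=$(cat "$sshdir/id_rsa.pub")
    # Append the public key only if it is not already authorized.
    if ! grep -qxF "$pub" "$sshdir/authorized_keys" 2>/dev/null; then
        printf '%s\n' "$pub" >> "$sshdir/authorized_keys"
    fi
    chmod 600 "$sshdir/authorized_keys"
}
```

For example: setup_keys "$HOME/.ssh". An empty passphrase is convenient for
a throwaway cluster, which fits the trade-off between ease of use and
security described above.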
My experience
I am an undergraduate student of computer science and I was given a
project to solve a mathematical problem using MPI in the parallel computing
lab. I chose ParallelKnoppix as an alternative way to demonstrate my MPI
program in a Linux environment. I booted the master node using
the ParallelKnoppix CD. At some point during booting it asks for the
resolution; just enter "6" because it is the maximum resolution mode
supported. Once my master node had booted, I ran the Setup ParallelKnoppix
script via K > ParallelKnoppix > Setup ParallelKnoppix (see the tutorial
above). After the script had set up the DHCP server, I turned on my slave
nodes and let them boot using PXE. All the nodes booted successfully.
I copied my program to the "parallel_knoppix_working" directory and then
ran my MPI program in parallel from a terminal. That's it.
For compilation I use
mpicc myprogram.c -o myprogram.bin
For execution I use
mpirun C myprogram.bin
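For comparison, on a hand-built LAM/MPI cluster the whole cycle looks
roughly like the sketch below. On ParallelKnoppix the setup script takes
care of the configuration, so treat this as an illustration only. The
function names write_bhost and run_on_cluster and the file name lamhosts
are hypothetical; lamboot, mpicc, mpirun, and lamhalt are the standard
LAM/MPI commands.

```shell
#!/bin/sh
# write_bhost HOST...
# Hypothetical helper: prints a LAM boot schema, one host per line.
write_bhost() {
    for host in "$@"; do
        printf '%s\n' "$host"
    done
}

# run_on_cluster SOURCE.c HOST...
# Hypothetical helper: boots LAM on the given hosts, compiles the program,
# runs it, and shuts LAM down again. Requires LAM/MPI on every host.
run_on_cluster() {
    src="$1"
    shift
    bin="${src%.c}.bin"
    write_bhost "$@" > lamhosts
    lamboot -v lamhosts || return 1   # start the LAM daemons on every host
    mpicc "$src" -o "$bin" || return 1
    mpirun C "$bin"                   # C: one process per CPU known to LAM
    lamhalt                           # stop the LAM daemons
}
```

For example: run_on_cluster myprogram.c node1 node2, where node1 and node2
are placeholder host names.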
Conclusion
"The ParallelKnoppix CD provides a very simple and rapid means of
setting up a cluster of heterogeneous PCs of the IA-32 architecture. It is
not intended to provide a stable cluster for multiple users, rather is a
tool for rapid creation of a cluster for individual use. The CD itself is
personalizable, and the configuration and working files can be re-used
over time, so it can provide a long-term solution for an individual
user." From ParallelKnoppix Tutorial By Michael Creel
References