Minicluster: Mpich
Revision as of 06:51, 17 June 2010

It seems that MPICH2 is not present in the default distribution.

Installation

Master node

Since the master has Internet access through its second network card, it is enough to run

yum install mpich2

Worker node

The worker has no Internet access. There are several possibilities; I am trying the following:

  • Download the required .rpm packages on the master node and copy them to the worker node. [ok]
tcl-8.5.7-5.fc13.x86_64.rpm
environment-modules-3.2.7b-7.fc13.x86_64.rpm
mpich2-1.2.1p1-2.fc13.x86_64.rpm
mpich2-devel-1.2.1p1-2.fc13.x86_64.rpm
  • Import the keys needed to install the packages (installing with rpm alone did not work properly):
rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-x86_64

or, better yet, edit /etc/yum.conf so that it does not complain about signature files

vi /etc/yum.conf
..
gpgcheck=0
..
  • Install using yum
yum localinstall tcl-8.5.7-5.fc13.x86_64.rpm --disablerepo=fedora --disablerepo=updates
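
The four packages listed above can be installed in a single yum transaction so that yum resolves the dependencies between them. A minimal sketch (the DRY_RUN switch is illustrative, and the file names should be adjusted to your Fedora release):

```shell
#!/bin/sh
# Sketch: install all locally copied RPMs in one yum transaction,
# skipping the network repositories the worker cannot reach.
# The command is only echoed by default; set DRY_RUN=0 on the real
# worker node to execute it.
RPMS="tcl-8.5.7-5.fc13.x86_64.rpm \
environment-modules-3.2.7b-7.fc13.x86_64.rpm \
mpich2-1.2.1p1-2.fc13.x86_64.rpm \
mpich2-devel-1.2.1p1-2.fc13.x86_64.rpm"
CMD="yum localinstall $RPMS --disablerepo=fedora --disablerepo=updates -y"
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CMD"
else
    $CMD
fi
```

Installing everything in one transaction avoids having to order the packages by hand to satisfy their dependencies.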

Configuration

This is part three of a multi-part tutorial on installing and configuring MPICH2.

MPICH without Torque Functionality

In this paradigm, users (and sysadmins) are responsible for much more overhead, and there are many more opportunities for misconfiguration. Unless you plan not to have a scheduler and queue, I highly recommend setting up MPICH with Torque functionality instead.

Creating Mpd.conf Files

Mpd stands for "multi-purpose daemon." MPICH2 separates process communication from process management by having an mpd process manage any MPI processes. The mpds need a little extra customization that wasn't necessary in MPICH1. To begin with, each user that will (potentially) use MPICH needs an .mpd.conf file.

User Accounts

Each user needs a .mpd.conf file in his/her home directory, which must be readable and writable only by that user. The file should look like this:

MPD_SECRET_WORD=yours3cr3tw0rd
MPD_USE_ROOT_MPD=yes
  • MPD_SECRET_WORD can be unique to each user, but doesn't have to be
  • MPD_USE_ROOT_MPD specifies that users will not start up their own mpd daemons, but will rely upon attaching to one already running under the root account

You can run a simple script to create this file for all of your users. First, create the file as shown above in root's home directory (/root). Make it readable/writable only by its owner, root, by running chmod 600 /root/.mpd.conf. Then, to make a copy for each of the user accounts, run

for x in `ls /shared/home/`; do rsync -plarv /root/.mpd.conf /shared/home/$x/; chown $x:users /shared/home/$x/.mpd.conf; done

Of course, replace all instances of /shared/home/ with the location where your users' home directories are stored, and replace users with whatever group your users are in.

Future Users

In order to make sure new users automatically have this file created for them, create a .mpd.conf file in /etc/skel/ on the machine you create user accounts from. Make sure it has a secret word and root mpd flag as shown above. Then, run

chmod 600 /etc/skel/.mpd.conf

This will make sure the file is created with only the user having read/write permissions.
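
The two steps above (creating the skeleton file and restricting its permissions) can be scripted. A sketch, using a SKEL variable so it can be tried outside /etc — set SKEL=/etc/skel on the machine where accounts are created; the secret word is the placeholder from this page:

```shell
#!/bin/sh
# Sketch: create .mpd.conf in the skeleton directory so every future
# user account automatically receives a correctly configured copy.
SKEL="${SKEL:-./skel-test}"    # use SKEL=/etc/skel on the real machine
mkdir -p "$SKEL"
cat > "$SKEL/.mpd.conf" <<'EOF'
MPD_SECRET_WORD=yours3cr3tw0rd
MPD_USE_ROOT_MPD=yes
EOF
chmod 600 "$SKEL/.mpd.conf"    # read/write for the owner only
```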

Root's .mpd.conf

Similar to the users, root needs an mpd.conf file on each of the worker nodes. Rather than being stored in a home directory, root's copy is located at /etc/mpd.conf. This file needs only one line:

MPD_SECRETWORD=yours3cr3tw0rd

You can replace yours3cr3tw0rd with whatever you'd like. Then, make sure it's only readable/writable by root:

chmod 600 /etc/mpd.conf

Once you've created this file on one of your nodes, you can manually create it on each of the other nodes, or see the Cluster Time-saving Tricks page for tips on how to script this process. Make sure that the secret word matches on all of the worker nodes.
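
One way to script that copy is a loop over the worker nodes. A sketch — the node names are illustrative, and the commands are only echoed here (drop the leading echo to run them for real, which requires root SSH access to each worker):

```shell
#!/bin/sh
# Sketch: push /etc/mpd.conf to every worker node and lock down its
# permissions there. NODES is illustrative; list your own workers.
NODES="cell100 cell101 cell102"
for n in $NODES; do
    echo "scp /etc/mpd.conf root@$n:/etc/mpd.conf"
    echo "ssh root@$n chmod 600 /etc/mpd.conf"
done
```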

Running on One Machine

First, run an MPI program on a single machine.

Starting a Daemon

Choose one of the worker nodes. SSH into that machine and, as root, start mpd (the multi-purpose daemon) in the background with

mpd --daemon

Verify that mpd has started on this machine:

mpdtrace -l

You should see something like Machine.name_PID (ip.address):

cell100.matrix_54419 (192.168.0.100)
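
If you want to pick that line apart (for instance in a monitoring script), the host name and MPD id can be extracted with awk. A sketch using the sample line above — mpdtrace itself is not invoked here, and the hostname_PID (ip) format is taken from the output shown on this page:

```shell
# Split the sample "hostname_PID (ip)" line from mpdtrace -l into parts.
line="cell100.matrix_54419 (192.168.0.100)"
host=$(echo "$line" | awk -F'[_ ]' '{print $1}')   # text before the underscore
pid=$(echo "$line" | awk -F'[_ ]' '{print $2}')    # the MPD id
echo "host=$host pid=$pid"
```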

Running an MPI Program

Connect to a worker node using ssh:

ssh usuario@cell100

Open a text editor and type in the "hello world" program (see Creating and Compiling an MPI Program).

vi hellompi.f90
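
For a starting point, the program can be written and compiled from the shell. The Fortran body below is an illustrative sketch modeled on the sample output shown further down this page, not the exact file used on the original cluster:

```shell
# Write a minimal MPI "hello world" in Fortran 90 (illustrative sketch).
cat > hellompi.f90 <<'EOF'
program hellompi
  use mpi
  implicit none
  integer :: rank, nprocs, namelen, ierr
  character(len=MPI_MAX_PROCESSOR_NAME) :: hostname
  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  call MPI_Get_processor_name(hostname, namelen, ierr)
  print *, 'Sou o processo ', rank, ' de um total de ', nprocs, &
           ' rodando em ', trim(hostname)
  call MPI_Finalize(ierr)
end program hellompi
EOF
# Compile with MPICH's Fortran wrapper (run this on the cluster):
#   mpif90 hellompi.f90 -o hellompi
```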

Run the program with a single process:

mpiexec ./hellompi

Run it again with more processes (still on one processor, or perhaps 2 if the machine is dual-core):

mpiexec -np 4 ./hellompi

You should see something like:

[dago@cell100]$ mpiexec -np 4 ./hellompi
Sou o processo            0  de um total de            4  rodando em cell100.matrix
Sou o processo            2  de um total de            4  rodando em cell100.matrix
Sou o processo            1  de um total de            4  rodando em cell100.matrix
Sou o processo            3  de um total de            4  rodando em cell100.matrix

where cell100.matrix is the machine name (all processes are still on the same node). To run on more machines, first shut down the mpd daemon:

[usuario]$ mpdallexit

Configuring Worker Nodes to use Root's MPD Daemon

Starting up an mpd daemon for each user each time they log in is doable, but it adds an extra layer of complexity for your users to understand. They will also need to remember to start daemons on multiple machines when they run programs that require multiple processors (not just multiple processes).

An easier paradigm to follow is to start a single mpd daemon on each of the worker nodes and have users' programs attach to that daemon. Continue on to MPICH: Starting a Global MPD Ring to implement this.