File size: 2,115 Bytes
a01ef8c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
# Distributed Training

Here are instructions for using distributed/multinode Training with Intel® Transfer Learning Tool.

## Prerequisites

- Participating nodes should have Intel® oneAPI Base Toolkit installed. Verify the files under `/opt/intel/oneapi`
- Participating nodes should have passwordless SSH setup. Instructions to set up are given below.

### Passwordless SSH setup

- Use an existing (or create an) SSH key pair.

    - Check under `~/.ssh` and see if they exist. If present, make sure they have default names `(id_rsa.pub id_rsa)` and they don't have any passphrase.

    - To remove passphrase, type `ssh-keygen -p [-P old_passphrase] [-N new_passphrase] [-f keyfile]` by replacing `new_passphrase` with a blank space.

- How to create SSH key pair:

    - Get to your .ssh directory `cd ~/.ssh` (if this gives you an error, change the permissions: `chmod u+x ~/.ssh`)

    - Run the command: `ssh-keygen -t rsa`

    - The first prompt will ask you what you want to call your key files `(id_rsa.pub id_rsa)`. Press `<enter>` to use the default key names.

    - The second prompt will ask for passphrase. Do not enter any passphrase, just press `<enter>`.

- Locate the two ssh key pair files in your `.ssh` directory (`id_rsa.pub`, `id_rsa`):

    - Open the Public Key in an editor like vi/vim/nano/pico (this is the `.pub` file)

    - The ending of the public key may say `<your_idsid>@<hostname.domain>`, edit this file to omit the `"@<hostname.domain>"` at the end. The result will be your `idsid` only.

    - Create a file in your .ssh directory called `authorized_keys`

    - Paste your entire public key into this file

    - Make sure your new ssh key pair files AND `authorized_keys` files are read-write only for yourself with no permissions for anyone else `(chmod 600 file1 file2 file3)`

- Test the SSH `ssh <ip_or_hostname.domain>`

IMPORTANT NOTE: You have to make sure the `authorized_keys` file exists on all of the target systems that will participate in running the workload (in your local home dir in your `.ssh` directory) with contents of public key inside as well.