# Docker

Follow these instructions to set up and run our provided Docker image.

## Set Up Docker Engine and Docker Compose

You'll need to install Docker Engine on your development system. Note that while **Docker Engine** is free to use, **Docker Desktop** may require you to purchase a license. See the [Docker Engine Server installation instructions](https://docs.docker.com/engine/install/#server) for details.

To build and run this workload inside a Docker container, ensure you have Docker Compose installed on your machine. If you don't have this tool installed, consult the official [Docker Compose installation documentation](https://docs.docker.com/compose/install/linux/#install-the-plugin-manually).

```bash
# Install the Compose plugin for the current user
DOCKER_CONFIG=${DOCKER_CONFIG:-$HOME/.docker}
mkdir -p "$DOCKER_CONFIG/cli-plugins"
curl -SL https://github.com/docker/compose/releases/download/v2.7.0/docker-compose-linux-x86_64 -o "$DOCKER_CONFIG/cli-plugins/docker-compose"
chmod +x "$DOCKER_CONFIG/cli-plugins/docker-compose"
# Verify the installation
docker compose version
```

## Set Up Docker Image

Build or pull the provided Docker images.

```bash
cd docker
docker compose build
```

OR

```bash
docker pull intel/ai-tools:tlt-0.5.0
docker pull intel/ai-tools:tlt-devel-0.5.0
docker pull intel/ai-tools:tlt-dist-0.5.0
docker pull intel/ai-tools:tlt-dist-devel-0.5.0
```

## Use Docker Image

Use the TLT CLI without installing it by running the provided Docker image with Docker Compose.

```bash
docker compose run tlt-prod
# OR
docker compose run tlt-prod tlt --help
```

## Kubernetes

### 1. Install Helm

- Install [Helm](https://helm.sh/docs/intro/install/)

```bash
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 && \
chmod 700 get_helm.sh && \
./get_helm.sh
```

### 2. Setting up Training Operator

Install the standalone operator from GitHub or use a pre-existing Kubeflow configuration.

```bash
kubectl apply -k "github.com/kubeflow/training-operator/manifests/overlays/standalone"
```

OR

```bash
helm repo add cowboysysop https://cowboysysop.github.io/charts/
# Helm 3 requires a release name: replace <release name> with a name of your choice
helm install <release name> cowboysysop/training-operator
```

### 3. Deploy TLT Distributed Job

For more customization information, see the chart [README](./docker/chart/README.md).

```bash
export NAMESPACE=kubeflow
helm install --namespace ${NAMESPACE} --set ... tlt-distributed ./docker/chart
```

### 4. View

To view your workflow progress:

```bash
kubectl get -o yaml mpijob tf-tlt-distributed -n ${NAMESPACE}
```

OR

```bash
kubectl logs tf-tlt-distributed-launcher -n ${NAMESPACE}
```
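
If you would rather wait for the job to finish than re-run the commands above by hand, the following is a minimal sketch of a polling helper. It assumes the MPIJob reports Kubeflow-style `status.conditions` entries with the types `Succeeded` and `Failed`. The function names `job_conditions` and `wait_for_job` are hypothetical helpers introduced here for illustration; they are not part of TLT or `kubectl`.

```bash
# Sketch only: poll an MPIJob until the training operator marks it finished.
# Assumption: the job exposes Kubeflow-style status.conditions entries whose
# type is "Succeeded" or "Failed". Both function names are hypothetical.

# Print the types of all conditions currently set to "True".
job_conditions() {
  kubectl get mpijob "$1" -n "$2" \
    -o jsonpath='{.status.conditions[?(@.status=="True")].type}'
}

# Usage: wait_for_job JOB NAMESPACE [INTERVAL_SECONDS] [MAX_TRIES]
# Returns 0 on success, 1 on failure, 2 if the job did not finish in time.
wait_for_job() {
  local job="$1" namespace="$2" interval="${3:-10}" max_tries="${4:-60}"
  local tries=0 conditions
  while [ "$tries" -lt "$max_tries" ]; do
    # A failed lookup (for example, job not created yet) counts as "not done".
    conditions="$(job_conditions "$job" "$namespace" || true)"
    case "$conditions" in
      *Succeeded*) echo "succeeded"; return 0 ;;
      *Failed*)    echo "failed";    return 1 ;;
    esac
    tries=$((tries + 1))
    sleep "$interval"
  done
  echo "timed out"
  return 2
}
```

With these functions defined in your shell, `wait_for_job tf-tlt-distributed "${NAMESPACE}"` would check every 10 seconds for up to 60 attempts; adjust the interval and attempt count to match the expected length of your training run.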