Add CAPI+CAPM3 wf to multi-conductor experiment
mquhuy committed Nov 23, 2023
1 parent b007217 commit 9a70d5a
Showing 34 changed files with 1,435 additions and 90 deletions.
3 changes: 3 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/.gitignore
@@ -9,3 +9,6 @@ macaddrs
uuids
sushy-tools-conf/*
bmc-*.yaml
ironic.env
ironic_logs/*
track.csv
@@ -0,0 +1,31 @@
#!/bin/bash
set -e
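# On exit or interruption, kill the whole process group so background children don't linger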
trap 'trap - SIGTERM && kill -- -'$$'' SIGINT SIGTERM EXIT
__dir__=$(realpath "$(dirname "$0")")
# shellcheck disable=SC1091
. ./config.sh
# This is temporarily required since https://review.opendev.org/c/openstack/sushy-tools/+/875366 has not been merged.
sudo ./vm-setup.sh
./install-tools.sh
./configure-minikube.sh
sudo ./handle-images.sh
./build-sushy-tools-image.sh
./generate_unique_nodes.sh
./start_containers.sh
./start-minikube.sh
./build-api-server-container-image.sh

./install-ironic.sh
./install-bmo.sh

python create_nodes_v3.py

export CLUSTER_TOPOLOGY=true
clusterctl init --infrastructure=metal3
# kubectl apply -f capim-modified.yaml
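# Scale the apiserver deployment to N_APISERVER_PODS replicas and apply it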
yq ".spec.replicas = ${N_APISERVER_PODS}" apiserver-deployments.yaml | kubectl apply -f -
./generate-certificates.sh
# Wait for the apiserver pods to exist
sleep 120

./create-clusters.sh
13 changes: 13 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/README.md
@@ -135,6 +135,19 @@ Now, if you open another terminal and run `kubectl -n metal3 get BMH --watch`, y

Just like before, all of the steps can be run at once by running the `./Init-environment-v2.sh` script. This script also respects the configuration in `config.sh`.

# Multiple ironics - full setup

With BMO already working, we can now proceed to making the multiple ironic conductors and fake IPA work with CAPI and CAPM3, i.e. we aim to "create" clusters from these fake nodes. Since we do not have any real nodes to install the k8s apiserver onto, we install the apiserver directly on top of the management cluster, building on the research and experiments done by our colleague Lennart Jern, which can be read in full [here](https://github.com/metal3-io/metal3-io.github.io/blob/0592e636bb10b1659437790b38f85cc49c552239/_posts/2023-05-17-Scaling_part_2.md).

In short, for this setup to work, you will need `kubeadm` and `clusterctl` installed on your system. To simulate the `etcd` server, we added the script `start_fake_etcd.sh` into the equation.
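
For reference, a fake etcd can be as simple as an etcd container started with the pre-generated certificates. The sketch below is only an assumption of what `start_fake_etcd.sh` could look like (image, ports and mount paths are guesses; the cert paths match the `/tmp/etcd.{crt,key}` files used by `create-clusters.sh`):

```bash
# Hedged sketch only; the real start_fake_etcd.sh may differ.
sudo podman run -d --net host --name fake-etcd \
  -v /tmp/etcd.crt:/certs/tls.crt:z \
  -v /tmp/etcd.key:/certs/tls.key:z \
  quay.io/coreos/etcd:v3.5.9 \
  etcd \
  --cert-file=/certs/tls.crt \
  --key-file=/certs/tls.key \
  --listen-client-urls=https://0.0.0.0:2379 \
  --advertise-client-urls=https://127.0.0.1:2379
```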

All the setup steps can be run at once with the script `Init-environment-v3.sh`. After that, each run of the `create-clusters.sh` script applies a new BMH manifest and creates a new 1-node cluster (the node comes, as usual, with one `KubeadmControlPlane` (kcp) object, one `Machine` object, and one `Metal3Machine` object).
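
For example, once the first cluster is up, its objects can be inspected in its namespace (the script names namespaces `test1`, `test2`, and so on):

```bash
kubectl -n test1 get bmh,kubeadmcontrolplane,machine,metal3machine
clusterctl -n test1 describe cluster test1
```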

Compared to Lennart's setup, ours has a couple of differences and notes:
- Our BMO does not run in test mode. Instead, we use `fake-ipa` to trick `ironic` into thinking that it is talking to real nodes.
- We do not expose the apiservers through the domain `test-kube-apiserver.NAMESPACE.svc.cluster.local` (in fact, we still create it, but it does not seem to expose anything). Instead, we use the ClusterIP of the apiserver service; see the example after this list.
- We also run into resource shortages, as the apiservers consume a lot of memory and CPU, so the number of nodes/clusters we can simulate is limited. (So far, we have not been able to try running these apiservers on external VMs.) Another way to address this might be some kind of apiserver simulation, like what we already did with `fake-ipa`.
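
For instance, a hedged way to look up such a ClusterIP (the `test-kube-apiserver` service name follows Lennart's convention and is an assumption here):

```bash
kubectl -n test1 get svc test-kube-apiserver -o jsonpath='{.spec.clusterIP}'
```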

# Requirements

This study was conducted on a VM with the following specs:
@@ -0,0 +1,28 @@
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: capim-deployment
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: capim
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: capim
    spec:
      containers:
      - image: 172.22.0.1:5000/localimages/capim
        imagePullPolicy: Always
        name: capim
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
name: apiserver
@@ -0,0 +1,30 @@
#!/bin/bash
#
__dir__=$(realpath "$(dirname "$0")")
IMAGE_NAME="172.22.0.1:5000/localimages/capim"

if [[ ${1:-""} == "-f" ]]; then
sudo podman rmi "${IMAGE_NAME}"
kubectl delete -f capim-modified.yaml
fi

if [[ -n "$(sudo podman images | grep "${IMAGE_NAME}")" ]]; then
sudo podman push --tls-verify=false "${IMAGE_NAME}"
exit 0
fi
CAPI_DIR="/tmp/cluster-api"
if [[ ! -d "${CAPI_DIR}" ]]; then
git clone https://github.com/kubernetes-sigs/cluster-api.git "${CAPI_DIR}"
fi

cd "${CAPI_DIR}"

INMEMORY_DIR="${CAPI_DIR}/test/infrastructure/inmemory"

cp "${__dir__}/main.go" "${INMEMORY_DIR}/main.go"

cd "${INMEMORY_DIR}" || exit

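# Build the CAPI in-memory provider (capim) image, using the patched main.go copied in above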
sudo podman build \
  --build-arg=builder_image=docker.io/library/golang:1.20.8 \
  --build-arg=goproxy=https://proxy.golang.org,direct \
  --build-arg=ARCH=amd64 \
  --build-arg=ldflags="-X 'sigs.k8s.io/cluster-api/version.buildDate=2023-10-10T11:47:30Z' -X 'sigs.k8s.io/cluster-api/version.gitCommit=8ba3f47b053da8bbf63cf407c930a2ee10bfd754' -X 'sigs.k8s.io/cluster-api/version.gitTreeState=dirty' -X 'sigs.k8s.io/cluster-api/version.gitMajor=1' -X 'sigs.k8s.io/cluster-api/version.gitMinor=0' -X 'sigs.k8s.io/cluster-api/version.gitVersion=v1.0.0-4041-8ba3f47b053da8-dirty' -X 'sigs.k8s.io/cluster-api/version.gitReleaseCommit=e09ed61cc9ba8bd37b0760291c833b4da744a985'" \
  ../../.. -t "${IMAGE_NAME}" --file Dockerfile

sudo podman push --tls-verify=false "${IMAGE_NAME}"
@@ -1,10 +1,19 @@
#!/bin/bash
#
SUSHYTOOLS_DIR="$HOME/sushy-tools"
IMAGE_NAME="127.0.0.1:5000/localimages/sushy-tools"
if [[ ${1:-""} == "-f" ]]; then
sudo podman rmi "${IMAGE_NAME}"
fi

if [[ -n "$(sudo podman images | grep "${IMAGE_NAME}")" ]]; then
sudo podman push --tls-verify=false "${IMAGE_NAME}"
exit 0
fi
SUSHYTOOLS_DIR="/tmp/sushy-tools"
rm -rf "$SUSHYTOOLS_DIR"
git clone https://opendev.org/openstack/sushy-tools.git "$SUSHYTOOLS_DIR"
cd "$SUSHYTOOLS_DIR" || exit
git fetch https://review.opendev.org/openstack/sushy-tools refs/changes/66/875366/36 && git cherry-pick FETCH_HEAD

pip3 install build
python3 -m build
@@ -43,5 +52,5 @@ RUN mkdir -p /root/sushy
CMD ["sushy-emulator", "-i", "::", "--config", "/root/sushy/conf.py"]
EOF

sudo podman build -t "${IMAGE_NAME}" .
sudo podman push --tls-verify=false "${IMAGE_NAME}"
@@ -0,0 +1,17 @@
apiVersion: v1
kind: Pod
metadata:
  name: apiserver
  labels:
    app: manager
spec:
  containers:
  - image: 172.22.0.1:5000/localimages/capim
    imagePullPolicy: Always
    name: capim
    env:
    - name: POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
  restartPolicy: Always
22 changes: 13 additions & 9 deletions Support/Multitenancy/Multiple-Ironic-conductors/clean.sh
@@ -31,14 +31,18 @@ minikube stop
minikube delete --all --purge

# Stop and delete containers
containers=("ironic-ipa-downloader" "ironic" "keepalived" "registry" "ironic-client" "fake-ipa" "openstack-client" "httpd-infra")
for i in $(seq 1 "$N_SUSHY"); do
containers+=("sushy-tools-$i")
done
for container in "${containers[@]}"; do
echo "Deleting the container: $container"
sudo podman stop "$container" &>/dev/null
sudo podman rm "$container" &>/dev/null
declare -a running_containers=($(sudo podman ps --all --format json | jq -r '.[].Names[0]'))
declare -a containers=("ipa-downloader" "ironic" "keepalived" "registry" "ironic-client" "openstack-client" "httpd-infra")

for container in "${running_containers[@]}"; do
  if [[ " ${containers[*]} " == *" ${container} "* || "${container}" == sushy-tools-* || "${container}" == fake-ipa-* ]]; then
    echo "Deleting the container: ${container}"
    sudo podman stop "${container}" &>/dev/null
    sudo podman rm "${container}" &>/dev/null
  fi
done

rm -rf bmc-*.yaml

rm -rf macaddrs uuids node.json nodes.json batch.json in-memory-development.yaml sushy-tools-conf ironic.env
19 changes: 16 additions & 3 deletions Support/Multitenancy/Multiple-Ironic-conductors/config.sh
@@ -1,6 +1,19 @@
#!/bin/bash
#
export N_NODES=1000
export N_SUSHY=120
export N_FAKE_IPA=12
export N_IRONICS=15
export N_APISERVER_PODS=15
# export N_NODES=50
# export N_SUSHY=2
# export N_FAKE_IPA=2
# export N_IRONICS=3

# Translating N_IRONICS to IRONIC_ENDPOINTS. Don't change this part
IRONIC_ENDPOINTS="172.22.0.2"
for i in $(seq 2 "${N_IRONICS}"); do
  IRONIC_ENDPOINTS="${IRONIC_ENDPOINTS} 172.22.0.$(( i + 1 ))"
done
export IRONIC_ENDPOINTS
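# Example: with N_IRONICS=3 this yields IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4".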

@@ -1,11 +1,11 @@
#!/bin/bash
set -e
minikube config set driver kvm2
# minikube config set memory 4096
sudo usermod --append --groups libvirt "$(whoami)"
while /bin/true; do
  minikube_error=0
  minikube start --insecure-registry 172.22.0.1:5000 --memory 50000 --cpus 20 || minikube_error=1
  if [[ $minikube_error -eq 0 ]]; then
    break
  fi
159 changes: 159 additions & 0 deletions Support/Multitenancy/Multiple-Ironic-conductors/create-clusters.sh
@@ -0,0 +1,159 @@
#!/bin/bash
#
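# Usage: ./create-clusters.sh [START_NUM]
# Creates one single-node cluster per BMH, named test<START_NUM> through test<N_NODES>.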

source ./config.sh
CLUSTER_TEMPLATE=manifests/cluster-template.yaml
export CLUSTER_APIENDPOINT_PORT="6443"
export IMAGE_CHECKSUM="97830b21ed272a3d854615beb54cf004"
export IMAGE_CHECKSUM_TYPE="md5"
export IMAGE_FORMAT="raw"
export KUBERNETES_VERSION="v1.26.0"
export WORKERS_KUBEADM_EXTRA_CONFIG=""
export WORKER_MACHINE_COUNT="0"
export NODE_DRAIN_TIMEOUT="60s"
export CTLPLANE_KUBEADM_EXTRA_CONFIG=""

retry_curl() {
  endpoint=$1
  timeout=${2:-5}
  while true; do
    if ret=$(curl -s "${endpoint}" 2>/dev/null); then
      echo "${ret}"
      return
    fi
    sleep "${timeout}"
  done
}

create_cluster() {
  bmh_index="${1}"
  cluster="test${bmh_index}"
  namespace="${cluster}"
  nodename="${cluster}"
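  # Spread the load round-robin: BMH i is served by fake-ipa instance (i mod N_FAKE_IPA) and apiserver pod (i mod N_APISERVER_PODS)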
  fake_ipa_port=$(( 9901 + ( bmh_index % N_FAKE_IPA ) ))
  api_server_idx=$(( bmh_index % N_APISERVER_PODS ))
  api_server_port=$(( 3333 + api_server_idx ))

  export IMAGE_URL="http://192.168.111.1:${fake_ipa_port}/images/rhcos-ootpa-latest.qcow2"

  api_server_name=$(kubectl get pods -l app=capim -o jsonpath="{.items[${api_server_idx}].metadata.name}")

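  # Forward a local port to the chosen apiserver pod so the register and updateNode calls below can reach it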
  kubectl port-forward "pod/${api_server_name}" "${api_server_port}:3333" 2>/dev/null &

  echo "Creating cluster ${cluster} in namespace ${namespace}"
  kubectl create namespace "${namespace}"
  kubectl -n "${namespace}" apply -f "bmc-${nodename}.yaml"

  caKeyEncoded=$(base64 -w 0 < /tmp/ca.key)
  caCertEncoded=$(base64 -w 0 < /tmp/ca.crt)
  etcdKeyEncoded=$(base64 -w 0 < /tmp/etcd.key)
  etcdCertEncoded=$(base64 -w 0 < /tmp/etcd.crt)
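  # Register the cluster with the apiserver pod; the JSON response carries the Host and Port at which this cluster's apiserver is exposed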

  while true; do
    cluster_endpoints=$(retry_curl "localhost:${api_server_port}/register?resource=${namespace}/${cluster}&caKey=${caKeyEncoded}&caCert=${caCertEncoded}&etcdKey=${etcdKeyEncoded}&etcdCert=${etcdCertEncoded}")
    if jq -e . >/dev/null 2>&1 <<<"${cluster_endpoints}"; then
      break
    else
      sleep 2
    fi
  done
  echo "${cluster_endpoints}"
  host=$(jq -r ".Host" <<<"${cluster_endpoints}")
  port=$(jq -r ".Port" <<<"${cluster_endpoints}")

cat <<EOF > "/tmp/${cluster}-ca-secrets.yaml"
apiVersion: v1
kind: Secret
metadata:
labels:
cluster.x-k8s.io/cluster-name: ${cluster}
name: ${cluster}-ca
namespace: ${namespace}
type: kubernetes.io/tls
data:
tls.crt: ${caCertEncoded}
tls.key: ${caKeyEncoded}
EOF

kubectl -n ${namespace} apply -f /tmp/${cluster}-ca-secrets.yaml

cat <<EOF > "/tmp/${cluster}-etcd-secrets.yaml"
apiVersion: v1
kind: Secret
metadata:
labels:
cluster.x-k8s.io/cluster-name: ${cluster}
name: ${cluster}-etcd
namespace: ${namespace}
type: kubernetes.io/tls
data:
tls.crt: ${etcdCertEncoded}
tls.key: ${etcdKeyEncoded}
EOF

kubectl -n ${namespace} apply -f /tmp/${cluster}-etcd-secrets.yaml

  # Generate the metal3 cluster
  export CLUSTER_APIENDPOINT_HOST="${host}"
  export CLUSTER_APIENDPOINT_PORT="${port}"
  echo "Generating cluster ${cluster} with clusterctl"
  clusterctl generate cluster "${cluster}" \
    --from "${CLUSTER_TEMPLATE}" \
    --target-namespace "${namespace}" > "/tmp/${cluster}-cluster.yaml"
  kubectl apply -f "/tmp/${cluster}-cluster.yaml"

  sleep 10

  wait_for_resource() {
    resource=$1
    jsonpath=${2:-"{.items[0].metadata.name}"}
    while true; do
      if kubectl -n "${namespace}" get "${resource}" -o jsonpath="${jsonpath}" 2>/dev/null; then
        return
      fi
      sleep 2
    done
  }

  bmh_name=$(wait_for_resource "bmh")
  metal3machine=$(wait_for_resource "m3m")
  machine=$(wait_for_resource "machine")

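  # The provider ID ties the CAPI Machine to the BMH-backed Metal3Machine; updateNode passes it to the apiserver pod to set on the cluster's node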
  providerID="metal3://${namespace}/${bmh_name}/${metal3machine}"
  echo "Done generating cluster ${cluster} with clusterctl"
  retry_curl "localhost:${api_server_port}/updateNode?resource=${namespace}/${cluster}&nodeName=${machine}&providerID=${providerID}"
}

START_NUM=${1:-1}

for i in $(seq "${START_NUM}" "${N_NODES}"); do
  namespace="test${i}"
  if kubectl get ns "${namespace}" &>/dev/null; then
    echo "ERROR: Namespace ${namespace} exists. Skip creating cluster"
    continue
  fi
  create_cluster "${i}"
done

# Wait for all BMHs to be available. Clusters should be more or less ready by then.
desired_states=("available" "provisioning" "provisioned")
for i in $(seq "${START_NUM}" "${N_NODES}"); do
  namespace="test${i}"
  bmh_name="$(kubectl -n "${namespace}" get bmh -o jsonpath='{.items[0].metadata.name}')"
  echo "Waiting for BMH ${bmh_name} to reach one of the desired states."
  while true; do
    bmh_state="$(kubectl -n "${namespace}" get bmh -o jsonpath='{.items[0].status.provisioning.state}')"
    if [[ -n "${bmh_state}" && " ${desired_states[*]} " == *" ${bmh_state} "* ]]; then
      break
    fi
    sleep 3
  done
done

# Run describe for all clusters
for i in $(seq "${START_NUM}" "${N_NODES}"); do
  clusterctl -n "test${i}" describe cluster "test${i}"
done