
Commit

Update new structure
mquhuy committed Jul 26, 2023
1 parent 583fb22 commit 6c473c4
Showing 34 changed files with 872 additions and 66 deletions.
8 changes: 4 additions & 4 deletions Support/Multitenancy/ironic-env/v1/Init-environment.sh
@@ -6,11 +6,11 @@ __dir__=$(realpath "$(dirname "$0")")
. ./config.sh
# This is temporarily required since https://review.opendev.org/c/openstack/sushy-tools/+/875366 has not been merged.
./build-sushy-tools-image.sh
-sudo ./01-vm-setup.sh
-./02-configure-minikube.sh
+sudo ./vm-setup.sh
+./configure-minikube.sh
sudo ./handle-images.sh
./generate_unique_nodes.sh
./start_containers.sh
-./04-start-minikube.sh
-./05-apply-manifests.sh
+./start-minikube.sh
+./install-ironic.sh
python create_and_inspect_nodes.py
114 changes: 82 additions & 32 deletions Support/Multitenancy/ironic-env/v1/README.md
@@ -1,60 +1,110 @@
# Multiple ironics setup

+For a shorter summary of what this study is about, you can check out the `SUMMARY.md` file. This wall of text explains everything in a more detailed manner.

## Purposes
+- Following the metal3-dev-env workflow, currently only one `ironic-conductor` can be installed in one cluster. Although ironic is very good at handling parallel traffic, having only one conductor means we can only provision and manage so many nodes before ironic stops accepting new connections (or new nodes). Being able to scale `ironic-conductor` is therefore crucial for improving metal3 and BMO performance.
+- Assuming we can make the multiple-conductor scenario work, testing it would still be an issue. A massive number (say, 1000) of machines, even virtual ones, is generally not available to everyone. For that reason, we also introduce the new [ipa simulating tool](https://review.opendev.org/c/openstack/sushy-tools/+/875366), a.k.a. `fake-ipa`, which allows simulating inspection and provisioning for multiple baremetal nodes without the need for real hardware. At a glance, the tool handles all the traffic that would normally be handled by the `ironic python agent`s (notice the plural form), without needing to run inside real (or fake) machines. So `ironic` thinks it is talking to multiple real ipas, instead of a single fake one.

-- This setup is a part of the study to deploy multiple instances of `ironic-conductor` to increase provisioning capacity.
-- It takes into use the new [ipa simulating tool](https://review.opendev.org/c/openstack/sushy-tools/+/875366), which allows simulating inspection and provision for multiple baremetal nodes, without the need of real hardwares.
-- One purpose of this study is to investigate if the current `ironic` pod could be divided into smaller parts, and if `ironic` is able to be scaled.
+## Steps
+### Build the new container images
+At the time of writing, the `fake-ipa` commit has not been merged into the `sushy-tools` repo yet, so to kick-start this setup, we need to build it first. That includes cloning the repo, cherry-picking the commit, adding a Dockerfile, and building a new container image. All of this is handled automatically by the script `build-sushy-tools-image.sh`, which you can find in the same directory as this file.

-## Requirements
+```bash
+./build-sushy-tools-image.sh
+```
+The resulting image will be tagged as `127.0.0.1:5000/localimages/sushy-tools`, and it will be used to run both the `sushy-tools` and `fake-ipa` containers (with different entry points).
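In outline, the script does something like the following (a simplified sketch based on the parts of the script visible in this commit; the `podman build` step and the Dockerfile path are assumptions):
```bash
# Simplified outline of build-sushy-tools-image.sh:
SUSHYTOOLS_DIR="$HOME/sushy-tools"
rm -rf "$SUSHYTOOLS_DIR"
git clone https://opendev.org/openstack/sushy-tools.git "$SUSHYTOOLS_DIR"
cd "$SUSHYTOOLS_DIR" || exit
# Cherry-pick the not-yet-merged fake-ipa change from Gerrit:
git fetch https://review.opendev.org/openstack/sushy-tools refs/changes/66/875366/25 && git cherry-pick FETCH_HEAD
# Build and tag the image for the local registry (assumed step):
podman build -t 127.0.0.1:5000/localimages/sushy-tools .
```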

-- Machine: `4c / 16gb / 100gb`
-- OS: `CentOS9-20220330`
+### Setup the host machine
+Next, we prepare the host machine for the simulation: installing the needed packages, setting up some new networks, firewall rules, etc. All of this is handled by the script `vm-setup.sh`. If you are familiar with `metal3-dev-env`, you will notice that most of these steps are copied from there, except for the addition of a new tool called `helm`. We will discuss helm more in one of the following steps, but for now, let's run the script first (you might want to run it with `sudo`):

-## Configuration
+```bash
+sudo ./vm-setup.sh
+```
+While we're at it, let's run the `./configure-minikube.sh` script as well. It basically makes `minikube` accept insecure HTTP connections, as we will later configure it to work with an insecure local registry.

-- Configs can be set in `config.sh`:
+```bash
+./configure-minikube.sh
+```
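For illustration, one common way to get this effect is minikube's `--insecure-registry` flag (the address below is a placeholder; the script here may configure it differently):
```bash
# Illustrative only -- the actual mechanism lives in configure-minikube.sh:
minikube start --insecure-registry="<registry-ip>:5000"
```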
+`handle-images.sh` downloads the needed images and pushes them to the local registry, and also runs `ironic_tls_setup.sh`, which configures the certificates needed for TLS in `ironic`. TLS is needed here because we will run `ironic` with `mariadb`, which is in turn a requirement for the multiple-ironic scenario.

-- `N_NODES`: Number of nodes to create and inspect
-- `N_SUSHY`: Number of `sushy-tools` containers to deploy
-- `IRONIC_ENDPOINTS`: The endpoints of ironics to use, separated by spaces.
-The number of endpoints put in here equals the number of ironics that will be used.
+```bash
+sudo ./handle-images.sh
+```
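Per image, the script roughly pulls, retags, and pushes into the local registry; the pull and tag steps below are an assumption based on the push command visible further down in this commit:
```bash
# Hypothetical per-image flow inside handle-images.sh:
IMAGE="quay.io/metal3-io/ironic-client"
podman pull "$IMAGE"
podman tag "$IMAGE" "127.0.0.1:5000/localimages/${IMAGE##*/}"
podman push --tls-verify=false "127.0.0.1:5000/localimages/${IMAGE##*/}"
```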

-Example config:
+### "Generate nodes"
+`Fake-ipa` needs the information about the nodes it will represent added to its config file before it starts, and naturally we will, later on, need to provide `ironic` with information about the same "nodes", so that the two sides accept one another. Hence, it's a good idea to generate those nodes beforehand and store the information in a file. We do that by running the script `generate_unique_nodes.sh`. This script reads the `N_NODES` value, which represents the number of nodes, from the environment, and defaults to `100` if no such value is set.

```bash
-N_NODES=1000
-N_SUSHY=10
-IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4 172.22.0.5"
+./generate_unique_nodes.sh
```
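For example, to generate a different number of nodes, set `N_NODES` for the invocation:
```bash
# Generate 500 nodes instead of the default 100:
N_NODES=500 ./generate_unique_nodes.sh
```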

-This config means that there will be, in total, 1000 (fake) nodes created, of which each roughly 100 nodes will point to one of the 10 `sushy-tools` containers.
+The "nodes" are stored in a file called `nodes.json` in the current working directory. This file will be used in some of the next steps.

-## Results
+### Start up sushy-tools and fake-ipa containers
+Now that the node information is available, we can start up the `fake-ipa` container, and together with it, `sushy-tools` as well. What we need to do is write a `conf.py` file with the configuration needed by each of these containers, and mount the directory into the container so that the binaries inside can read it.

-- The `ironic` pod used in `metal3-dev-env`, which consists of several containers, was splited into smaller pods that run separatedly as followed:
+```bash
+./start_containers.sh
+```
+You may notice from the script that we use the envvar `N_SUSHY` to determine the number of `sushy-tools` containers. The reason there can be more than one `sushy-tools` container is a performance limit in `sushy-tools`, which allows a single instance to handle only around 100 nodes. To support more nodes, we currently overcome this limit by increasing the number of `sushy-tools` containers and configuring the nodes' endpoints accordingly.
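As a sketch of what the script does for each instance (the container name, mount path, and flags below are illustrative, not the script's exact invocation):
```bash
# Start the i-th sushy-tools container with its own conf.py mounted in:
i=1
podman run -d --net host --name "sushy-tools-${i}" \
  -v "$(pwd)/sushy-tools-conf/sushy-${i}:/root/sushy" \
  127.0.0.1:5000/localimages/sushy-tools
# The host setup opens one firewall port per instance (8000 + i), so each
# container can serve its share of nodes on a distinct endpoint.
```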

-- First pod: consists of `ironic` and `ironic-httpd` containers.
-- Second pod: consists of `dnsmasq` and `ironic-inspector` containers.
-- Third pod: consists of `mariadb` container.
+### Start minikube and install ironic

-The `ironic` entity can be scaled up by deploying more instances of the first pod (a.k.a. `ironic` and `ironic-httpd`)
+With the node information in place, we can go on to start minikube, and then install `ironic` onto the minikube cluster.

-- Ironic cannot recover from `mariadb` failure:
+```bash
+./start-minikube.sh
+```
+In this script, we set up minikube to work with `ironicendpoint`, just like in `metal3-dev-env`, and open some ports in the firewall to make sure traffic can flow between the needed entities.
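For example, the ironic API and inspector callback ports are among those opened (the full list lives in the scripts):
```bash
sudo firewall-cmd --zone=public --add-port=6385/tcp   # ironic API
sudo firewall-cmd --zone=public --add-port=5050/tcp   # ironic-inspector callback
```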

```bash
-baremetal node list
+./install-ironic.sh
```
+Besides `ironic`, we also install `cert-manager` (for TLS) and create an ironic client called `baremetal` to manage `ironic` from the terminal.
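Once the installation settles, the client can be used to verify that ironic answers:
```bash
# List all registered nodes via the ironic client:
baremetal node list
```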

+Notice that to install `ironic`, we use the `helm` tool mentioned earlier. You can read more about it in its [official documentation](https://helm.sh/docs/). The helm chart we use to represent `ironic` is in the directory `./ironic`. While we won't explain this chart in great detail, here are the main points you may want to know:

-(pymysql.err.ProgrammingError) (1146, "Table 'ironic.nodes' doesn't exist")
+- The `ironic` pod used in `metal3-dev-env`, which consists of several containers, was split into smaller pods that run separately, as follows:
+  - `ironic` pod: consists of the `ironic` and `ironic-httpd` containers.
+  - `ironic-inspector` pod: consists of the `dnsmasq` and `ironic-inspector` containers.
+  - `mariadb` pod: consists of the `mariadb` container.

-[SQL: SELECT nodes.created_at, nodes.updated_at, nodes.version, nodes.id, nodes.uuid, nodes.instance_uuid, nodes.name, nodes.chassis_id, nodes.power_state, nodes.provision_state, nodes.driver, nodes.conductor_group, nodes.maintenance, nodes.owner, nodes.lessee, nodes.allocation_id
+Each of the pods is deployed as a helm `Deployment`, which means we can scale them as we wish. However, only the `ironic` component supports scaling; `ironic-inspector` and the database have to remain unique.

-FROM nodes ORDER BY nodes.id ASC
+This chart takes in the `sshKey` value to authenticate the `baremetal` client's connection to ironic, while the `ironicReplicas` value, a list of endpoints separated by spaces, determines how many `ironic` pods this deployment will have and at which endpoints to contact them. One nice feature of ironic is that we don't need to contact all of these `ironic` instances: since they share the same database, accessing any one of them is enough to query and control all the nodes.
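A hypothetical invocation overriding both values (the release name, chart path, and flag syntax here are assumptions; the real invocation lives in `install-ironic.sh`):
```bash
# Install the ironic chart with an SSH key and four ironic endpoints:
helm install ironic ./ironic \
  --set sshKey="$(cat ~/.ssh/id_rsa.pub)" \
  --set ironicReplicas="172.22.0.2 172.22.0.3 172.22.0.4 172.22.0.5"
```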

-LIMIT %(param_1)s]
+### Create and inspect nodes

-[parameters: {'param_1': 1000}]
+To register and inspect all the nodes we created in the previous step, we use a small python script called `create_and_inspect_nodes.py`. Python was chosen over bash simply to take advantage of python's excellent parallelism support for managing several nodes at the same time (otherwise the process would take a lot longer than a few hours with 1000 nodes).

-(Background on this error at: https://sqlalche.me/e/14/f405) (HTTP 500)
+```bash
+python create_and_inspect_nodes.py
+```

+## All at once

+All of the aforementioned scripts can be run at once by running `./Init-environment.sh`.
+- Configs can be set in `config.sh`:
+  - `N_NODES`: Number of nodes to create and inspect
+  - `N_SUSHY`: Number of `sushy-tools` containers to deploy
+  - `IRONIC_ENDPOINTS`: The endpoints of the ironics to use, separated by spaces.

+As said, the number of endpoints put in `IRONIC_ENDPOINTS` equals the number of ironics that will be used.

+### Example config
+```bash
+N_NODES=1000
+N_SUSHY=10
+IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4 172.22.0.5"
+```
+This config means that there will be, in total, 1000 (fake) nodes created, with roughly 100 nodes pointing to each of the 10 `sushy-tools` containers.

+__NOTE__: To clean up everything, you can run the `./cleanup.sh` script.

+## Requirements
+This study was conducted on a VM with the following specs:
+- CPUs: 20 cores
+- RAM: 64 GB
+- Hard disk: 750 GB
+- OS: `CentOS9-20220330`
40 changes: 40 additions & 0 deletions Support/Multitenancy/ironic-env/v1/SUMMARY.md
@@ -0,0 +1,40 @@
# Multiple ironics setup

## Purposes

- This setup is a part of the study to deploy multiple instances of `ironic-conductor` to increase provisioning capacity.
- It takes into use the new [ipa simulating tool](https://review.opendev.org/c/openstack/sushy-tools/+/875366), which allows simulating inspection and provisioning for multiple baremetal nodes, without the need for real hardware.
- One purpose of this study is to investigate whether the current `ironic` pod can be divided into smaller parts, and whether `ironic` can be scaled.

## Requirements

This study was conducted on a VM with the following specs:
- CPUs: 20 cores
- RAM: 64 GB
- Hard disk: 750 GB
- OS: `CentOS9-20220330`

## Configuration

- Configs can be set in `config.sh`:

  - `N_NODES`: Number of nodes to create and inspect
  - `N_SUSHY`: Number of `sushy-tools` containers to deploy
  - `IRONIC_ENDPOINTS`: The endpoints of the ironics to use, separated by spaces.
    The number of endpoints put in here equals the number of ironics that will be used.

Example config:

```bash
N_NODES=1000
N_SUSHY=10
IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4 172.22.0.5"
```
This config means that there will be, in total, 1000 (fake) nodes created, with roughly 100 nodes pointing to each of the 10 `sushy-tools` containers.

## Results

- The `ironic` pod used in `metal3-dev-env`, which consists of several containers, was split into smaller pods that run separately, as follows:
  - First pod: consists of the `ironic` and `ironic-httpd` containers.
  - Second pod: consists of the `dnsmasq` and `ironic-inspector` containers.
  - Third pod: consists of the `mariadb` container.
Support/Multitenancy/ironic-env/v1/build-sushy-tools-image.sh
@@ -4,7 +4,7 @@ SUSHYTOOLS_DIR="$HOME/sushy-tools"
rm -rf "$SUSHYTOOLS_DIR"
git clone https://opendev.org/openstack/sushy-tools.git "$SUSHYTOOLS_DIR"
cd "$SUSHYTOOLS_DIR" || exit
-git fetch https://review.opendev.org/openstack/sushy-tools refs/changes/66/875366/21 && git cherry-pick FETCH_HEAD
+git fetch https://review.opendev.org/openstack/sushy-tools refs/changes/66/875366/25 && git cherry-pick FETCH_HEAD

pip3 install build
python3 -m build
2 changes: 1 addition & 1 deletion Support/Multitenancy/ironic-env/v1/config.sh
@@ -3,4 +3,4 @@
export N_NODES=1000
export N_SUSHY=30
# Put the endpoints of different ironics, separated by spaces
export IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4"
export IRONIC_ENDPOINTS="172.22.0.2 172.22.0.3 172.22.0.4 172.22.0.5"
Support/Multitenancy/ironic-env/v1/cleanup.sh
@@ -13,4 +13,3 @@ while /bin/true; do
sudo ip link delete virbr0 | true
done
minikube stop

Support/Multitenancy/ironic-env/v1/generate_unique_nodes.sh
@@ -20,7 +20,7 @@ function generate_unique {

echo '[]' > nodes.json

for i in $(seq 1 "$N_NODES"); do
for i in $(seq 1 "${N_NODES:-100}"); do
uuid=$(generate_unique uuidgen uuids)
macaddr=$(generate_unique macgen macaddrs)
name="fake${i}"
7 changes: 3 additions & 4 deletions Support/Multitenancy/ironic-env/v1/handle-images.sh
@@ -4,11 +4,10 @@ N_NODES=${N_NODES:-1000}
REGISTRY_NAME="registry"
REGISTRY_PORT="5000"
IMAGE_NAMES=(
# "quay.io/metal3-io/ironic-python-agent"
# For now, sushy-tools needs to be compiled locally with https://review.opendev.org/c/openstack/sushy-tools/+/875366
# "quay.io/metal3-io/sushy-tools"
"quay.io/metal3-io/ironic-ipa-downloader"
# "quay.io/metal3-io/ironic:latest"
"quay.io/metal3-io/ironic:latest"
"quay.io/metal3-io/ironic-client"
"quay.io/metal3-io/keepalived:v0.2.0"
"quay.io/metal3-io/mariadb:latest"
@@ -35,8 +34,8 @@ for NAME in "${IMAGE_NAMES[@]}"; do
podman push --tls-verify=false 127.0.0.1:5000/localimages/"${NAME##*/}"
done

-# This image was built earlier, but can only be pushed now, after the network was setup
podman push --tls-verify=false 127.0.0.1:5000/localimages/sushy-tools
+podman push --tls-verify=false 127.0.0.1:5000/localimages/ironic:latest

__dir__=$(realpath "$(dirname "$0")")
"$__dir__/ironic_tls_setup.sh"
@@ -45,7 +44,7 @@
IRONIC_IMAGE="127.0.0.1:5000/localimages/ironic:latest"

# Run httpd container
-sudo podman run -d --net host --name httpd-infra \
+podman run -d --net host --name httpd-infra \
--pod infra-pod \
-v /opt/metal3-dev-env/ironic:/shared \
-e PROVISIONING_INTERFACE=provisioning \
Support/Multitenancy/ironic-env/v1/vm-setup.sh
@@ -18,7 +18,7 @@ done
for i in 8000 80 9999 6385 5050 6180 53 5000; do sudo firewall-cmd --zone=public --add-port=${i}/tcp; done
for i in 69 547 546 68 67 5353 6230 6231 6232 6233 6234 6235; do sudo firewall-cmd --zone=libvirt --add-port=${i}/udp; done

for i in $(seq 1 "${N_SUSHY:-5}"); do
for i in $(seq 1 "${N_SUSHY:-1}"); do
port=$(( 8000 + i ))
sudo firewall-cmd --zone=public --add-port=$port/tcp
sudo firewall-cmd --zone=libvirt --add-port=$port/tcp
5 changes: 2 additions & 3 deletions Support/Multitenancy/ironic-env/v1/start_containers.sh
@@ -1,5 +1,5 @@
#!/bin/bash
-N_SUSHY=${N_SUSHY:-5}
+N_SUSHY=${N_SUSHY:-1}
__dir__=$(realpath "$(dirname "$0")")
SUSHY_CONF_DIR="${__dir__}/sushy-tools-conf"
SUSHY_TOOLS_IMAGE="127.0.0.1:5000/localimages/sushy-tools"
@@ -10,8 +10,7 @@ ADVERTISE_PORT="9999"
API_URL="https://172.22.0.2:6385"
CALLBACK_URL="https://172.22.0.2:5050/v1/continue"

rm -rf "$SUSHY_CONF_DIR"
mkdir -p "$SUSHY_CONF_DIR"
rm -rf "$SUSHY_CONF_DIR" mkdir -p "$SUSHY_CONF_DIR"

mkdir -p "$SUSHY_CONF_DIR/ssh"

8 changes: 4 additions & 4 deletions Support/Multitenancy/ironic-env/v2/Init-environment.sh
@@ -5,13 +5,13 @@ __dir__=$(realpath "$(dirname "$0")")
. ./config.sh
# This is temporarily required since https://review.opendev.org/c/openstack/sushy-tools/+/875366 has not been merged.
./build-sushy-tools-image.sh
-sudo ./01-vm-setup.sh
-./02-configure-minikube.sh
+sudo ./vm-setup.sh
+./configure-minikube.sh
sudo ./handle-images.sh
./generate_unique_nodes.sh
./start_containers.sh
-./04-start-minikube.sh
-./05-apply-manifests.sh
+./start-minikube.sh
+./install-ironic.sh
kubectl -n baremetal-operator-system wait --for=condition=available deployment/baremetal-operator-controller-manager --timeout=300s
kubectl create ns metal3
python create_nodes.py