Disconnected (#17)
Files for restricted network installation and proxy configuration
vchintal authored May 6, 2020
1 parent 21a107d commit 091063a
Showing 16 changed files with 611 additions and 126 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -3,3 +3,4 @@ downloads
install-dir
**/*.yml.git
**/*.yml.orig
**/pull-secret*.json
Binary file added .images/virtual-switch-final.png
Binary file added .images/virtual-switch.png
82 changes: 71 additions & 11 deletions README.md
@@ -1,6 +1,13 @@
# OCP4 on VMware vSphere UPI Automation

The goal of this repo is to make deploying and redeploying a new OpenShift v4 cluster a snap. Using the same repo and with minor tweaks, it can be applied to any version of OpenShift higher than the current version of 4.3.
The goal of this repo is to make deploying and redeploying a new OpenShift v4 cluster a snap. Using the same repo and with minor tweaks, it can be applied to any version of OpenShift at or above the current 4.4.

As it stands right now, the repo works for several installation use cases:
* DHCP with OVA template
* DHCP with PXE boot (needs helper node)
* Static IPs for nodes (when there is no isolated network in which the helper can run a DHCP server)
* w/ Cluster-wide Proxy (HTTP and SSL/TLS with certs supported)
* Restricted network

> This repo is most ideal for Home Lab and Proof-of-Concept scenarios. Having said that, if the prerequisites (below) can be met and if the vCenter service account can be locked down to access only certain resources and perform only certain actions, the same repo can then be used for DEV or higher environments. Refer to this [link](https://vmware.github.io/vsphere-storage-for-kubernetes/documentation/vcp-roles.html) for more details on required permissions for a vCenter service account.
@@ -9,19 +16,16 @@ The goal of this repo is to make deploying and redeploying a new OpenShift v4 cl
1. vSphere ESXi and vCenter 6.7 installed. For vCenter 6.5, please see the cautionary note below.
2. A datacenter created with a vSphere host added to it, a datastore exists and has adequate capacity
3. The playbook(s) assume you are running a [helper node](https://github.com/RedHatOfficial/ocp4-helpernode) in the same network to provide all the necessary services, such as DHCP/DNS/HAProxy-as-LB. The MAC addresses for the machines should also match between the helper repo and this one. If not using the helper node, the minimum expectation is that the webserver and TFTP server (for PXE boot) are running on the same external host, which we will then treat as a helper node.
* The necessary services such as DNS/LB (Load Balancer) must be up and running before this repo can be used
* This repo works in environments where :
* DHCP is enabled: Use vSphere OVA template or use PXE boot
* DHCP is disabled: Use Static IPs with CoreOS ISO file
4. Ansible (preferably latest) with **Python 3** on the machine where this repo is cloned
4. The necessary services such as DNS/LB (Load Balancer) must be up and running before this repo can be used
5. Ansible (preferably latest) with **Python 3** on the machine where this repo is cloned. Before installing Ansible, install `epel-release` by running `yum -y install epel-release`

> For vSphere 6.5, the files relating to interaction with VMware/vCenter such as [this](roles/dhcp_ova/tasks/main.yml) ***may*** need to have `vmware_deploy_ovf` module to include [`cluster`](https://docs.ansible.com/ansible/latest/modules/vmware_deploy_ovf_module.html#parameter-cluster) and [`resource-pool`](https://docs.ansible.com/ansible/latest/modules/vmware_deploy_ovf_module.html#parameter-resource_pool) parameters and their values set to work correctly.
> For vSphere 6.5, the files interacting with VMware/vCenter, such as [this](roles/dhcp_ova/tasks/main.yml), ***may*** need the `vmware_deploy_ovf` module to include the [`cluster`](https://docs.ansible.com/ansible/latest/modules/vmware_deploy_ovf_module.html#parameter-cluster) and [`resource_pool`](https://docs.ansible.com/ansible/latest/modules/vmware_deploy_ovf_module.html#parameter-resource_pool) parameters, with their values set, to work correctly.

## Automatic generation of ignition and other supporting files

### Prerequisites
> Pre-populated entries in **group_vars/all.yml** are ready to be used unless you need to customize further
1. Get the ***pull secret*** from [here](https://cloud.redhat.com/OpenShift/install/vsphere/user-provisioned). Update [group_vars/all.yml](group_vars/all.yml) file on the line with `pull_secret` by providing the entire pull secret as a single line replacing the provided/incomplete pull secret
> Pre-populated entries in **group_vars/all.yml** are ready to be used unless you need to customize further. Any updates described below refer to [group_vars/all.yml](group_vars/all.yml) unless otherwise specified.
1. Get the ***pull secret*** from [here](https://cloud.redhat.com/OpenShift/install/vsphere/user-provisioned). Update the line containing `pull_secret`, replacing the provided/incomplete pull secret with your entire pull secret on a single line
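A minimal sketch of one way to flatten the downloaded pull secret to the single line expected by `pull_secret` (the filenames here are illustrative, and the inline JSON is a stand-in for your real secret; any JSON-aware tool works):
```sh
# Illustrative stand-in for the real pull secret downloaded from cloud.redhat.com
cat > pull-secret.json <<'EOF'
{
  "auths": {
    "cloud.openshift.com": { "auth": "b3BlbnNo...", "email": "user@example.com" }
  }
}
EOF

# Collapse the pretty-printed JSON to a single line (requires python3)
python3 -c 'import json,sys; print(json.dumps(json.load(sys.stdin)))' \
  < pull-secret.json > pull-secret-oneline.json
cat pull-secret-oneline.json
```
The contents of `pull-secret-oneline.json` can then be pasted as-is onto the `pull_secret` line.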
2. Get the vCenter details:
1. IP address
2. Service account username (can be the same as admin)
@@ -35,9 +39,36 @@ The goal of this repo is to make deploying and redeploying a new OpenShift v4 cl
1. base domain *(pre-populated with **example.com**)*
2. cluster name *(pre-populated with **ocp4**)*
5. HTTP URL of the ***bootstrap.ign*** file *(pre-populated with an example config pointing to the helper node)*
6. Update the inventory file: **staging** and under the `webservers.hosts` entry, use one of two options below :
6. Update the inventory file **staging**: under the `webservers.hosts` entry, use one of the two options below:
1. **localhost** : if the `ansible-playbook` is being run on the same host as the webserver that would eventually host bootstrap.ign file
2. the IP address or FQDN of the machine that would run the webserver.
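For reference, a minimal sketch of what **staging** could look like for the localhost case (an illustration; your real inventory may carry more groups, e.g. `registries` for the restricted install):
```
all:
  hosts:
    localhost:
      ansible_connection: local
  children:
    webservers:
      hosts:
        localhost:
```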
7. Furnish any proxy details in a section like the one below. If `proxy.enabled` is set to `False`, anything related to the proxy won't be picked up.
```
proxy:
  enabled: true
  http_proxy: http://helper.ocp4.example.com:3129
  https_proxy: http://helper.ocp4.example.com:3129
  no_proxy: example.com
  cert_content: |
    -----BEGIN CERTIFICATE-----
    <certificate content>
    -----END CERTIFICATE-----
```
8. When doing the restricted network install and following the instructions from [restricted.md](restricted.md), furnish details related to the registry in a section like the one below. If `registry.enabled` is set to `False`, anything related to the `registry` won't be picked up.
```
registry:
  enabled: true
  product_repo: openshift-release-dev
  product_release_name: ocp-release
  product_release_version: 4.4.0-x86_64
  username: ansible
  password: ansible
  email: user@awesome.org
  cert_content:
  host: helper.ocp4.example.com
  port: 5000
  repo: ocp4/openshift4
```
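Taken together, the registry values above assemble into the local release-image reference used when mirroring; a sketch (the variable names simply mirror the keys above, and the `oc adm release mirror` invocation is shown only as an illustrative comment, not run here):
```sh
# Assumed values, mirroring the registry keys above
host=helper.ocp4.example.com
port=5000
repo=ocp4/openshift4
product_release_version=4.4.0-x86_64

# The local (mirrored) release image reference the installer will pull from
local_release="${host}:${port}/${repo}:${product_release_version}"
echo "${local_release}"
# → helper.ocp4.example.com:5000/ocp4/openshift4:4.4.0-x86_64

# The mirror itself would then be created with something like:
# oc adm release mirror -a pull-secret.json \
#   --from=quay.io/openshift-release-dev/ocp-release:4.4.0-x86_64 \
#   --to="${host}:${port}/${repo}" \
#   --to-release-image="${local_release}"
```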

> The URL in step **#5** needn't be live at the time of running the setup/installation step; just provide an accurate guess of where, and at what context path, **bootstrap.ign** will eventually be served
@@ -100,6 +131,11 @@ ansible-playbook -i staging dhcp_pxe.yml
```sh
ansible-playbook -i staging static_ips.yml
```
#### Option 4: DHCP + use of OVA template in a Restricted Network
```sh
# Refer to restricted.md file for more details
ansible-playbook -i staging restricted_ova.yml
```

#### Miscellaneous
* If the vCenter folder already exists with the template (because you set up vCenter the last time you ran the Ansible playbook) but you want a fresh deployment of VMs **after** you have erased all the existing VMs in the folder, append the following to the command you chose in the step above
@@ -132,7 +168,8 @@ ansible-playbook -i staging static_ips.yml
If everything goes well you should be able to log into all of the machines using the following command:

```sh
ssh -i ~/.ssh/ocp4 core@<IP_ADDRESS_OF_BOOTSTRAP_NODE>
# Assuming you are able to resolve bootstrap.ocp4.example.com on this machine
ssh -i ~/.ssh/ocp4 core@bootstrap.ocp4.example.com
```

Once logged in, on **bootstrap** node run the following command to understand if/how the masters are (being) setup:
@@ -152,3 +189,26 @@ export PATH=$(pwd)/bin:$PATH
oc whoami
oc get co
```
### Debugging

To check if the proxy information has been picked up:
```sh
# On Master
cat /etc/systemd/system/machine-config-daemon-host.service.d/10-default-env.conf

# On Bootstrap
cat /etc/systemd/system.conf.d/10-default-env.conf
```
To check if the registry information has been picked up:
```sh
# On Master or Bootstrap
cat /etc/containers/registries.conf
```
To check if your certs have been picked up:
```sh
# On Master
cat /etc/pki/ca-trust/source/anchors/openshift-config-user-ca-bundle.crt

# On Bootstrap
cat /etc/pki/ca-trust/source/anchors/ca.crt
```
5 changes: 0 additions & 5 deletions ansible.cfg
@@ -2,8 +2,3 @@
fact_caching = jsonfile
fact_caching_connection = /tmp
host_key_checking = False
remote_user = root
ask_pass = True

[privilege_escalation]
become_ask_pass = True
28 changes: 24 additions & 4 deletions group_vars/all.yml
@@ -1,4 +1,3 @@
---
helper_vm_ip: 192.168.86.180
bootstrap_ignition_url: "http://{{helper_vm_ip}}:8080/ignition/bootstrap.ign"
config:
@@ -18,8 +17,8 @@ vcenter:
vm_power_state: poweredon
templateName: rhcos-vmware
download:
clients_url: https://mirror.openshift.com/pub/openshift-v4/clients/ocp/latest
dependencies_url: https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/latest/latest
clients_url: https://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.4.3/
dependencies_url: https://mirror.openshift.com/pub/openshift-v4/x86_64/dependencies/rhcos/4.4/latest/
govc: https://github.com/vmware/govmomi/releases/download/v0.22.1/govc_linux_amd64.gz
bootstrap_vms:
- { name : "bootstrap", mac: "00:50:56:a8:aa:a1", ip: "192.168.86.181"}
@@ -34,4 +33,25 @@ worker_vms:
static_ip:
  gateway: 192.168.86.1
  netmask: 255.255.255.0
  network_interface_name: ens192
  network_interface_name: ens192
proxy:
  enabled: true
  http_proxy: http://helper.ocp4.example.com:3129
  https_proxy: http://helper.ocp4.example.com:3129
  no_proxy: example.com
  cert_content: |
    -----BEGIN CERTIFICATE-----
    <certificate content>
    -----END CERTIFICATE-----
registry:
  enabled: true
  product_repo: openshift-release-dev
  product_release_name: ocp-release
  product_release_version: 4.4.0-x86_64
  username: ansible
  password: ansible
  email: user@awesome.org
  cert_content:
  host: registry.ocp4.example.com
  port: 5000
  repo: ocp4/openshift4
166 changes: 166 additions & 0 deletions restricted.md
@@ -0,0 +1,166 @@
# Installation in a Restricted Network

Installation in a restricted (network) environment is different. In such a setting, the base cluster (bootstrap, masters[0,3], workers[0,3]) won't have open access to the internet. The only access this core infrastructure is allowed is to a registry on a node/VM that mirrors the contents of the installation repos hosted on quay.io.

This documentation will guide you in using this repo to set up this registry for installation in such a restricted network.

## Prerequisites
0. Familiarity with this repo and a thorough reading of [README](README.md)
1. Prepare a RHEL 8/Fedora VM or reuse `helper` node as registry host
* Run `yum install -y podman httpd httpd-tools` while the VM is connected to the internet
2. The `helper` is the `bastion` host, and as such the installation must be run on the `helper`

## (Optional) Network isolation for OpenShift VMs + registry in vCenter
> This section is meant for a lab environment, to practice a disconnected install. The subnets and IP addresses used below are shown only as an illustration.
### [Step 1] Create a Standard Network Port Group
1. Right click on vSphere host 🠪 Configure 🠪 Networking 🠪 Virtual Switches
2. Click on `ADD NETWORKING` button on the page (top right hand corner)
3. Select `Virtual Machine Port Group for a Standard Switch` and click `NEXT`
4. Select `New standard switch` (with defaults) and click `NEXT`
5. Click `NEXT` for Step 3
6. Click `OK` for the warning that there are no active physical network adapters
7. Give a name for the port-group and choose a number between 0-4095 for VLAN ID and click `NEXT`
8. Click `FINISH` on the final screen

When all done, your settings should resemble the image below, with the new `default` virtual switch.

[![](.images/virtual-switch.png)](.images/virtual-switch.png)

### [Step 2] Convert helper into a bastion host
1. Right click on the `helper` VM and click on `Edit Settings`
2. Click on the `ADD NEW DEVICE` (top right hand corner) when in the tab `Virtual Hardware`
3. Choose `Network Adapter` and, once it's added, click on `Browse` under the drop-down for the network, choose the newly added port-group, and then click on `OK`
4. SSH into the helper and use `ifconfig` to determine the name of the new NIC. In my homelab, it's `ens224`.
* Assuming you assigned a static IP address to the first NIC `ens192`, copy `ifcfg-ens192` in `/etc/sysconfig/network-scripts` and save it as `ifcfg-ens224` in the same folder.
* Edit the file `ifcfg-ens224` and ensure that the IP assigned is on a different subnet
> In my homelab, `ens192` was in the `192.168.86.0/24` subnet with GATEWAY pointing to 192.168.86.1, and `ens224` was in the `192.168.87.0/24` subnet with GATEWAY pointing at 192.168.87.1
5. Restart the network with `systemctl restart NetworkManager`; a quick `ifconfig` or `nmcli device show ens224` should show the IP address picked up by the new NIC.
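For reference, a sketch of the resulting `ifcfg-ens224` (the IP and prefix are assumptions that follow the illustrative subnets above; only the keys most likely to need editing are shown):
```sh
# /etc/sysconfig/network-scripts/ifcfg-ens224 (copy of ifcfg-ens192, then edited)
NAME="ens224"
DEVICE="ens224"
BOOTPROTO="static"
IPADDR="192.168.87.180"
PREFIX="24"
GATEWAY="192.168.87.1"
ONBOOT="yes"
```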

### [Step 3] Create a new VM for registry or reuse helper

#### If creating a new VM for registry (not re-using helper):
1. Ensure that the VM is set up, *connected to the internet*, and that the package-install command from the prerequisites above has been run
2. Assign it a hostname similar to `registry.ocp4.example.com`
3. Create an `ifcfg-ens192` file under `/etc/sysconfig/network-scripts`; for reference, my file looks like this:
```sh
TYPE="Ethernet"
PROXY_METHOD="none"
BROWSER_ONLY="no"
BOOTPROTO="dhcp"
DEFROUTE="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens192"
DEVICE="ens192"
ONBOOT="yes"
IPV6_PRIVACY="no"
```

### [Step 4] Re-run helper playbook

In the helper `vars.yml` file, ensure that all IP addresses (helper + bootstrap + masters + workers) now belong to the new subnet `192.168.87.0/24`; that includes changing `helper.ipaddr` and `helper.networkifacename` to the new network adapter's settings.

#### If creating a new VM for registry (not re-using helper)
Make accommodations for the registry node `registry.ocp4.example.com` by changing the helper's DNS and DHCP config files as shown:
1. Add a section for registry in helper's `vars.yml` file, as shown below. The `macaddr` should reflect the MAC address assigned to `ens192` adapter:
```
registry:
  name: "registry"
  ipaddr: "192.168.87.188"
  macaddr: "00:50:56:a8:4b:4f"
```
2. Add the following line to `templates/dhcpd.conf.j2` under the Static entries (for example, below the line for bootstrap)
```
host {{ registry.name }} { hardware ethernet {{ registry.macaddr }}; fixed-address {{ registry.ipaddr }}; }
```
3. Add the following lines to `templates/zonefile.j2` (for example, below the line for bootstrap)
```
; Create entry for the registry host
{{ registry.name }} IN A {{ registry.ipaddr }}
;
```
4. Add the following lines to `templates/reverse.j2` (for example, below the line for bootstrap)
```
{{ registry.ipaddr.split('.')[3] }} IN PTR {{ registry.name }}.{{ dns.clusterid }}.{{ dns.domain }}.
;
```
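With the example registry values above (and assuming `dns.clusterid` is `ocp4` and `dns.domain` is `example.com` in the helper's `vars.yml`), those three template additions would render roughly as:
```
host registry { hardware ethernet 00:50:56:a8:4b:4f; fixed-address 192.168.87.188; }
registry IN A 192.168.87.188
188 IN PTR registry.ocp4.example.com.
```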

Now that the helper is all set with its configuration, let's re-run the playbook; when it succeeds, reboot `registry.ocp4.example.com` so that it picks up its IP address via DHCP.

## Run Ansible Automation

### Configurations

Modify the `staging` file to look like below:
```
all:
  hosts:
    localhost:
      ansible_connection: local
  children:
    webservers:
      hosts:
        localhost:
    registries:
      hosts:
        registry.ocp4.example.com:
          ansible_ssh_user: root
          ansible_ssh_pass: <password for ease of installation>
```
> If reusing the helper, the hostname under `registries` would be `localhost` and the credentials underneath removed, as this repo is intended to be run on the helper node

In `ansible.cfg`, have the following as the content, as we will be running this as the `root` user on the helper node.
```
[defaults]
fact_caching = jsonfile
fact_caching_connection = /tmp
host_key_checking = False
remote_user = root
```
In [group_vars/all.yml](group_vars/all.yml)'s registry dict, the following must be changed (the rest is optional):
* All IPs should now reflect the new subnet including
* helper_vm_ip (the new IP obtained under the new subnet)
* All IPs for bootstrap, masters, workers
* static_ip.gateway
* `registry.host` should point to the IP or FQDN of the host mentioned in the previous step. If reusing the helper, use `helper.ocp4.example.com`; otherwise use (for example) `registry.ocp4.example.com`
* `registry.product_release_version` must be updated to the latest version of the container image. _(Use [documentation links](#documentation-links))_
* `vcenter.network` with the name of the new virtual-switch port-group, as we want all the new VMs to land on the newly created virtual switch

### Installation in a restricted network

Now that the helper, the registry, and the automation configs are all set, let's run the installation with the command:

```sh
# If vCenter folders exist
ansible-playbook --flush-cache -i staging restricted_ova.yml -e vcenter_preqs_met=true

# If vCenter folders DON'T exist
ansible-playbook --flush-cache -i staging restricted_ova.yml
```

The final network topology should look somewhat like the image below:
[![](.images/virtual-switch-final.png)](.images/virtual-switch-final.png)

## Final Check

To check whether the registry information has been picked up, run the command below on either kind of node, or check the decoded contents of the `pull-secret` secret in `openshift-config` once the cluster is operational
```sh
# On Master or Bootstrap
cat /etc/containers/registries.conf
```
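On a successful restricted install, `registries.conf` should carry a mirror stanza pointing the release-image repo at your local registry; roughly like the sketch below (exact contents depend on your `imageContentSources`, and the hostnames here follow the examples above):
```
[[registry]]
  location = "quay.io/openshift-release-dev/ocp-release"
  mirror-by-digest-only = true

  [[registry.mirror]]
    location = "helper.ocp4.example.com:5000/ocp4/openshift4"
```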

### Things to watch out for
1. The OLM is broken on the restricted install; see link #4 below
2. You have to figure out how to get traffic into the cluster; relying on the helper's DNS won't help, as it is on a different subnet with no internet access. I use `dnsmasq` to route any traffic for the `example.com` domain to the public/accessible IP of the helper node
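As an illustration of that `dnsmasq` approach (the file path and IP are assumptions for this lab; `192.168.86.180` is the helper's address on the routable subnet):
```
# /etc/dnsmasq.d/ocp4.conf: answer all example.com lookups with the helper's reachable IP
address=/example.com/192.168.86.180
```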


## Documentation Links
1. [Create a mirror registry for installation in a restricted network](https://docs.openshift.com/container-platform/4.4/installing/install_config/installing-restricted-networks-preparations.html)
2. [Installing a cluster on vSphere in a restricted network](https://docs.openshift.com/container-platform/4.4/installing/installing_vsphere/installing-restricted-networks-vsphere.html)
3. https://www.openshift.com/blog/openshift-4-2-disconnected-install
4. [Using Operator Lifecycle Manager on restricted networks](https://docs.openshift.com/container-platform/4.4/operators/olm-restricted-networks.html)