How to deploy Openshift Origin OKD on Fedora 29 Server


Test Environment

Three VMs running Fedora 29 Server, each with a DNS-resolvable hostname

What is Openshift Origin

OpenShift Origin is a layered system that uses Docker as the underlying container management platform and Kubernetes for container orchestration. An OpenShift Origin installation requires good planning, and we need to make a few clear decisions before we go about setting up the cluster.

Here I am going to make the following decisions for setting up my environment.

  • Installing OpenShift Origin on premises
  • A single host running etcd and the master, along with two nodes for hosting pods
  • Using Fedora 29 Linux (i.e. the RPM-based installation method) for setting up the cluster

For setting up the above environment we will need three Linux servers (Fedora 29 servers here) with FQDNs that are resolvable by a DNS server. As per the official prerequisites, the minimum recommended configuration for each of these servers is at least 8 GB of RAM and 20 to 40 GB of hard disk.

I am currently setting this up on my local machine with only 8 GB of RAM in total, which does not even meet the minimum requirement. It took me at least 5 to 6 hours to complete the cluster setup, which might finish within an hour if each of these machines has the minimum required resources.
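If you want to check what resources each VM actually has before starting, the standard commands below will show the available CPUs, memory and disk; run them on each of the three servers.

[root@ocpmaster ~]# nproc          # number of CPU cores
[root@ocpmaster ~]# free -h        # total and available memory
[root@ocpmaster ~]# df -h /        # free space on the root filesystem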

So, let's get started with setting up our OpenShift Origin cluster, which is not as straightforward as it looks in the documentation.

Procedure

Step 1: Install Fedora 29 Server on three virtual machines

I have installed and configured my three virtual machines with static IP addresses and the FQDNs below.

  • ocpmaster.stack.com
  • ocpnode.stack.com
  • ocpinfra.stack.com
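Before moving on, it is worth confirming on each server that the FQDN is set as the hostname and that all three names resolve. The commands below are one way to do this; run hostnamectl once per host with that host's own name, and note that getent will use whatever DNS or /etc/hosts entries you have configured.

[root@ocpmaster ~]# hostnamectl set-hostname ocpmaster.stack.com
[root@ocpmaster ~]# for h in ocpmaster.stack.com ocpnode.stack.com ocpinfra.stack.com; do getent hosts $h; done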

As per the documentation, we can set up these servers on any of the following Linux OS flavours.

  • Base OS: Fedora 21, CentOS 7.4, Red Hat Enterprise Linux (RHEL) 7.4 or later with the “Minimal” installation option and the latest packages from the Extras channel, or RHEL Atomic Host 7.4.5 or later.

Note – I initially tried setting this up on Fedora 28, but the installation failed because some of the OpenShift Origin client and node RPMs could not be found in the enabled repositories.

Step 2: Set up passwordless (i.e. SSH key based) access between the servers

The OpenShift Origin installer requires a user that has passwordless SSH access to all the hosts in our environment.

[root@ocpmaster ~]# ssh-keygen
[root@ocpmaster ~]# for host in ocpmaster.stack.com ocpnode.stack.com ocpinfra.stack.com; do ssh-copy-id -i ~/.ssh/id_rsa.pub $host; done
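Once the keys are copied, a quick loop like the one below confirms that each host can be reached without a password prompt; BatchMode makes ssh fail instead of asking for a password, and the remote hostname command is just a simple test.

[root@ocpmaster ~]# for host in ocpmaster.stack.com ocpnode.stack.com ocpinfra.stack.com; do ssh -o BatchMode=yes $host hostname; done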

Step 3: Install base packages

We will need to install the following base RPM packages on the master host as an initial setup.

[root@ocpmaster tmp]# yum -y install wget git net-tools bind-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct
[root@ocpmaster tmp]# yum -y update

Once the above base packages are installed and everything is upgraded to the latest available version, let's reboot the 'ocpmaster' server.
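A plain reboot is enough here; once the server comes back up, reconnect and continue with the next step.

[root@ocpmaster ~]# systemctl reboot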

Step 4: Install packages required for the RPM-based installation method

Here I am using the RPM-based installation method to set up the cluster. The OpenShift Origin cluster setup is done by running a set of Ansible playbooks, so the installation requires a correct version of Ansible to be installed for it to succeed.

Also, as Ansible is based on Python, we will also need the correct version of Python to already be installed on our ocpmaster server to avoid any failures.

[root@ocpmaster ~]# yum install ansible python3-pyOpenSSL.noarch

Make sure the following versions of Ansible, Python and their related packages are installed.

  • Ansible – version 2.6 or later
  • Python – version 3 or later
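The installed versions can be verified as below; the exact version strings will depend on the packages currently available in the Fedora 29 repositories.

[root@ocpmaster ~]# ansible --version | head -1
[root@ocpmaster ~]# python3 --version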

Also, install the below RPM package on all three servers to avoid failures later.

[root@ocpmaster ~]# yum install python3-libsemanage.x86_64
[root@ocpnode ~]# yum install python3-libsemanage.x86_64
[root@ocpinfra ~]# yum install python3-libsemanage.x86_64

Step 5: Clone the openshift-ansible GitHub repository, which provides the required playbooks and configuration files

[root@fedmaster ~]# cd ~
[root@fedmaster ~]# pwd
/root
[root@fedmaster ~]# git clone https://github.com/openshift/openshift-ansible
Cloning into 'openshift-ansible'...
remote: Enumerating objects: 40, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (37/37), done.
remote: Total 140971 (delta 13), reused 18 (delta 2), pack-reused 140931
Receiving objects: 100% (140971/140971), 38.29 MiB | 327.00 KiB/s, done.
Resolving deltas: 100% (88488/88488), done.

Check out the release-3.11 branch from the repository.

[root@fedmaster openshift-ansible]# git checkout release-3.11
Branch 'release-3.11' set up to track remote branch 'release-3.11' from 'origin'.
Switched to a new branch 'release-3.11'
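You can confirm that the working tree is now on the release-3.11 branch (it should be marked with an asterisk) before proceeding.

[root@fedmaster openshift-ansible]# git branch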

Step 6: Install Docker

As per the documentation, we need to install Docker on all of the master and node servers on which we are setting up the cluster in our environment.

[root@ocpmaster openshift-ansible]# yum install docker

I have installed Docker only on the master server, as the prerequisites playbook that we will run later takes care of this installation on the remaining hosts if Docker is not already installed.

Enable and start the Docker service once it is installed.

[root@ocpmaster openshift-ansible]# systemctl enable docker
Created symlink /etc/systemd/system/multi-user.target.wants/docker.service → /usr/lib/systemd/system/docker.service.
[root@ocpmaster openshift-ansible]# systemctl start docker
[root@ocpmaster openshift-ansible]# systemctl is-active docker
active
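Optionally, you can confirm the Docker version and the storage driver in use; the exact values will depend on the docker package shipped with Fedora 29.

[root@ocpmaster openshift-ansible]# docker version
[root@ocpmaster openshift-ansible]# docker info | grep -i 'storage driver'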

Step 7: Configure the Ansible inventory file for the cluster

The Ansible inventory file contains the details of the hosts in our environment along with the OpenShift Origin cluster configuration. I am using the following configuration for setting up my cluster.

[root@ocpmaster tmp]# cat /etc/ansible/hosts
# Create an OSEv3 group that contains the masters, nodes, and etcd groups
[OSEv3:children]
masters
nodes
etcd

# Set variables common for all OSEv3 hosts
[OSEv3:vars]
# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=root
ansible_python_interpreter=/usr/bin/python3.7
openshift_enable_docker_excluder=False
openshift_enable_openshift_excluder=False
os_firewall_use_firewalld=True

# If ansible_ssh_user is not root, ansible_become must be set to true
#ansible_become=true

openshift_deployment_type=origin

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
#openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider'}]

# host group for masters
[masters]
ocpmaster.stack.com

# host group for etcd
[etcd]
ocpmaster.stack.com

# host group for nodes, includes region info
[nodes]
ocpmaster.stack.com openshift_node_group_name='node-config-master'
ocpnode.stack.com openshift_node_group_name='node-config-compute'
ocpinfra.stack.com openshift_node_group_name='node-config-infra'

So, as per the inventory file we are setting up a master, a compute node and an infra node, with the master API and etcd services running on the master host. We have also set the excluder variables to False, which allows the Docker and OpenShift packages to be upgraded when new versions are available; setting them to True would prevent these packages from being upgraded during a full system upgrade. The deployment type is set to 'origin' as we are installing the OpenShift Origin packages.
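Before running any playbooks, an Ansible ad-hoc ping against this inventory is a quick way to confirm that all three hosts are reachable with the configured SSH user and Python interpreter; each host should reply with "pong".

[root@ocpmaster tmp]# ansible all -i /etc/ansible/hosts -m ping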

Step 8: Running the RPM-based installer – prerequisites playbook

The RPM-based installer uses the installed Ansible RPM package to run the playbooks against the above configuration.

First, we need to run the prerequisites playbook, which installs any required packages and configures the container runtime.

[root@ocpmaster openshift-ansible]# ansible-playbook -i /etc/ansible/hosts playbooks/prerequisites.yml

Make sure you do not have any failures in the playbook run.

PLAY RECAP ***************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0
ocpmaster.stack.com        : ok=65   changed=18   unreachable=0    failed=0
ocpnode.stack.com          : ok=44   changed=18   unreachable=0    failed=0
ocpinfra.stack.com         : ok=44   changed=18   unreachable=0    failed=0

Note – I was getting the following errors while running the playbook on Fedora 28, due to which I disabled the installation of the excluder packages.

TASK [openshift_excluder : Install openshift excluder - dnf] *************************************************************************
Sunday 09 June 2019  00:37:26 +0530 (0:00:00.166)       0:01:15.470 ***********
FAILED - RETRYING: Install openshift excluder - dnf (3 retries left).
FAILED - RETRYING: Install openshift excluder - dnf (3 retries left).
FAILED - RETRYING: Install openshift excluder - dnf (2 retries left).
FAILED - RETRYING: Install openshift excluder - dnf (2 retries left).
FAILED - RETRYING: Install openshift excluder - dnf (1 retries left).
FAILED - RETRYING: Install openshift excluder - dnf (1 retries left).
fatal: [ocpnode.stack.com]: FAILED! => {"attempts": 3, "changed": false, "failures": ["No package origin-excluder-3.11* available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
fatal: [ocpmaster.stack.com]: FAILED! => {"attempts": 3, "changed": false, "failures": ["No package origin-excluder-3.11* available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

As already mentioned, you can remove those two excluder variables from the inventory file if you are running this on a Fedora 29 server.

Step 9: Running the RPM-based installer – deploy cluster playbook

Before running the deploy cluster playbook, let me show why I installed the libsemanage Python package earlier: it avoids the below error while running the playbook.

TASK [openshift_node : Setting sebool container_manage_cgroup] ***********************************************************************
Sunday 09 June 2019  01:03:38 +0530 (0:00:01.280)       0:02:59.199 ***********
fatal: [ocpnode.stack.com]: FAILED! => {"changed": false, "msg": "This module requires libsemanage-python support"}
fatal: [ocpmaster.stack.com]: FAILED! => {"changed": false, "msg": "This module requires libsemanage-python support"}

Please also take note of the errors below, as these packages are not available in the Fedora 28 repositories. That is the reason I chose to set up the cluster on Fedora 29 rather than Fedora 28, even though it is mentioned as a supported base OS.

FAILED - RETRYING: Install node, clients, and conntrack packages (2 retries left).
FAILED - RETRYING: Install node, clients, and conntrack packages (1 retries left).
FAILED - RETRYING: Install node, clients, and conntrack packages (1 retries left).
fatal: [ocpnode.stack.com]: FAILED! => {"attempts": 3, "changed": false, "failures": ["No package origin-node-3.11* available.", "No package origin-clients-3.11* available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
fatal: [ocpmaster.stack.com]: FAILED! => {"attempts": 3, "changed": false, "failures": ["No package origin-node-3.11* available.", "No package origin-clients-3.11* available."], "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

failed: [ocpnode.stack.com] (item={'name': 'origin-node-3.10*'}) => {"attempts": 3, "changed": false, "failures": ["No package origin-node-3.10* available."], "item": {"name": "origin-node-3.10*"}, "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}
failed: [ocpmaster.stack.com] (item={'name': 'origin-node-3.10*'}) => {"attempts": 3, "changed": false, "failures": ["No package origin-node-3.10* available."], "item": {"name": "origin-node-3.10*"}, "msg": "Failed to install some of the specified packages", "rc": 1, "results": []}

Now, let's run the deploy cluster playbook.

[root@ocpmaster openshift-ansible]# ansible-playbook -i /etc/ansible/hosts playbooks/deploy_cluster.yml

Let the playbook run to completion, whether it ends in success or failure, as at the end it will show the phase from which you can resume if the installation fails, as shown below.

PLAY RECAP ***************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0    skipped=5    rescued=0    ignored=0
ocpinfra.stack.com         : ok=98   changed=50   unreachable=0    failed=1    skipped=106  rescued=0    ignored=0
ocpmaster.stack.com        : ok=458  changed=209  unreachable=0    failed=1    skipped=574  rescued=0    ignored=0
ocpnode.stack.com          : ok=98   changed=50   unreachable=0    failed=1    skipped=106  rescued=0    ignored=0

INSTALLER STATUS *********************************************************************************************************************
Initialization              : Complete (0:00:27)
Health Check                : Complete (0:01:10)
Node Bootstrap Preparation  : Complete (0:55:13)
etcd Install                : Complete (0:01:18)
Master Install              : Complete (0:19:39)
Master Additional Install   : Complete (0:01:36)
Node Join                   : In Progress (0:44:42)
        This phase can be restarted by running: playbooks/openshift-node/join.yml

In my case the installation failed while trying to join the nodes to the cluster, so I had to rerun that phase as indicated in the installer status output.

[root@ocpmaster openshift-ansible]# ansible-playbook -i /etc/ansible/hosts playbooks/openshift-node/join.yml

With that, I was able to successfully install my OpenShift Origin cluster.

...
TASK [Set Node Join 'Complete'] ******************************************************************************************************
Friday 28 June 2019  13:34:32 +0530 (0:00:00.122)       0:06:52.507 ***********
ok: [ocpmaster.stack.com]

PLAY RECAP ***************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0    skipped=5    rescued=0    ignored=0
ocpinfra.stack.com         : ok=22   changed=2    unreachable=0    failed=0    skipped=80   rescued=0    ignored=0
ocpmaster.stack.com        : ok=54   changed=2    unreachable=0    failed=0    skipped=103  rescued=0    ignored=0
ocpnode.stack.com          : ok=22   changed=2    unreachable=0    failed=0    skipped=80   rescued=0    ignored=0

Step 10: Validate the OpenShift Origin cluster installation

Run the following commands to make sure that your installation is successful.

[root@ocpmaster openshift-ansible]# oc get nodes
NAME                  STATUS    ROLES     AGE       VERSION
ocpinfra.stack.com    Ready     infra     2h        v1.10.0+d4cacc0
ocpmaster.stack.com   Ready     master    2h        v1.10.0+d4cacc0
ocpnode.stack.com     Ready     compute   2h        v1.10.0+d4cacc0

[root@ocpmaster openshift-ansible]# oc get pods --all-namespaces
NAMESPACE        NAME                                     READY     STATUS    RESTARTS   AGE
kube-system      master-api-ocpmaster.stack.com           1/1       Running   13         2h
kube-system      master-controllers-ocpmaster.stack.com   1/1       Running   11         2h
kube-system      master-etcd-ocpmaster.stack.com          1/1       Running   0          2h
openshift-node   sync-d97k7                               1/1       Running   1          2h
openshift-node   sync-h56xc                               1/1       Running   0          2h
openshift-node   sync-hmshx                               1/1       Running   0          2h
openshift-sdn    ovs-4tp5s                                1/1       Running   0          2h
openshift-sdn    ovs-94lfg                                1/1       Running   0          2h
openshift-sdn    ovs-frm7l                                1/1       Running   0          2h
openshift-sdn    sdn-7fvc7                                1/1       Running   5          2h
openshift-sdn    sdn-9kp95                                1/1       Running   0          2h
openshift-sdn    sdn-jgkvm                                1/1       Running   6          2h

With this, you are all set to use your cluster and start setting up projects in this PaaS environment.
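As a quick smoke test, you can create a project from the master with the oc client; the project name 'demo' below is just an example. The root user on the master already has cluster-admin credentials from the installer, so no extra login should be needed. With the default settings the web console should also be reachable at https://ocpmaster.stack.com:8443, keeping in mind that password logins are denied until an identity provider such as htpasswd is configured in the inventory.

[root@ocpmaster openshift-ansible]# oc whoami
[root@ocpmaster openshift-ansible]# oc new-project demo
[root@ocpmaster openshift-ansible]# oc status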

Hope you enjoyed reading this article. Thank you.

2 COMMENTS

Chris

Thank you for the concise installation steps, very helpful. But I am stuck at the cluster deployment with the error "Control plane didn't come up and become ready". I have been unable to solve it so far. I am using Fedora 29 on my VMs.

novicejava1

Hi Chris, thanks for reading. I am assuming your prerequisites.yml step completed successfully and you are trying to run the deploy_cluster.yml step. Please make sure you have enough resources on the master node where the cluster is being set up. If possible, please share the complete log along with the PLAY RECAP output at the end so I can take a look. You can share it via https://gist.github.com/