[Kubernetes] Add ability to setup high availability Kubernetes master on testbed server #2240
Merged
15 commits
df63fe5 First take at integration (isabelmsft)
b71a481 Synced vm_set/main.yml, edited k8s testbed readme, fixed formatting (isabelmsft)
ec6cb70 Adjust readme (isabelmsft)
7cb9ae0 Add check to ensure k8s_root_path exists (isabelmsft)
b50701a Fix k8s environment variable check and fix typo in readme (isabelmsft)
2d948ee Update README.testbed.k8s.Setup.md (isabelmsft)
50a042c Update README.testbed.k8s.Setup.md (isabelmsft)
03ca4f2 Update README.testbed.k8s.Setup.md (isabelmsft)
3910a6a Move cloud-image utils installation to account for PR build fail and … (isabelmsft)
21d2e17 Fix edge condition last host unreachable, fix PR dpkg lock build fail… (isabelmsft)
f935997 Fixed Azure Storage, specified Kubernetes version, fixed Ansible WARN… (isabelmsft)
643c5fc Address comments from PR (isabelmsft)
88284de Update README.testbed.k8s.Setup.md (isabelmsft)
ad3cc34 Fix indentation, change to inline comment (isabelmsft)
5d3a372 Merge branch 'k8s-master-integrate' of https://github.com/isabelmsft/… (isabelmsft)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,157 @@ | ||
| # SONiC Kubernetes Design | ||
|
|
||
| This document describes the design to test Kubernetes features in SONiC. | ||
|
|
||
| ## Background | ||
|
|
||
| Each SONiC DUT is a worker node managed by a High Availability (HA) Kubernetes master. The HA Kubernetes master is composed of three master node machines and one load balancer machine. | ||
|
|
||
| By connecting each SONiC DUT to the HA Kubernetes master, containers running in SONiC can be managed by the Kubernetes master. SONiC containers managed by the Kubernetes master are said to run in "Kubernetes mode," as opposed to the original "Local mode." | ||
|
|
||
| In Kubernetes mode, SONiC container properties are based on specifications defined in the associated Kubernetes manifest. A Kubernetes manifest is a file on the Kubernetes master that defines the Kubernetes object and container configurations. In our case, we use Kubernetes DaemonSet objects. A DaemonSet ensures that each eligible worker node runs exactly one pod, which here holds one container of the image specified in the DaemonSet manifest file. | ||
|
|
||
| For example, to run the SNMP and Telemetry containers in Kubernetes mode, we need two manifests that define two Kubernetes DaemonSet objects, one for each container running in "Kubernetes mode." | ||
|
|
||
| The following is a snippet of the Telemetry DaemonSet manifest file that specifies the Kubernetes object type and container image: | ||
|
|
||
| ``` | ||
| apiVersion: apps/v1 | ||
| kind: DaemonSet | ||
| metadata: | ||
|   name: telemetry-ds | ||
| spec: | ||
|   template: | ||
|     metadata: | ||
|       labels: | ||
|         name: telemetry | ||
|     spec: | ||
|       hostname: sonic | ||
|       hostNetwork: true | ||
|       containers: | ||
|         - name: telemetry | ||
|           image: sonicanalytics.azurecr.io/sonic-dockers/any/docker-sonic-telemetry:20200531 | ||
|           tty: true | ||
| . | ||
| . | ||
| . | ||
| ``` | ||
|
|
||
|
|
||
| ## Topology Overview | ||
|
|
||
| In order to connect each SONiC DUT to a High Availability Kubernetes master, we need to set up the following topology: | ||
|  | ||
| - Each high availability master setup requires 4 new Linux KVMs running on a Testbed Server via bridged networking. | ||
|     - 3 Linux KVMs to serve as the 3-node high availability Kubernetes master | ||
|     - 1 Linux KVM to serve as the HAProxy Load Balancer node | ||
| - Each KVM has one management interface assigned an IP address reachable from the SONiC DUT. | ||
| - The HAProxy Load Balancer proxies requests to the 3 backend Kubernetes master nodes. | ||
|
|
||
| Our setup meets the Kubernetes minimum requirements to set up a highly available cluster. The minimum requirements are as follows: | ||
| - 2 GB or more of RAM per machine | ||
| - 2 CPUs or more per machine | ||
| - Full network connectivity between all machines in the cluster (public or private network) | ||
| - sudo privileges on all machines | ||
| - SSH access from one device to all nodes in the system | ||
|
|
||
| ## How to Set Up High Availability Kubernetes Master | ||
|
|
||
| 1. Prepare Testbed Server and build and run `docker-sonic-mgmt` container as described [here](https://github.com/Azure/sonic-mgmt/blob/master/ansible/doc/README.testbed.Setup.md) | ||
| 2. Allocate 4 available IPs reachable from SONiC DUT. | ||
| 3. Update [`ansible/k8s-ubuntu`](../k8s-ubuntu) to include your 4 newly allocated IP addresses for the HA Kubernetes master and IP address of testbed server. | ||
|
|
||
| - We will walk through an example of setting up HA Kubernetes master set 1 on server 19 (STR-ACS-SERV-19). The following snippets are the relevant portions from [`ansible/k8s-ubuntu`](../k8s-ubuntu). | ||
|
|
||
| ``` | ||
| k8s_vm_host19: | ||
|   hosts: | ||
|     STR-ACS-SERV-19: | ||
|       ansible_host: 10.251.0.101 | ||
| ``` | ||
| - Replace `ansible_host` value above with the IP address of the testbed server. | ||
|
|
||
| ``` | ||
| k8s_vms1_19: | ||
|   hosts: | ||
|     kvm19-1m1: | ||
|       ansible_host: 10.250.0.2 | ||
|       master: true | ||
|       master_leader: true | ||
|     kvm19-1m2: | ||
|       ansible_host: 10.250.0.3 | ||
|       master: true | ||
|       master_member: true | ||
|     kvm19-1m3: | ||
|       ansible_host: 10.250.0.4 | ||
|       master_member: true | ||
|       master: true | ||
|     kvm19-1ha: | ||
|       ansible_host: 10.250.0.5 | ||
|       haproxy: true | ||
| ``` | ||
|
|
||
| - Replace each `ansible_host` value with an IP address allocated in step 2. | ||
|
|
||
| - Take note of the group name `k8s_vms1_19`. At the bottom of [`ansible/k8s-ubuntu`](../k8s-ubuntu), make sure that `k8s_server_19` has its `host_var_file` and two `children` properly set: | ||
|
|
||
| ``` | ||
| k8s_server_19: | ||
|   vars: | ||
|     host_var_file: host_vars/STR-ACS-SERV-19.yml | ||
|   children: | ||
|     k8s_vm_host19: | ||
|     k8s_vms1_19: | ||
| ``` | ||
|
|
||
| 4. Update the server network configuration for the Kubernetes VM management interfaces in [`ansible/host_vars/STR-ACS-SERV-19.yml`](../host_vars/STR-ACS-SERV-19.yml). | ||
|     - `mgmt_gw_k8s`: IP address of the gateway for the VM management interfaces | ||
|     - `mgmt_prefixlen_k8s`: prefix length for the management interfaces | ||
| 5. If necessary, set proxy in [`ansible/group_vars/all/env.yml`](../group_vars/all/env.yml) | ||
| 6. Update the testbed server credentials in [`ansible/group_vars/k8s_vm_host/creds.yml`](../group_vars/k8s_vm_host/creds.yml). | ||
| 7. If using Azure Storage to source Ubuntu 18.04 KVM image, set `k8s_vmimage_saskey` in [`ansible/vars/azure_storage.yml`](../vars/azure_storage.yml). | ||
|     - To source the image from a public URL: download it from [here](https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img). Then convert the img to qcow2 by running `qemu-img convert -f qcow2 -O qcow2 bionic-server-cloudimg-amd64.img bionic-server-cloudimg-amd64.qcow2` (without `-O`, `qemu-img convert` writes a raw image). Store the qcow2 image at the path `/home/azure/ubuntu-vm/images/bionic-server-cloudimg-amd64.qcow2` on your testbed server. | ||
| 8. From `docker-sonic-mgmt` container, `cd` into `sonic-mgmt/ansible` directory and run `./testbed-cli.sh -m k8s-ubuntu [additional OPTIONS] create-master <k8s-server-name> ~/.password` | ||
|     - `k8s-server-name` corresponds to the group name used to describe the testbed server in the [`ansible/k8s-ubuntu`](../k8s-ubuntu) inventory file, of the form `k8s_server_{unit}`. | ||
|     - Please note: `~/.password` is the ansible vault password file name/path. Ansible allows users to use ansible-vault to encrypt password files. By default, this shell script requires a password file. If you are not using ansible-vault, just create an empty file and pass the file name to the command line. The file name and location are created and maintained by the user. | ||
|     - For HA Kubernetes master set 1 running on server 19 shown above, the proper command would be: | ||
|       `./testbed-cli.sh -m k8s-ubuntu create-master k8s_server_19 ~/.password` | ||
|     - OPTIONAL: We offer the functionality to run multiple master sets on one server. | ||
|         - Each master set is one HA Kubernetes master composed of 4 Linux KVMs. | ||
|         - Should an additional HA master set be necessary on an occupied server, add the option `-s <msetnumber>`, where `msetnumber` would be 2 if this is the 2nd master set running on `<k8s-server-name>`. Make sure that [`ansible/k8s-ubuntu`](../k8s-ubuntu) is updated accordingly. `msetnumber` is 1 by default. | ||
|
|
||
|
|
||
| 9. Join Kubernetes-enabled SONiC DUT to cluster (kube_join function to be written). | ||
|
|
||
|
|
||
| #### To remove an HA Kubernetes master: | ||
| - Run `./testbed-cli.sh -m k8s-ubuntu [additional OPTIONS] destroy-master <k8s-server-name> ~/.password` | ||
| - For HA Kubernetes master set 1 running on server 19 shown above, the proper command would be: | ||
| `./testbed-cli.sh -m k8s-ubuntu destroy-master k8s_server_19 ~/.password` | ||
|
|
||
| ## Testing Scope | ||
|
|
||
| This setup allows us to test the following: | ||
| - Successful deployment of SONiC containers via manifests defined in master | ||
| - Expected container behavior after the container is intentionally or unintentionally stopped | ||
| - Switching between Local and Kubernetes management mode for a given container | ||
| - Addition and removal of SONiC DUT labels | ||
| - Changing the image version in the middle of a DaemonSet deployment | ||
|
|
||
| During each of the following states: | ||
| - When all master servers are up and running | ||
| - When one master server is down | ||
| - When two master servers are down | ||
| - When all master servers are down | ||
|
|
||
| Here, "down" means shut off, disconnected, or in the middle of a reboot. | ||
|
|
||
|
|
||
| In this setup, we do not consider load balancer performance. For Kubernetes feature testing purposes, HAProxy is configured to perform vanilla round-robin load balancing on available master servers. | ||
|
|
||
|
|
||
| ## How to Create Tests | ||
| Each manifest is a YAML file | ||
|
|
||
| CLI to make changes to manifest files | ||
|
|
||
| pytests to apply manifest changes and check status |
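The "How to Create Tests" outline above mentions changing manifest files and then applying them. As a rough illustration only, the sketch below shows one way a test helper could swap the image of a container in a DaemonSet manifest held as a Python dict. The function name `set_container_image` and the dict `TELEMETRY_DS` are hypothetical and not part of this PR; the dict simply mirrors the Telemetry snippet from the README.

```python
import copy


def set_container_image(manifest, container_name, new_image):
    """Return a copy of a DaemonSet manifest with one container's image replaced.

    Hypothetical helper for illustration; not part of sonic-mgmt.
    """
    updated = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    containers = updated["spec"]["template"]["spec"]["containers"]
    for container in containers:
        if container["name"] == container_name:
            container["image"] = new_image
            return updated
    raise KeyError("no container named %r in manifest" % container_name)


# Dict form of the Telemetry DaemonSet snippet shown in the README
# (only the fields visible in that snippet are included).
TELEMETRY_DS = {
    "apiVersion": "apps/v1",
    "kind": "DaemonSet",
    "metadata": {"name": "telemetry-ds"},
    "spec": {
        "template": {
            "metadata": {"labels": {"name": "telemetry"}},
            "spec": {
                "hostname": "sonic",
                "hostNetwork": True,
                "containers": [
                    {
                        "name": "telemetry",
                        "image": "sonicanalytics.azurecr.io/sonic-dockers/any/docker-sonic-telemetry:20200531",
                        "tty": True,
                    }
                ],
            },
        }
    },
}
```

A test could call `set_container_image(TELEMETRY_DS, "telemetry", <new image>)`, serialize the result back to YAML, and apply it on the master. Returning a copy rather than editing in place keeps the original manifest available for restoring the previous version afterwards.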
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| --- | ||
| ansible_ssh_user: ubuntu | ||
| ansible_ssh_pass: admin | ||
| ansible_become: True | ||
| ansible_become_method: sudo | ||
| ansible_become_password: admin |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| --- | ||
| ansible_user: use_own_value | ||
| ansible_password: use_own_value | ||
| ansible_become_password: use_own_value |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,4 @@ | ||
| k8s_root_path: /home/azure/ubuntu-vm | ||
| k8s_vm_images_url: https://acsbe.blob.core.windows.net/vmimages | ||
| # This image is from the public URL https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img. To convert img to qcow2, run: `qemu-img convert -f qcow2 -O qcow2 bionic-server-cloudimg-amd64.img bionic-server-cloudimg-amd64.qcow2` | ||
| k8s_hdd_image_filename: bionic-server-cloudimg-amd64.qcow2 | ||
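Since the playbook expects `k8s_hdd_image_filename` to be a qcow2 image, it can help to confirm the converted file really is one before copying it to the testbed server. The following is a minimal sketch, assuming only the QCOW header layout (a 4-byte magic `QFI\xfb` followed by a big-endian 32-bit version); the function name `qcow_version` is hypothetical and not part of this PR.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a QCOW image header


def qcow_version(path):
    """Return the QCOW version stored in the image header, or None if the
    file does not start with the QCOW magic (for example, a raw image).

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as image:
        header = image.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return None
    # Bytes 4..7 hold the format version as a big-endian unsigned integer.
    (version,) = struct.unpack(">I", header[4:8])
    return version
```

A qcow2 file reports version 2 or 3 here, while a raw image returns `None`. Running `qemu-img info` on the file gives the same answer with more detail.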
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| mgmt_bridge_k8s: br1 | ||
| mgmt_prefixlen_k8s: use_own_value | ||
| mgmt_gw_k8s: use_own_value |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| mgmt_bridge_k8s: br1 | ||
| mgmt_prefixlen_k8s: use_own_value | ||
| mgmt_gw_k8s: use_own_value |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,115 @@ | ||
| all: | ||
|   children: | ||
|     k8s_vm_host: | ||
|       children: | ||
|         k8s_vm_host19: | ||
|         k8s_vm_host20: | ||
|     k8s_ubu: | ||
|       children: | ||
|         k8s_vms1_19: | ||
|         k8s_vms2_19: | ||
|         k8s_vms1_20: | ||
|         k8s_vms2_20: | ||
|     k8s_servers: | ||
|       children: | ||
|         k8s_server_19: | ||
|         k8s_server_20: | ||
|
|
||
|
|
||
| k8s_vm_host19: | ||
|   hosts: | ||
|     STR-ACS-SERV-19: | ||
|       ansible_host: 10.251.0.101 | ||
|
|
||
| k8s_vm_host20: | ||
|   hosts: | ||
|     STR-ACS-SERV-20: | ||
|       ansible_host: 10.251.0.102 | ||
|
|
||
| k8s_vms1_19: | ||
|   hosts: | ||
|     kvm19-1m1: | ||
|       ansible_host: 10.251.0.103 | ||
|       master: true | ||
|       master_leader: true | ||
|     kvm19-1m2: | ||
|       ansible_host: 10.251.0.104 | ||
|       master: true | ||
|       master_member: true | ||
|     kvm19-1m3: | ||
|       ansible_host: 10.251.0.105 | ||
|       master_member: true | ||
|       master: true | ||
|     kvm19-1ha: | ||
|       ansible_host: 10.251.0.106 | ||
|       haproxy: true | ||
|
|
||
| k8s_vms2_19: | ||
|   hosts: | ||
|     kvm19-2m1: | ||
|       ansible_host: 10.251.0.107 | ||
|       master: true | ||
|       master_leader: true | ||
|     kvm19-2m2: | ||
|       ansible_host: 10.251.0.108 | ||
|       master: true | ||
|       master_member: true | ||
|     kvm19-2m3: | ||
|       ansible_host: 10.251.0.109 | ||
|       master_member: true | ||
|       master: true | ||
|     kvm19-2ha: | ||
|       ansible_host: 10.251.0.110 | ||
|       haproxy: true | ||
|
|
||
| k8s_vms1_20: | ||
|   hosts: | ||
|     kvm20-1m1: | ||
|       ansible_host: 10.251.0.111 | ||
|       master: true | ||
|       master_leader: true | ||
|     kvm20-1m2: | ||
|       ansible_host: 10.251.0.112 | ||
|       master: true | ||
|       master_member: true | ||
|     kvm20-1m3: | ||
|       ansible_host: 10.251.0.113 | ||
|       master_member: true | ||
|       master: true | ||
|     kvm20-1ha: | ||
|       ansible_host: 10.251.0.114 | ||
|       haproxy: true | ||
|
|
||
| k8s_vms2_20: | ||
|   hosts: | ||
|     kvm20-2m1: | ||
|       ansible_host: 10.251.0.115 | ||
|       master: true | ||
|       master_leader: true | ||
|     kvm20-2m2: | ||
|       ansible_host: 10.251.0.116 | ||
|       master: true | ||
|       master_member: true | ||
|     kvm20-2m3: | ||
|       ansible_host: 10.251.0.117 | ||
|       master_member: true | ||
|       master: true | ||
|     kvm20-2ha: | ||
|       ansible_host: 10.251.0.118 | ||
|       haproxy: true | ||
|
|
||
| # The groups below are helpers to limit running playbooks to specific server(s) only | ||
| k8s_server_19: | ||
|   vars: | ||
|     host_var_file: host_vars/STR-ACS-SERV-19.yml | ||
|   children: | ||
|     k8s_vm_host19: | ||
|     k8s_vms1_19: | ||
|
|
||
| k8s_server_20: | ||
|   vars: | ||
|     host_var_file: host_vars/STR-ACS-SERV-20.yml | ||
|   children: | ||
|     k8s_vm_host20: | ||
|     k8s_vms1_20: | ||
|
|
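The four master-set groups in this inventory follow a regular pattern: group `k8s_vms{mset}_{server}`, masters named `kvm{server}-{mset}m1` to `m3`, and one `kvm{server}-{mset}ha` node. As a sketch of that pattern only (the naming rule is inferred from the entries above, and `master_set_group` is a hypothetical helper, not something this PR provides), a group could be generated like this:

```python
def master_set_group(server, mset, addresses):
    """Build the inventory group dict for one HA master set.

    `addresses` lists four IPs: the three masters first, then the HAProxy
    node. Naming is inferred from the k8s-ubuntu inventory; hypothetical
    helper for illustration only.
    """
    if len(addresses) != 4:
        raise ValueError("a master set needs exactly 4 addresses")
    hosts = {}
    for index, address in enumerate(addresses[:3], start=1):
        host = {"ansible_host": address, "master": True}
        # As in the inventory, the first master is flagged master_leader
        # and the remaining two are flagged master_member.
        role = "master_leader" if index == 1 else "master_member"
        host[role] = True
        hosts["kvm%s-%sm%d" % (server, mset, index)] = host
    hosts["kvm%s-%sha" % (server, mset)] = {
        "ansible_host": addresses[3],
        "haproxy": True,
    }
    return {"k8s_vms%s_%s" % (mset, server): {"hosts": hosts}}
```

For example, `master_set_group(19, 1, [...])` with the four addresses of set 1 on server 19 yields a dict with the same structure as the `k8s_vms1_19` group. Dumping that dict as YAML would give an entry that can be pasted into the inventory when adding a new master set.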
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| - name: update apt cache | ||
|   apt: update_cache=yes cache_valid_time=3600 | ||
|   environment: "{{ proxy_env | default({}) }}" | ||
|
|
||
| - name: Install haproxy | ||
|   apt: name=haproxy state=present | ||
|   environment: "{{ proxy_env | default({}) }}" | ||
|
|
||
| - name: Enable init script | ||
|   replace: dest='/etc/default/haproxy' | ||
|            regexp='ENABLED=0' | ||
|            replace='ENABLED=1' | ||
|
|
||
| - name: Setup haproxy config file | ||
|   template: | ||
|     src: haproxy.j2 | ||
|     dest: /etc/haproxy/haproxy.cfg | ||
|     backup: yes | ||
|
|
||
| - name: Restart HAProxy | ||
|   become: yes | ||
|   service: name=haproxy state=restarted |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,73 @@ | ||
| #--------------------------------------------------------------------- | ||
| global | ||
|     # to have these messages end up in /var/log/haproxy.log you will | ||
|     # need to: | ||
|     # | ||
|     # 1) configure syslog to accept network log events. This is done | ||
|     #    by adding the '-r' option to the SYSLOGD_OPTIONS in | ||
|     #    /etc/sysconfig/syslog | ||
|     # | ||
|     # 2) configure local2 events to go to the /var/log/haproxy.log | ||
|     #    file. A line like the following can be added to | ||
|     #    /etc/sysconfig/syslog | ||
|     # | ||
|     #    local2.* /var/log/haproxy.log | ||
|     # | ||
|     log 127.0.0.1 local2 | ||
|
|
||
|     chroot /var/lib/haproxy | ||
|     pidfile /var/run/haproxy.pid | ||
|     maxconn 4000 | ||
|     user haproxy | ||
|     group haproxy | ||
|     daemon | ||
|
|
||
|     # turn on stats unix socket | ||
|     stats socket /var/lib/haproxy/stats | ||
|
|
||
| #--------------------------------------------------------------------- | ||
| # common defaults that all the 'listen' and 'backend' sections will | ||
| # use if not designated in their block | ||
| #--------------------------------------------------------------------- | ||
| defaults | ||
|     mode http | ||
|     log global | ||
|     option httplog | ||
|     option dontlognull | ||
|     option http-server-close | ||
|     option forwardfor except 127.0.0.0/8 | ||
|     option redispatch | ||
|     retries 3 | ||
|     timeout http-request 10s | ||
|     timeout queue 1m | ||
|     timeout connect 10s | ||
|     timeout client 1m | ||
|     timeout server 1m | ||
|     timeout http-keep-alive 10s | ||
|     timeout check 10s | ||
|     maxconn 3000 | ||
|
|
||
| #--------------------------------------------------------------------- | ||
| # main frontend which proxies to the backends | ||
| #--------------------------------------------------------------------- | ||
| frontend k8s-api | ||
|     bind 0.0.0.0:80 | ||
|     mode tcp | ||
|     option tcplog | ||
|     default_backend k8s-api | ||
|
|
||
| #--------------------------------------------------------------------- | ||
| # round robin balancing between the various backends | ||
| #--------------------------------------------------------------------- | ||
| backend k8s-api | ||
|     mode tcp | ||
|     option tcplog | ||
|     option tcp-check | ||
|     balance roundrobin | ||
|     default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100 | ||
|
|
||
| {% for host in groups['k8s_vms' + msetnumber + '_' + servernumber] %} | ||
| {% if hostvars[host].master is defined %} | ||
|     server {{ hostvars[host].inventory_hostname }} {{ hostvars[host].ansible_host }}:6443 check | ||
| {% endif %} | ||
| {% endfor %} |
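To make the effect of the template loop above concrete, here is a small Python sketch of the same selection logic: one `server` line for every host in the master-set group that defines `master`, pointing at port 6443. This is an illustration, not how Ansible renders the template; the function name `backend_server_lines` and the plain-dict input are assumptions made for the example.

```python
def backend_server_lines(hosts, port=6443):
    """Mirror the haproxy.j2 loop: emit one `server` line per host whose
    variables define `master`; hosts without it (the HAProxy node) are skipped.

    `hosts` maps inventory hostname -> host variables. Hypothetical helper
    for illustration only.
    """
    lines = []
    for name, hostvars in hosts.items():
        # Jinja's `is defined` only checks presence, not truthiness.
        if "master" in hostvars:
            lines.append(
                "server %s %s:%d check" % (name, hostvars["ansible_host"], port)
            )
    return lines
```

Given the `k8s_vms1_19` hosts from the inventory, this yields three lines, one for each of `kvm19-1m1`, `kvm19-1m2`, and `kvm19-1m3`, and nothing for `kvm19-1ha`, which is what the rendered `backend k8s-api` section should contain.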