129 changes: 129 additions & 0 deletions docs/testbed/README.testbed.Testbed_v2.hld.md
@@ -0,0 +1,129 @@
# Testbed V2 Design
# High Level Design Document


# Table of Contents
* [About this Manual](#about-this-manual)
* [Requirement Overview](#requirement-overview)
* [Components](#components)
* [Database Schema](#database-schema)
* [Implementation Plan](#implementation-plan)


## About this Manual
This document provides general information about the Testbed V2 feature implementation for `sonic-mgmt`. Testbed V2 aims to support multi-DUT deployments with dynamic VLAN assignment and inter-DUT link state propagation. It builds on the Redis Pub/Sub paradigm: when test users write connection status changes (physical connections, virtual connections, VLAN assignments, etc.) into `connection_db`, the resulting keyspace events signal the daemons running on the fanout switches, which then manipulate the physical devices and ports accordingly.

## Requirement Overview
* Maintain compatibility with the existing testbed.
* Support testbeds/topologies with multiple DUTs that have inter-DUT connections.
* Support flexible VLAN assignment.

> Contributor: do we need to do health monitoring?
>
> Collaborator Author: That will be a nice feature to have. So who will receive the health monitoring statistics?

## Components
* `test user`
* `connection_db`
  * Redis databases running on a designated test server, hosting connection-related metadata.
* `servercfgd`
  * Runs on the same server as `connection_db`.
  * An RPC server responsible for the initial setup and provisioning of `connection_db`.
* `labcfgd`
  * Runs on the root/fanout switches.
  * Subscribes to keyspace events of `connection_db` and acts accordingly.
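
As a rough illustration of how `labcfgd` might interpret a keyspace event, the sketch below parses a notification channel name into its database index and key. It relies on the standard Redis keyspace notification channel format `__keyspace@<db>__:<key>`, where the message payload is the name of the operation that touched the key (notifications have to be enabled through the `notify-keyspace-events` server setting). The `KeyspaceEvent` type, the `parse_keyspace_event` helper, and the example port names are assumptions made for illustration and are not part of this design.

```python
import re
from typing import NamedTuple, Optional

# Redis publishes keyspace notifications on channels named
# __keyspace@<db>__:<key>; the message payload is the operation name.
_KEYSPACE_RE = re.compile(r"^__keyspace@(\d+)__:(.+)$")


class KeyspaceEvent(NamedTuple):
    """Hypothetical container for one decoded keyspace notification."""
    db: int          # Redis database index
    table: str       # schema table prefix, e.g. "PORT_TABLE"
    key: str         # full key that was touched
    operation: str   # operation reported by Redis, e.g. "hset" or "del"


def parse_keyspace_event(channel: str, message: str) -> Optional[KeyspaceEvent]:
    """Decode a keyspace notification; return None for unrelated channels."""
    match = _KEYSPACE_RE.match(channel)
    if match is None:
        return None
    db = int(match.group(1))
    key = match.group(2)
    # The table name is everything before the first ':' in the key.
    table = key.split(":", 1)[0]
    return KeyspaceEvent(db, table, key, message)


# Example with made-up endport names.
event = parse_keyspace_event(
    "__keyspace@0__:VIRTLINK_TABLE:Ethernet0:Ethernet4", "hset"
)
```

A daemon could then dispatch on `event.table` to decide whether a change concerns a port, a VLAN list, or a virtual link.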

## Database Schema
* `DB_CONNECTION_GRAPH_VERSIONS`
```
; ZSET that stores md5sum values of connection graph files that are used to provision `connection_db`
key = LAB_CONNECTION_GRAPH_VERSIONS
```
* `DB_META`
```
; Defines database metadata
key = DB_META
; field = value
DBState = "active"/"provisioning"/"down"
```
* `SWITCH_TABLE`
```
; Defines switch metadata
key = SWITCH_TABLE:switch_name
; field = value
HwSku = ; switch platform hwsku
ManagementIp =
Type = "leaf_fanout"/"root_fanout"/"dev_sonic"
ProvisionStatus = "not_provisioned"/"in_progress"/"provisioned" ; provision status for "dev_sonic"
```
* `DUT_LIST`
```
; List contains all the SONiC DUTs defined in the lab
key = DUT_LIST ; contains DUT names that are FK to `SWITCH_TABLE`
```
* `SERVER_TABLE`
```
; Defines server metadata
key = SERVER_TABLE:server_name
; field = value
HwSku = "TestServ"
ManagementIp =
ServerStatus = "active"/"down"
```
* `PORT_LIST`
```
; List contains physical ports of either a root/leaf fanout switch, DUT or server
key = PORT_LIST:<switch_name|dut_name|server_name>
```
* `PORT_TABLE`
```
; Defines port metadata
key = PORT_TABLE:<switch_name|dut_name|server_name>:port_name
; field = value
BandWidth =
VlanType = "access"/"trunk"
PhyPeerPort = ; physical peer port
```
* `VLAN_LIST`
```
; List contains VLAN ids assigned to a physical port
key = VLAN_LIST:endport ; endport is FK to `PORT_TABLE`
```
* `USED_VLANIDPOOL_SET`
```
; Set contains VLAN ids that are already in use
key = VLANIDPOOL_SET
```
* `VIRTLINK_TABLE`
```
; a virtual link between DUTs
key = VIRTLINK_TABLE:endport0:endport1 ; endport0 and endport1 are FK to `PORT_TABLE`
; field = value
Status = "active"/"inactive"
```
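
The key layout above can be sketched as a handful of small helpers that build keys and validate the enumerated field values. This is only an illustration: the function names and the example device names are hypothetical, and the exact string form of an `endport` is not specified in this document, so the helpers take it as an opaque string.

```python
from typing import Dict

# Allowed values copied from the schema above.
SWITCH_TYPES = {"leaf_fanout", "root_fanout", "dev_sonic"}
VLAN_TYPES = {"access", "trunk"}
LINK_STATUSES = {"active", "inactive"}


def switch_key(switch_name: str) -> str:
    """Key of a SWITCH_TABLE hash."""
    return f"SWITCH_TABLE:{switch_name}"


def port_key(device_name: str, port_name: str) -> str:
    """Key of a PORT_TABLE hash; device is a switch, DUT or server name."""
    return f"PORT_TABLE:{device_name}:{port_name}"


def virtlink_key(endport0: str, endport1: str) -> str:
    """Key of a VIRTLINK_TABLE hash (endport format is an assumption)."""
    return f"VIRTLINK_TABLE:{endport0}:{endport1}"


def port_entry(bandwidth: int, vlan_type: str, phy_peer_port: str) -> Dict[str, str]:
    """Build the field/value mapping of a PORT_TABLE hash."""
    if vlan_type not in VLAN_TYPES:
        raise ValueError(f"unknown VlanType: {vlan_type!r}")
    return {
        "BandWidth": str(bandwidth),
        "VlanType": vlan_type,
        "PhyPeerPort": phy_peer_port,
    }


def virtlink_entry(status: str) -> Dict[str, str]:
    """Build the field/value mapping of a VIRTLINK_TABLE hash."""
    if status not in LINK_STATUSES:
        raise ValueError(f"unknown Status: {status!r}")
    return {"Status": status}
```

Centralizing key construction in one place keeps writers and subscribers agreeing on the same key format, which matters because subscribers identify changes by key name alone.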

## Implementation Plan
* The whole process is divided into three stages:
  1. stage#1: initial `connection_db` setup and provisioning
  2. stage#2: dynamic VLAN assignment support
  3. stage#3: link state propagation support

### Stage#1
* In stage#1, we only cover the initial `connection_db` setup and provisioning:
  1. Install Redis and its Python packages on the selected test server.
  2. Run `servercfgd` on the selected test server.
  3. Provision `connection_db` with the physical connections defined in the connection graph file.
* All the modules/plays that cover the functionality listed above will be included in the Ansible role `connection_db`.
* Users are provided with a play, `config_connection_db.yml`, that calls the `connection_db` role to set up, provision, or remove `connection_db`.

#### db setup
* Ensure Redis and py-redis packages are installed.
* Ensure Redis service is running.
* Start `connection_db`.

![start_db](img/testbed_v2_start_db.png)

#### db provision
* Provision `connection_db` with the physical connections and static VLAN assignments defined in the connection graph file located in [ansible/files](https://github.com/Azure/sonic-mgmt/tree/master/ansible/files).

![provision_db](img/testbed_v2_provision_db.png)
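
A minimal sketch of the provisioning step, under stated assumptions: the connection graph links are represented here as plain dictionaries with hypothetical keys (`start_device`, `start_port`, `end_device`, `end_port`, `bandwidth`) rather than in the real file format, `PhyPeerPort` is assumed to hold `<device>:<port>`, and `VlanType` is omitted for brevity. The code only computes what would be written; it does not talk to Redis.

```python
import hashlib
from typing import Dict, Iterable, Mapping


def graph_version(graph_bytes: bytes) -> str:
    """md5sum of a connection graph file, as stored in the versions ZSET."""
    return hashlib.md5(graph_bytes).hexdigest()


def plan_port_entries(links: Iterable[Mapping[str, object]]) -> Dict[str, Dict[str, str]]:
    """Map each physical link to two PORT_TABLE hashes, one per end.

    The dictionary keys of each link are hypothetical names chosen for
    this sketch, not the fields of the real connection graph file.
    """
    entries: Dict[str, Dict[str, str]] = {}
    for link in links:
        start = f"{link['start_device']}:{link['start_port']}"
        end = f"{link['end_device']}:{link['end_port']}"
        bandwidth = str(link["bandwidth"])
        # Each end records the other end as its physical peer.
        entries[f"PORT_TABLE:{start}"] = {"BandWidth": bandwidth, "PhyPeerPort": end}
        entries[f"PORT_TABLE:{end}"] = {"BandWidth": bandwidth, "PhyPeerPort": start}
    return entries


# Example with made-up device and port names.
example_links = [
    {
        "start_device": "dut-1",
        "start_port": "Ethernet0",
        "end_device": "fanout-1",
        "end_port": "Ethernet1",
        "bandwidth": 100000,
    }
]
planned = plan_port_entries(example_links)
```

Recording the md5sum of each graph file lets a provisioning run detect whether the file it is about to load has already been applied.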


#### changes to `conn_graph_facts`
An extra parameter, `conn_graph_facts_src`, will be added to `conn_graph_facts`. It takes one of two values: `from_db`, which retrieves the connection data from `connection_db`, or `from_file`, which parses the connection graph file as before. Note that if `conn_graph_facts` fails with `conn_graph_facts_src=from_db`, it falls back to `from_file`.
Binary file added docs/testbed/img/testbed_v2_provision_db.png
Binary file added docs/testbed/img/testbed_v2_start_db.png