Commit 6dc1d54

add docs about net-topology plugin
Signed-off-by: lowang-bh <[email protected]>
1 parent e1ee888 commit 6dc1d54

1 file changed

+61
-0
lines changed

docs/design/net-aware.md

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
# Net Topology Aware Plugin

## Background

A Kubernetes cluster usually has many nodes, and those nodes sit in different IDCs (Internet data centers), chassis, and even behind different switches.
Data transfer across IDCs, chassis, and switches has different performance. Some latency-sensitive workloads need to run in the same IDC, or even on the same topology device, such as the same chassis or switch.

## Motivation

We aim to make the scheduler network-topology aware in order to achieve the following:

- Make a best effort to schedule the tasks of the same job onto the same topology device.

## Goals

- Support a single-key topology configuration: try to schedule all tasks of a job onto nodes that have the same value for that key.
- Support multiple-key topology policies: a match on a key earlier in the configured list yields a higher score.

## Non-Goals

- Finding a globally optimal placement across the nodes with all values of that key.

## Design Action

### Pod scheduling process

1. When the first task of a job is allocated to a node, record that node's information in the plugin.
2. When scheduling the job's other tasks, a node that has the same key-value as the recorded (target) node gets a higher score. Otherwise, it gets a zero score.
3. If several configured keys match between a node and the target node, only the first matching key in the configured order determines the score, so earlier keys score higher.

```go
nodeOrderFn := func(task *api.TaskInfo, node *api.NodeInfo) (float64, error) {
	...
	// tNode is the target node: the node recorded when the job's first task was allocated.
	score := 0
	weight := np.weight
	tlabels := tNode.Node.Labels
	labels := node.Node.Labels
	length := len(np.topologyKeys)
	for i, key := range np.topologyKeys {
		// Require the key on both nodes, so two nodes that both lack it do not count as a match.
		tv, tok := tlabels[key]
		v, ok := labels[key]
		if tok && ok && tv == v {
			score += length - i // keys earlier in the list have higher priority
			break
		}
	}
	return float64(score * weight), nil
}
```
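As a worked illustration of the scoring loop, the following is a minimal, self-contained sketch that swaps the scheduler types for plain label maps. The function name `topologyScore` and the label values (`sw-1`, `idc-a`, and so on) are hypothetical and not part of the plugin. The sketch also assumes that a key counts as a match only when both nodes carry it.

```go
package main

import "fmt"

// topologyScore is a hypothetical stand-alone version of the scoring loop.
// target holds the labels of the node that received the job's first task,
// candidate holds the labels of the node being scored.
func topologyScore(keys []string, weight int, target, candidate map[string]string) float64 {
	for i, key := range keys {
		tv, tok := target[key]
		cv, cok := candidate[key]
		// Assumption: a key matches only if both nodes carry it with the same value.
		if tok && cok && tv == cv {
			// Earlier keys have higher priority: the first key is worth len(keys).
			return float64((len(keys) - i) * weight)
		}
	}
	return 0
}

func main() {
	keys := []string{"switch", "idc"} // same order as net-topology.keys
	weight := 10
	target := map[string]string{"switch": "sw-1", "idc": "idc-a"}

	sameSwitch := map[string]string{"switch": "sw-1", "idc": "idc-a"}
	sameIDC := map[string]string{"switch": "sw-2", "idc": "idc-a"}
	other := map[string]string{"switch": "sw-3", "idc": "idc-b"}

	fmt.Println(topologyScore(keys, weight, target, sameSwitch)) // 20
	fmt.Println(topologyScore(keys, weight, target, sameIDC))    // 10
	fmt.Println(topologyScore(keys, weight, target, other))      // 0
}
```

With two keys and a weight of 10, a node behind the same switch scores 20, a node that only shares the IDC scores 10, and an unrelated node scores 0.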

### Usage

```yaml
- plugins:
  - name: net-topology
    arguments:
      net-topology.keys: switch,idc
      net-topology.weight: 10
```
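How the comma-separated `net-topology.keys` value becomes an ordered key list can be sketched as follows. This is an assumption about the parsing, not the plugin's actual code: `parseArgs` and the plain `map[string]string` argument type are hypothetical stand-ins for the scheduler's argument handling, and the fallback weight of 1 is purely illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseArgs is a hypothetical helper that reads the two plugin arguments.
// The order of the returned keys is the priority order used for scoring.
func parseArgs(args map[string]string) (keys []string, weight int) {
	weight = 1 // assumed default when the weight is missing or not an integer
	for _, k := range strings.Split(args["net-topology.keys"], ",") {
		// Tolerate spaces around keys and skip empty entries.
		if k = strings.TrimSpace(k); k != "" {
			keys = append(keys, k)
		}
	}
	if w, err := strconv.Atoi(args["net-topology.weight"]); err == nil {
		weight = w
	}
	return keys, weight
}

func main() {
	keys, weight := parseArgs(map[string]string{
		"net-topology.keys":   "switch,idc",
		"net-topology.weight": "10",
	})
	fmt.Println(keys, weight) // [switch idc] 10
}
```

With the configuration above, `switch` is the highest-priority key and `idc` the second.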

## Drawbacks

This is not a global solution that guarantees all tasks of a job land on nodes in the same topology. For example, suppose the nodes with key-value1 do not have enough resources for the whole job, while the nodes with key-value2 do. If the first task is bound to a node with key-value1, all other tasks will still prefer the nodes with key-value1.
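The drawback can be reproduced with a toy simulation. Everything here is a simplifying assumption rather than scheduler behavior: the `node` type, the `place` function, and the use of free "slots" in place of real resources are hypothetical, and tasks are placed greedily one at a time with a single topology key.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a cluster node with one topology label.
type node struct {
	name, sw string
	free     int // free task slots, a stand-in for real resources
}

// place assigns tasks one at a time. After the first task, nodes on the same
// switch as the first task's node are preferred over all other nodes.
func place(nodes []node, tasks int) []string {
	var placed []string
	target := "" // switch of the node that received the first task
	for t := 0; t < tasks; t++ {
		best := -1
		for i := range nodes {
			if nodes[i].free == 0 {
				continue
			}
			// Take the first node with capacity, but prefer one on the target switch.
			if best == -1 || (nodes[i].sw == target && nodes[best].sw != target) {
				best = i
			}
		}
		if best == -1 {
			break // no capacity left anywhere
		}
		if target == "" {
			target = nodes[best].sw
		}
		nodes[best].free--
		placed = append(placed, nodes[best].sw)
	}
	return placed
}

func main() {
	nodes := []node{
		{"n1", "sw-1", 1}, {"n2", "sw-1", 1}, // sw-1 has 2 slots in total
		{"n3", "sw-2", 3}, // sw-2 could hold the whole job
	}
	fmt.Println(place(nodes, 3)) // [sw-1 sw-1 sw-2]
}
```

The three-task job ends up split across two switches, even though `sw-2` alone had room for all of it.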
