| 1 | +--- |
| 2 | +layout: home |
| 3 | +title: Deployment behind a load balancer |
| 4 | +nav_order: 12 |
| 5 | +nav_titles: true |
| 6 | +titles_max_depth: 2 |
| 7 | +--- |
| 8 | + |
| 9 | +## Overview |
| 10 | + |
| 11 | +Supporting MPTCP on the server side is easy when services are directly exposed |
| 12 | +to the Internet: it's generally just a matter of enabling MPTCP support in the |
| 13 | +[applications](apps.html), or |
| 14 | +[forcing them to use it](setup.html#force-applications-to-use-mptcp). |
| 15 | + |
| 16 | +When services are exposed behind an L4 load balancer, it is important to make |
| 17 | +sure the additional subflows reach the same end-server, and not another one |
| 18 | +sharing the same public IP (e.g. with ECMP or Anycast servers). |
| 19 | + |
| 20 | +### First path: no change |
| 21 | + |
| 22 | +Creating the first subflow (or *path*) is easy: on the wire, this MPTCP subflow |
| 23 | +is seen as a TCP connection with extra TCP options. It means nothing needs to be |
| 24 | +modified. |
| 25 | + |
| 26 | +```mermaid |
| 27 | +flowchart LR |
| 28 | + C("fa:fa-mobile<br />Client") == Initial subflow ==> LB{"fa:fa-cloud<br />Load Balancer"} |
| 29 | + LB -.-> S1["fa:fa-server<br />Server 1"] |
| 30 | + LB == Initial subflow ==> S2["fa:fa-server<br />Server 2"] |
| 31 | + LB -.-> S3["fa:fa-server<br />Server 3"] |
| 32 | + |
| 33 | + linkStyle 0 stroke:green,fill:none |
| 34 | + linkStyle 2 stroke:green,fill:none |
| 35 | +``` |
| 36 | + |
| 37 | +### Extra paths: static redirection |
| 38 | + |
| 39 | +The extra subflows need to reach the same end-server. Such subflows will have |
| 40 | +different source IP addresses and/or ports. A stateless L4 load-balancer needs |
| 41 | +extra information to pick the same end-server as the one which accepted the |
| 42 | +initial subflow. |
| 43 | + |
| 44 | +```mermaid |
| 45 | +flowchart LR |
| 46 | + C("fa:fa-mobile<br />Client") -- Initial subflow --> LB{"fa:fa-cloud<br />Load Balancer"} |
| 47 | + C == Second subflow ==> LB |
| 48 | + LB -.-> S1["fa:fa-server<br />Server 1"] |
| 49 | + LB -- Initial subflow --> S2["fa:fa-server<br />Server 2"] |
| 50 | + LB == Second subflow ==> S2 |
| 51 | + LB -.-> S3["fa:fa-server<br />Server 3"] |
| 52 | + |
| 53 | + linkStyle 0 stroke:green,fill:none |
| 54 | + linkStyle 1 stroke:orange,fill:none |
| 55 | + linkStyle 3 stroke:green,fill:none |
| 56 | + linkStyle 4 stroke:orange,fill:none |
| 57 | +``` |
| 58 | + |
| 59 | +If the extra subflows try to connect to the same destination IP address and |
| 60 | +port, a stateless L4 load-balancer will not be able to pick the right server. |
| 61 | + |
| 62 | +### Solution |
| 63 | + |
| 64 | +The [MPTCP protocol](https://www.rfc-editor.org/rfc/rfc8684.html) suggests |
| 65 | +handling this case as follows: |
| 66 | +- A server behind an L4 load-balancer should mention in its replies to MPTCP |
| 67 | + connection requests (`MP_CAPABLE`) that *it will not accept additional MPTCP |
| 68 | + subflows to the same IP address and port* (via the `C-flag`). |
| 69 | +- Additionally, such a server should announce an extra address (`ADD_ADDR`) with a |
| 70 | + v4/v6 IP address and/or port that are specific to this server. |
| 71 | +- The L4 load-balancer should route traffic to this specific IP and/or port to the |
| 72 | + right server. |
| 73 | + |
| 74 | +In other words, on Linux, it means that each server should: |
| 75 | +- set the [`net.mptcp.allow_join_initial_addr_port`](https://docs.kernel.org/networking/mptcp-sysctl.html) |
| 76 | + sysctl knob to `0` |
| 77 | +- add a `signal` MPTCP endpoint with a dedicated IP address and/or port: |
| 78 | + ``` |
| 79 | + ip mptcp endpoint add <public IP address> dev <interface> [ port NR ] signal |
| 80 | + ``` |
| 81 | + |
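The load-balancer side of this scheme can be sketched as follows. This is a minimal illustration, not a real load-balancer: the addresses, ports, server names, and the `pick_server` helper are all hypothetical. It only shows that a stateless lookup on a server-specific destination always returns the same end-server, whereas a hash over the 4-tuple of the shared address depends on the client's source address and port.

```python
import hashlib

# Hypothetical addresses and names, for illustration only.
SHARED_VIP = ("192.0.2.1", 443)
DEDICATED = {
    ("192.0.2.1", 4431): "server1",
    ("192.0.2.1", 4432): "server2",
    ("192.0.2.1", 4433): "server3",
}
SERVERS = sorted(set(DEDICATED.values()))


def pick_server(src_ip, src_port, dst_ip, dst_port):
    """Stateless choice of an end-server for one TCP flow."""
    # Server-specific destination (announced via ADD_ADDR): static redirection.
    server = DEDICATED.get((dst_ip, dst_port))
    if server is not None:
        return server
    # Shared address: hash the 4-tuple, so the result depends on the source.
    flow = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    index = int.from_bytes(hashlib.sha256(flow).digest()[:4], "big") % len(SERVERS)
    return SERVERS[index]
```

With such a table, two subflows coming from different client addresses but targeting `192.0.2.1:4432` both end up on `server2`, without the load-balancer keeping any per-connection state.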
| 82 | +{: .note} |
| 83 | +A stateful load-balancer could compute the MPTCP receiver's token from its key |
| 84 | +exchanged in the connection request (`MP_CAPABLE`), and route additional |
| 85 | +subflows to the same server by identifying the receiver's token from the join |
| 86 | +request (`MP_JOIN`). Note that token collisions are possible: such a |
| 87 | +load-balancer should handle the case where multiple end-servers are using |
| 88 | +the same token for active MPTCP connections. |
| 89 | + |
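As a rough sketch of the stateful approach described in the note, the token derivation and a collision-aware lookup could look like the code below. It assumes MPTCP v1 (RFC 8684), where the token is the most significant 32 bits of the SHA-256 hash of the 64-bit key. The `TokenTable` class and its method names are hypothetical and only illustrate the idea.

```python
import hashlib
from collections import defaultdict


def mptcp_token(key: int) -> int:
    """Token for a 64-bit MPTCP v1 key: top 32 bits of SHA-256(key)."""
    digest = hashlib.sha256(key.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:4], "big")


class TokenTable:
    """Hypothetical token -> end-servers map kept by a stateful load-balancer."""

    def __init__(self):
        self._servers = defaultdict(set)

    def learn(self, server_key: int, server: str) -> None:
        # Called when the server's key is seen in the MP_CAPABLE exchange.
        self._servers[mptcp_token(server_key)].add(server)

    def lookup(self, token: int):
        """Return the single owner, or None if unknown or ambiguous."""
        owners = self._servers.get(token, set())
        return next(iter(owners)) if len(owners) == 1 else None
```

Returning `None` on a collision is only one possible policy: it leaves the decision (reject the join, or try another way to disambiguate) to the caller.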
| 90 | +## CDNs |
| 91 | + |
| 92 | +Supporting MPTCP on CDNs lets users easily benefit from its features: |
| 93 | +seamless handovers, best network selection, and network aggregation. |
| 94 | + |
| 95 | +Here is a checklist for CDN owners implementing MPTCP support: |
| 96 | +- [ ] Frontend: |
| 97 | + - [ ] Application: [enable MPTCP support](apps.html), |
| 98 | + [modify it to create an MPTCP socket](implementation.html), or |
| 99 | + [force it to use MPTCP](setup.html#force-applications-to-use-mptcp). |
| 100 | + - [ ] System: set [`sysctl net.mptcp.allow_join_initial_addr_port=0`](https://docs.kernel.org/networking/mptcp-sysctl.html) |
| 101 | + - [ ] System: add a `signal` MPTCP endpoint with a dedicated v4/v6 IP address |
| 102 | + and/or port per end-server: |
| 103 | + ``` |
| 104 | + ip mptcp endpoint add <public IP address> dev <interface> [ port NR ] signal |
| 105 | + ``` |
| 106 | +- [ ] Stateless L4 Load-Balancer: |
| 107 | + - [ ] Add rules to route TCP flows destined to a server-specific IP and/or |
| 108 | + port to the corresponding server. |
| 109 | + - [ ] Optionally block all non-MPTCP connections, and rate-limit connection |
| 110 | + requests. |