[syncd] Add fdb notification event throttle#1041
[syncd] Add fdb notification event throttle#1041kcudnik wants to merge 1 commit intosonic-net:masterfrom
Conversation
|
can you add description? |
|
instead of "throttling will limit number of messages to 10 events per 1 second per MAC", can we always keep the latest notification for the mac/vlan combination? |
|
that was my initial approach, but it have some issues, fist, it would require to unnecessary serialize/deserialize fdb messages into event queues, and most important, if there is a storm of messages, what and when is exactly last message? it's hard to determine, one of approaches i was thinking, that we could delay processing messages, then check the queue for potential duplicates, and remove them, but this would be not efficient for systems with a lot of mac addresses and it will require to scan every item in the queue for each event, so complexity here would be quadratic, and if we would make a queue per mac, then we would lost order of incomming events with current approach, order of messages is preserved, since messages are filtered before inserting to the queue, on per mac basis, we also don't need to serialize and deserialize messages, downside is that we can't determine when storm will be over do you have any other suggestions or approach how we could implement here? |
|
Ideally, we should only keep the last learnt (or moved to) location of the learnt MAC. I believe some SAI vendor implements the move via aged/learnt pair of events while some other vendor that implemented the MOVE event notification will just have a single Move event whenever the MAC moved. I do not see how one can achieve throttle the events and still maintain sync with the SAI layer without having to store the last set of event(s) for a particular MAC. There are several issues if we simply drop the events when a threshold is reached. it makes the view of the MAC table as perceived by SONiC out of sync with SAI layer. worst yet if the last event that was dropped was an aged event then the MAC will stay on SONiC side forever until this same MAC is relearnt again by SAI layer (which may never happen). Best would be for this filtering be done at the lowest layer (SAI layer). But if this is not achievable, we need to store and filter at some time intervals and then report the MAC events with the latest notified info from SAI... |
yes, i'm aware of this problem, the same problem exists on current implementation, when events are dropped, there is a possibility of getting out of syncd on that fdb move, between SAI and sairedis because of dropped events, i don't see how this could be easy fixed, but at list this PR will not forward 100k or more events to sairedis
yes, current code has this issue
the solution i was thinking was described in previous post, to store events for example for 1 sec, then try to filter duplicates, and then process, but as mentioned it introduces unnecessary serialization/desrialization, n2 complexity and delay on fdb events |
|
replaced by #1060 |
When L2 loop happen, a mac can jump between ports very rapid, and it will cause flood of fdb notification events like (aged, learned, aged learned) between 2 ports. Vendor sai will produce stream of messages that will use 100% cpu, and syncd will be trying to push those notifications to OA. To prevent this, we are adding event throttle on fdb events per MAC basis, and by default that throttle will limit number of messages to 10 events per 1 second per MAC.