
[nvidia][hsflowd] Fix Dropmon co-operation issues related to HW stop #73

Closed
vivekrnv wants to merge 5 commits into master from hsflowd_hw_stop


@vivekrnv vivekrnv commented Aug 19, 2023

Why I did it

Stopping the sFlow service causes other dropmon clients to stop receiving drops from then on.

Repro Steps:

config feature state sflow disabled 
config sflow enable
config sflow collector add temp1 192.168.1.10 
	
<Start other dropmon clients>
<Clients receive drops>

config sflow disable <hsflowd process exits>
<Other clients stop receiving drops>

Work item tracking
  • Microsoft ADO (number only):

How I did it

  • On process exit, hsflowd no longer stops HW drops in drop monitor. On the NVIDIA platform, HW drops are controlled by another daemon.
  • For SW drops, only start SW drops in NET_DM when sw=on is set in hsflowd.conf.
  • Don't record feedcontrolerrors for CONFIG failures: if feedcontrolerrors > 0, the application won't stop SW drops even when it exits. CONFIG is likely to fail with -EBUSY if NET_DM has already been configured by another daemon.
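
The three points above can be modelled roughly as follows. This is only an illustrative Python sketch of the intended behaviour, not the hsflowd C code: `DropmonState`, `on_control_reply` and `feeds_to_stop_on_exit` are hypothetical names, and the assumption that every non-CONFIG failure increments the counter is mine.

```python
import errno
from dataclasses import dataclass


@dataclass
class DropmonState:
    """Hypothetical stand-in for the dropmon module state in hsflowd."""
    sw_enabled: bool              # mirrors sw=on / sw=off in hsflowd.conf
    feed_control_errors: int = 0  # mirrors the feedcontrolerrors counter


def on_control_reply(state, command, err):
    """Record a failed NET_DM control request, except for CONFIG."""
    if err == 0:
        return
    if command == "CONFIG":
        # CONFIG can fail with -EBUSY when another daemon has already
        # configured NET_DM, so it must not block the SW stop on exit.
        return
    # Assumption: any other failed control request is counted.
    state.feed_control_errors += 1


def feeds_to_stop_on_exit(state):
    """Return the feeds this process should turn off when it exits."""
    feeds = []
    # HW drops are never stopped here: another daemon owns them.
    if state.sw_enabled and state.feed_control_errors == 0:
        feeds.append("sw")
    return feeds
```

With sw=on and only a CONFIG failure of `-errno.EBUSY` recorded, the sketch still turns the SW feed off on exit. With sw=off it stops nothing, so other dropmon clients keep receiving drops.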

How to verify it

  1. Run the repro steps and confirm the other client still receives drops (default sw=off case).
Aug 19 00:32:44.212065 r-leopard-41 NOTICE sflow#sflowmgrd: :- sflowHandleService: Starting hsflowd service
Aug 19 00:32:44.212237 r-leopard-41 INFO sflow#hsflowd: started
Aug 19 00:32:44.212282 r-leopard-41 INFO sflow#hsflowd: autoload SONIC and PSAMPLE modules
Aug 19 00:32:44.212282 r-leopard-41 INFO sflow#hsflowd: drop-monitor support for SONiC
Aug 19 00:33:40.446770 r-leopard-41 INFO sflow#hsflowd: dropmon state INIT -> GET_FAMILY
Aug 19 00:33:40.446770 r-leopard-41 INFO sflow#hsflowd: dropmon state GET_FAMILY -> GOT_GROUP
Aug 19 00:33:41.037737 r-leopard-41 INFO sflow#hsflowd: dropmon state GOT_GROUP -> JOIN_GROUP
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state JOIN_GROUP -> CONFIGURE
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: Configuring DropMon Failed, Module is already in Monitoring State, Continue...
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: message repeated 2 times: [ Configuring DropMon Failed, Module is already in Monitoring State, Continue...]
Aug 19 00:33:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state CONFIGURE -> START
Aug 19 00:34:42.042191 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN
Aug 19 00:35:07.746616 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM
Aug 19 00:35:07.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP
Aug 19 00:35:07.795261 r-leopard-41 INFO sflow#hsflowd: stopped
  2. Test the sw=on case.
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: started
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: autoload SONIC and PSAMPLE modules
Aug 19 00:55:29.802990 r-leopard-41 INFO sflow#hsflowd: drop-monitor support for SONiC
Aug 19 00:55:38.213225 r-leopard-41 INFO sflow#hsflowd: dropmon state INIT -> GET_FAMILY
Aug 19 00:55:38.213225 r-leopard-41 INFO sflow#hsflowd: dropmon state GET_FAMILY -> GOT_GROUP
Aug 19 00:55:38.804221 r-leopard-41 INFO sflow#hsflowd: dropmon state GOT_GROUP -> JOIN_GROUP
Aug 19 00:55:39.808749 r-leopard-41 INFO sflow#hsflowd: dropmon state JOIN_GROUP -> CONFIGURE
Aug 19 00:55:39.808749 r-leopard-41 INFO sflow#hsflowd: Configuring DropMon Failed, Module is already in Monitoring State, Continue...
Aug 19 00:55:40.813554 r-leopard-41 INFO sflow#hsflowd: message repeated 2 times: [ Configuring DropMon Failed, Module is already in Monitoring State, Continue...]
Aug 19 00:55:40.813554 r-leopard-41 INFO sflow#hsflowd: dropmon state CONFIGURE -> START
Aug 19 00:55:42.430937 r-leopard-41 INFO sflow#hsflowd: dropmon state START -> RUN
Aug 19 00:56:36.441267 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM
Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon: graceful shutdown: turning off feed
Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP
Aug 19 00:56:36.507932 r-leopard-41 INFO sflow#hsflowd: stopped
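
A small helper can make the comparison between the two captures less error-prone. The Python sketch below is a hypothetical checker, not part of this change: it extracts the dropmon state transitions and reports whether the "turning off feed" message appears, which in the logs above happens only in the sw=on run. The function names are made up for illustration and the patterns assume the message text shown above.

```python
import re

# Matches lines such as "... hsflowd: dropmon state RUN -> STOP"
STATE_RE = re.compile(r"dropmon state (\w+) -> (\w+)")


def dropmon_transitions(lines):
    """Return the (old, new) dropmon state pairs in log order."""
    transitions = []
    for line in lines:
        match = STATE_RE.search(line)
        if match:
            transitions.append(match.groups())
    return transitions


def feed_turned_off(lines):
    """True if hsflowd logged that it turned the feed off on shutdown."""
    return any("turning off feed" in line for line in lines)


# Shutdown portion of the sw=on capture above.
sw_on_tail = [
    "Aug 19 00:56:36.441267 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM",
    "Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon: graceful shutdown: turning off feed",
    "Aug 19 00:56:36.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP",
]

# Shutdown portion of the sw=off capture above.
sw_off_tail = [
    "Aug 19 00:35:07.746616 r-leopard-41 INFO sflow#hsflowd: Received SIGTERM",
    "Aug 19 00:35:07.445474 r-leopard-41 INFO sflow#hsflowd: dropmon state RUN -> STOP",
]
```

Here `feed_turned_off(sw_on_tail)` is true and `feed_turned_off(sw_off_tail)` is false, while both tails contain the same `RUN -> STOP` transition.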

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111
  • 202205
  • 202211
  • 202305

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

@vivekrnv vivekrnv marked this pull request as draft August 21, 2023 20:11
@vivekrnv vivekrnv marked this pull request as ready for review August 22, 2023 01:59
@vivekrnv vivekrnv marked this pull request as draft August 31, 2023 18:07
@vivekrnv vivekrnv closed this Sep 13, 2023
vivekrnv pushed a commit that referenced this pull request Oct 13, 2023
…e latest HEAD automatically (sonic-net#15016)

src/wpasupplicant/sonic-wpa-supplicant

* a24412c25 - (HEAD -> 202205, origin/master, origin/HEAD, origin/202211, origin/202205, master) [mka]: Fix unexpected cleanup (#73) (8 days ago) [Ze Gan]
* 26d1da0bc - [mka]: Fix re-establishment by reset MI (#72) (8 days ago) [Ze Gan]
* f07e0a097 - [azp]: Update build pipeline to build for Bullseye (#70) (4 weeks ago) [Ze Gan]
*   2c69e2cda - Use github code scanning instead of LGTM (#69) (6 months ago) [Liu Shilong]
|\  
| * 23abb04e5 - fix (6 months ago) [shilongliu]
| * f34d68fe6 - libdbus-1-dev (6 months ago) [shilongliu]
| * dc2dd881e - add dbus (6 months ago) [shilongliu]
| * 5de037661 - use swsscommon packages (6 months ago) [shilongliu]
| * 32c5a2729 - Use github code scanning instead of LGTM (6 months ago) [shilongliu]
|/  
* aa731b96f - [azp]: Install libyang in azure pipeline (#68) (8 months ago) [Hua Liu]
* 71b635d74 - Revert "[Azp]: Upgrade Azp to bullseye (#49)" (#66) (9 months ago) [Ze Gan]
* 7aa4e6fa4 - Adding Microsoft SECURITY.MD (#58) (9 months ago) [microsoft-github-policy-service[bot]]
vivekrnv pushed a commit that referenced this pull request Jul 11, 2025
…ically (sonic-net#23211)

#### Why I did it
src/sonic-dash-ha
```
* ff1b02a - (HEAD -> master, origin/master, origin/HEAD) Implement HA-Scope actor (#73) (20 hours ago) [yue-fred-gao]
* 832e67a - Implement HA-SET actor (#70) (4 days ago) [yue-fred-gao]
* cadfa46 - Implement vdpu actor (#67) (5 days ago) [yue-fred-gao]
```
#### How I did it
#### How to verify it
#### Description for the changelog
vivekrnv pushed a commit that referenced this pull request Nov 20, 2025
…ly (sonic-net#24533)

#### Why I did it
src/sonic-stp
```
* d30e086 - (HEAD -> master, origin/master, origin/HEAD) Fix issues with PDU structure alignment, IPC msg processing (#73) (4 days ago) [Yogapriya-cisco]
```
#### How I did it
#### How to verify it
#### Description for the changelog