Use monit validate to get live data #19759

Merged
StormLiangMS merged 2 commits into sonic-net:master from dhanasekar-arista:monit_validate
Jul 25, 2025

@dhanasekar-arista
Contributor

Description of PR

Summary:
'monit status' reports data that can be up to 60 seconds old, which is not ideal in certain scenarios.
Use 'monit validate' instead, since it provides live data.

Fixes # (issue)
https://github.com/aristanetworks/sonic-qual.msft/issues/718

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • New Test case
    • Skipped for non-supported platforms
  • Test case improvement

Back port request

  • 202205
  • 202305
  • 202311
  • 202405
  • 202411
  • 202505

Approach

What is the motivation for this PR?

disk/test_disk_exhaustion.py creates a 1.7G file during the test and deletes it at the end of the test.
But Monit is configured in /etc/monit/monitrc to run its checks only once every 60 secs, so "monit status" reports the result of the last check.
This stale data can still reflect the 1.7G file after it has been deleted, so the memory high threshold appears to be breached.
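
The staleness window described above can be illustrated with a toy model. This is only a sketch, not Monit's implementation: the `PolledMonitor` class, its method names, and the 300 MB baseline are hypothetical, chosen to mirror the difference between a cached report and an on-demand check.

```python
class PolledMonitor:
    """Toy model of a monitor that samples a value once per poll interval."""

    def __init__(self, read_value, interval=60):
        self.read_value = read_value  # callable returning the live value
        self.interval = interval      # seconds between polling cycles
        self.last_poll = None         # time of the most recent cycle
        self.cached = None            # value captured during that cycle

    def tick(self, now):
        """Run a polling cycle only if the interval has elapsed."""
        if self.last_poll is None or now - self.last_poll >= self.interval:
            self.cached = self.read_value()
            self.last_poll = now

    def status(self):
        """Cached report: returns the value from the last cycle."""
        return self.cached

    def validate(self):
        """On-demand check: samples the value right now."""
        return self.read_value()


usage = {"used_mb": 300}            # hypothetical baseline usage
monitor = PolledMonitor(lambda: usage["used_mb"], interval=60)

usage["used_mb"] += 1700            # the test creates the 1.7G file
monitor.tick(now=0)                 # a polling cycle runs while the file exists
usage["used_mb"] -= 1700            # the test deletes the file
monitor.tick(now=30)                # 30 s later: no new cycle yet

stale = monitor.status()            # still includes the deleted file
live = monitor.validate()           # reflects the current usage
```

In this model the cached report keeps showing the inflated value until the next cycle, while the on-demand check already shows the baseline.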

How did you do it?

Use "monit validate" instead of "monit status" so that the check runs on live data.
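
A minimal sketch of what the switch could look like in a Python helper. The actual diff is not shown in this conversation, so `run_command` and `monit_report` are hypothetical names, and running the command through `sudo` is an assumption; only the `monit status` and `monit validate` actions come from the description above.

```python
import subprocess
from typing import Callable, List


def run_command(cmd: List[str]) -> str:
    """Run a command and return its stdout and stderr as one string."""
    completed = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return completed.stdout + completed.stderr


def monit_report(live: bool = True,
                 runner: Callable[[List[str]], str] = run_command) -> str:
    """Return Monit's report, using 'validate' when live data is wanted."""
    # 'validate' asks Monit to check the services now;
    # 'status' returns what the last polling cycle recorded.
    subcommand = "validate" if live else "status"
    return runner(["sudo", "monit", subcommand])
```

Passing the runner as a parameter keeps the helper testable without a device; a test can substitute a fake runner and inspect the command that would have been executed.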

How did you verify/test it?

Verified by running the test.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

'monit status' fetches data that can be up to 60 seconds old, which is not ideal.
Use 'monit validate' instead, since it provides live data.

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dhanasekar-arista
Contributor Author

@StormLiangMS @lipxu Kindly review and merge.

@mssonicbld
Collaborator

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@dhanasekar-arista
Contributor Author

@StormLiangMS can you please review this.

Collaborator

@StormLiangMS StormLiangMS left a comment

LGTM

@StormLiangMS StormLiangMS merged commit 7bd3106 into sonic-net:master Jul 25, 2025
18 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jul 28, 2025
@mssonicbld
Collaborator

Cherry-pick PR to 202505: #19853

@dhanasekar-arista dhanasekar-arista deleted the monit_validate branch July 30, 2025 05:18
mssonicbld pushed a commit that referenced this pull request Jul 30, 2025
nissampa pushed a commit to nissampa/sonic-mgmt_dpu_test that referenced this pull request Aug 7, 2025
ashutosh-agrawal pushed a commit to ashutosh-agrawal/sonic-mgmt that referenced this pull request Aug 14, 2025
vidyac86 pushed a commit to vidyac86/sonic-mgmt that referenced this pull request Oct 23, 2025
opcoder0 pushed a commit to opcoder0/sonic-mgmt that referenced this pull request Dec 8, 2025
Signed-off-by: opcoder0 <110003254+opcoder0@users.noreply.github.com>
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 16, 2025
Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
AharonMalkin pushed a commit to AharonMalkin/sonic-mgmt that referenced this pull request Dec 16, 2025
Signed-off-by: Aharon Malkin <amalkin@nvidia.com>
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Dec 21, 2025
Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Jan 13, 2026
gshemesh2 pushed a commit to gshemesh2/sonic-mgmt that referenced this pull request Jan 26, 2026
Signed-off-by: Guy Shemesh <gshemesh@nvidia.com>
ytzur1 pushed a commit to ytzur1/sonic-mgmt that referenced this pull request Feb 2, 2026
Signed-off-by: Yael Tzur <ytzur@nvidia.com>
venu-nexthop pushed a commit to venu-nexthop/sonic-mgmt that referenced this pull request Mar 27, 2026