Skip to content

[Mellanox] Add support for SN4600 system#2

Closed
DavidZagury wants to merge 1 commit intomasterfrom
sku-msn4600
Closed

[Mellanox] Add support for SN4600 system#2
DavidZagury wants to merge 1 commit intomasterfrom
sku-msn4600

Conversation

@DavidZagury
Copy link
Owner

@DavidZagury DavidZagury commented Feb 14, 2021

Why I did it

Support new Mellanox system SN4600

How I did it

Add relevant files to support new platform names platform x86_64-mlnx_msn4600 and default SKU ACS-MSN4600

How to verify it

Load the SN4600 switch, verify all ports are up with 200G by default.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

@liat-grozovik
Copy link

Suggest to change the PR title to something shorted such as "[Mellanox] Add support for SN4600 system"

@liat-grozovik
Copy link

liat-grozovik commented Feb 15, 2021

Suggest to rephrase the comment to something like
"

  • Why I did it
    Support new Mellanox system SN4600

  • How I did it
    Add relevant files to support new platform names platform x86_64-mlnx_msn4600 and default SKU ACS-MSN4600

  • How to verify it
    Load the SN4600 switch, verify all ports are up with 200G by default.

"

Note: this is needed for 202012 as well. so please mark it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why adding it in the middle and not in as the last one?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why all has -r0 and only SN4600 and SN4600C does not?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix for 4600, and talked with @shlomibitton and he will fix 4600C

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please send the file to Itai to confirm it. Not sure if he has one to use on SAI github but at least he need to confirm it

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked with Itay which told us that the file is already in github which I found here:
https://github.com/Mellanox/SAI-Implementation/blob/80937117fcd54e8df0581368f652f3664af6aecb/mlnx_sai/src/sai_4600.xml

But he did say it was a draft file.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why adding in the middle and not on the last?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list was already ordered by the switch number and I wanted to keep this order

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for these numbers please review with Vadim P
we can only approve them after platform test is fully running on this system.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked with Vadim, he didn't have time for doing a full review but told me to check that it the same as the Leopard, I checked and they are only different on the fans and unk, for those change he said that fans are correct, and unk should be the same as Tigon which it is.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is do you need a leading number in the comment?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The platform_dict_thermal is supposed to have the index of the switch thermal_profile_list, I added those to help me find the index of my switch in the array.

I kept it because it having this information in front of our eyes helped me realize there is a mistake in other switches (x86_64-mlnx_msn3420-r0 having 9 as its index when in fact it is in the 11th place of the array, 4410 which is in the 11 place of the list is missing from this array at all).

I can remove it if it is not needed

@DavidZagury DavidZagury changed the title [Mellanox] Add a new Mellanox platform x86_64-mlnx_msn4600 and new SKU ACS-MSN4600 [Mellanox] Add support for SN4600 system Feb 15, 2021
Copy link

@shlomibitton shlomibitton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As commented

@DavidZagury DavidZagury deleted the sku-msn4600 branch March 25, 2021 15:17
DavidZagury pushed a commit that referenced this pull request Dec 8, 2021
Upgrade DellEMC platforms to bullseye.
DavidZagury pushed a commit that referenced this pull request Nov 17, 2022
#### Why I did it
Update sonic-host-services submodule to include below commits:
```
bc8698d Merge pull request #21 from abdosi/feature
557a110 Fix the issue where if dest port is not specified in ACL rule than for multi-asic where we create NAT rule to forward traffic from Namespace to host fail with exception.
6e45acc (master) Merge pull request #14 from abdosi/feature
4d6cad7 Merge remote-tracking branch 'upstream/master' into feature
bceb13e Install libyang to azure pipeline (#20)
82299f5 Merge pull request #13 from SuvarnaMeenakshi/cacl_fabricns
15d3bf4 Merge branch 'master' into cacl_fabricns
de54082 Merge pull request #16 from ZhaohuiS/feature/caclmgrd_external_client_warning_log
b4b368d Add warning log if destination port is not defined
d4bb96d Merge branch 'master' into cacl_fabricns
35c76cb Add unit-test and fix typo.
17d44c2 Made Changes to be Python 3.7 compatible
978afb5 Aligning Code
1fbf8fb Merge remote-tracking branch 'upstream/master' into feature
7b8c7d1 Added UT for the changes
91c4c42 Merge pull request #9 from ZhaohuiS/feature/caclmgrd_external_client
7c0b56a Add 4 test cases for external_client_acl, including single port and port range for ipv4 and ipv6
b71e507 Merge remote-tracking branch 'origin/master' into HEAD
d992dc0 Merge branch 'master' into feature/caclmgrd_external_client
bd7b172 DST_PORT is configuralbe in json config file for EXTERNAL_CLIENT_ACL
f9af7ae [CLI] Move hostname, mgmt interface/vrf config to hostcfgd (#2)
70ce6a3 Merge pull request #10 from sujinmkang/cold_reset
29be8d2 Added Support to render Feature Table using Device running metadata. Also added support to render 'has_asic_scope' field of Feature Table.
3437e35 [caclmgrd][chassis]: Add ip tables rules to accept internal docker traffic from fabric asic namespaces.
8720561 Fix and add hardware reboot cause determination tests
0dcc7fe remove the empty bracket if no hardware reboot cause minor
e47d831 fix the wrong expected result comparision
ef86b53 Fix startswith Attribute error
8a630bb fix mock patch
8543ddf update the reboot cause logic and update the unit test
53ad7cd fix the mock patch function
7c8003d fix the reboot-cause regix for test
1ba611f fix typo
25379d3 Add unit test case
a56133b Add hardware reboot cause as actual reboot cause for soft reboot failed
c7d3833 Support Restapi/gnmi control plane acls
f6ea036 caclmgrd: Don't block traffic to mgmt by default
a712fc4 Update test cases
adc058b caclmgrd: Don't block traffic to mgmt by default
06ff918 Merge pull request #7 from bluecmd/patch-1
e3e23bc ci: Rename sonic-buildimage repository
e83a858 Merge pull request #4 from kamelnetworks/acl-ip2me-test
f5a2e50 [caclmgrd]: Tests for IP2ME rules generation
```
DavidZagury pushed a commit that referenced this pull request Dec 10, 2024
To fix a statistical issue. The original fix was done in FRRouting/frr#17297. However to accommodate 8.5.4 the patch in the PR was added.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
DavidZagury pushed a commit that referenced this pull request Jan 21, 2025
…et#21095)

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
DavidZagury pushed a commit that referenced this pull request Feb 2, 2025
…et#21405)

<!--
 Please make sure you've read and understood our contributing guidelines:
 https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

 failure_prs.log skip_prs.log Make sure all your commits include a signature generated with `git commit -s` **

 If this is a bug fix, make sure your description includes "fixes #xxxx", or
 "closes #xxxx" or "resolves #xxxx"

 Please provide the following information:
-->

#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0 0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4 0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5 0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6 route_next (node=<optimized out>) at ../lib/table.c:436
#7 route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8 0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
 at ../zebra/interface.c:312
#9 0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
DavidZagury pushed a commit that referenced this pull request Apr 28, 2025
<!--
     Please make sure you've read and understood our contributing guidelines:
     https://github.com/Azure/SONiC/blob/gh-pages/CONTRIBUTING.md

     ** Make sure all your commits include a signature generated with `git commit -s` **

     If this is a bug fix, make sure your description includes "fixes #xxxx", or
     "closes #xxxx" or "resolves #xxxx"

     Please provide the following information:
-->

#### Why I did it

Adding the below fix from FRR FRRouting/frr#17297

This is to fix the following crash which is a statistical issue

```
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
```

##### Work item tracking
- Microsoft ADO **(number only)**:

#### How I did it
Added patch.

#### How to verify it
Running BGP tests.

<!--
If PR needs to be backported, then the PR must be tested against the base branch and the earliest backport release branch and provide tested image version on these two branches. For example, if the PR is requested for master, 202211 and 202012, then the requester needs to provide test results on master and 202012.
-->

#### Which release branch to backport (provide reason below if selected)

<!--
- Note we only backport fixes to a release branch, *not* features!
- Please also provide a reason for the backporting below.
- e.g.
- [x] 202006
-->

- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [ ] 202111
- [ ] 202205
- [ ] 202211
- [ ] 202305

#### Tested branch (Please provide the tested image version)

<!--
- Please provide tested image version
- e.g.
- [x] 20201231.100
-->

- [ ] <!-- image version 1 -->
- [ ] <!-- image version 2 -->

#### Description for the changelog
<!--
Write a short (one line) summary that describes the changes in this
pull request for inclusion in the changelog:
-->

<!--
 Ensure to add label/tag for the feature raised. example - PR#2174 under sonic-utilities repo. where, Generic Config and Update feature has been labelled as GCU.
-->

#### Link to config_db schema for YANG module changes
<!--
Provide a link to config_db schema for the table for which YANG model
is defined
Link should point to correct section on https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-yang-models/doc/Configuration.md
-->

#### A picture of a cute animal (not mandatory but encouraged)
DavidZagury pushed a commit that referenced this pull request May 4, 2025
…sue. (sonic-net#22342)

Fix TACACS config revert to old config when device reboot issue.

#### Why I did it
Fix following bug:

1. When SONiC OS upgrade, old TACACS config will save to /etc/sonic/old_config/tacacs.json
2. After device reboot, TACACS config service (https://github.com/sonic-net/sonic-buildimage/blob/master/files/build_templates/tacacs-config.service) will restore TACACS config from /etc/sonic/old_config/tacacs.json, but this file will keep no change after restore TACACS config.
3. If TACACS service changed by user, because of #2, if device reboot again, the TACACS config been reverted back to old config in /etc/sonic/old_config/tacacs.json

Note: the TACACS config does not revert immediately after reboot, it will delay 5min 30sec:
https://github.com/sonic-net/sonic-buildimage/blob/master/files/build_templates/tacacs-config.timer

##### Work item tracking
- Microsoft ADO **(number only)**:32338799

#### How I did it
Move /etc/sonic/old_config/tacacs.json to /etc/sonic/old_config/tacacs.json_backup

#### How to verify it
Pass all test case.
Manually verify with following steps:

admin@vlab-01:~$ show tacacs
TACPLUS global auth_type login
TACPLUS global timeout 5 (default)
TACPLUS global passkey testing123

TACPLUS_SERVER address 10.250.0.102
               priority 1
               tcp_port 49

admin@vlab-01:~$ echo '
> {
>     "TACPLUS": {"global": { "auth_type": "login", "passkey": "12345" } }
> }' > /etc/sonic/old_config/tacacs.json
admin@vlab-01:~$ cat /etc/sonic/old_config/tacacs.json

{
    "TACPLUS": {"global": { "auth_type": "login", "passkey": "12345" } }
}

// then reboot device and wait for 6 minutes, because the TACACS config service will delay 5min 30sec after reboot:
https://github.com/sonic-net/sonic-buildimage/blob/master/files/build_templates/tacacs-config.timer

admin@vlab-01:~$ ls /etc/sonic/old_config/tacacs.json
ls: cannot access '/etc/sonic/old_config/tacacs.json': No such file or directory
admin@vlab-01:~$ show tacacs
TACPLUS global auth_type login
TACPLUS global timeout 5 (default)
TACPLUS global passkey 12345

TACPLUS_SERVER address 10.250.0.102
               priority 1
               tcp_port 49

#### Description for the changelog
Fix TACACS config revert to old config when device reboot issue.
DavidZagury pushed a commit that referenced this pull request Mar 9, 2026
…net#25643)

* [build] Add build timing report and dependency analysis tools

Add three scripts for build performance instrumentation:

- scripts/build-timing-report.sh: Parse per-package timing from build
  logs (HEADER/FOOTER timestamps), generate sorted duration table,
  phase breakdown, parallelism timeline, and CSV export.

- scripts/build-dep-graph.py: Parse rules/*.mk dependency graph,
  compute critical path, fan-out/fan-in bottleneck analysis, and
  generate DOT/JSON output for visualization.

- scripts/build-resource-monitor.sh: Sample CPU, memory, disk I/O,
  and Docker container count during builds for resource utilization
  analysis.

Add "make build-report" target to slave.mk that runs the timing
report and dependency analysis after a build completes.

Example output from a VS build on 24-core/30GB machine:
- 210 packages built in 53m wall time (173m CPU)
- Max concurrency: 5 (with SONIC_CONFIG_BUILD_JOBS=4)
- Critical path: 14 packages deep (libnl -> libswsscommon -> utilities)
- Top bottleneck: LIBSWSSCOMMON with 48 downstream dependents

Signed-off-by: Rustiqly <[email protected]>

* Address Copilot review: fix 17 bugs in build analysis scripts

- Use free -m with division instead of free -g to avoid rounding (#1)
- Add = and ?= to Makefile dependency regex patterns (#2, #7)
- CPU calculation now uses /proc/stat delta (two reads) (#3, #14)
- Fix misleading 'critical path estimate' comment (#4)
- Fix parallelism timeline comment (60s not 10s) (#5)
- Include after-relationship packages in fan stats (#6)
- Guard disk I/O division by zero when INTERVAL<=1 (#8)
- Remove unused elapsed_line variable (#9)
- Remove redundant LIBSWSSCOMMON_DBG check (#10)
- Remove active_make_jobs from CSV header comment (#11)
- Wire up _RDEPENDS parsing to build reverse deps (#12)
- Remove unnecessary 'if v' filter on rdeps JSON (#13)
- Remove unused REPORT_FORMAT parameter (#15)
- Add cycle detection to critical path algorithm (#16)
- Add execute permission check for companion scripts (#17)

Signed-off-by: Rustiqly <[email protected]>

---------

Signed-off-by: Rustiqly <[email protected]>
Co-authored-by: Rustiqly <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants