
[libyang bug] workaround leaf-list via uses bug in BGP route-map #21078

Closed
bradh352 wants to merge 1 commit into sonic-net:master from bradh352:libyang-bug-workaround

Conversation

@bradh352
Collaborator

@bradh352 bradh352 commented Dec 6, 2024

Why I did it

There is a bug in libyang when attempting to validate the BGP_PEER_GROUP_AF table, and likely BGP_NEIGHBOR_AF as well, with respect to route_map_in and route_map_out.

The issue is with the use of leaf-list, specifically when it is pulled in via a uses clause for bgpcmn:sonic-bgp-cmn-af.

The error message resembles the one produced when a child is specified that isn't recognized at all:

All Keys are not parsed in BGP_PEER_GROUP_AF
dict_keys(['default|PEERS|ipv4_unicast'])
exceptionList:["'route_map_in'"]
Work item tracking

How I did it

Moving the leaf-list into the parent, rather than importing it through the uses clause, works around this issue.

Again, this is a bug in libyang itself: the same block from sonic-bgp-cmn-af was simply moved into the parents, and that alone fixes the issue.

How to verify it

Relevant config section that triggers the issue (probably more than strictly necessary, but nonetheless effective):

{
    "DEVICE_METADATA": {
        "localhost": {
           {# ... #}
            "docker_routing_config_mode": "unified",
            "frr_mgmt_framework_config": "true"
           {# ... #}
        }
    },
    "BGP_GLOBALS": {
        "default": {
            "load_balance_mp_relax": "true",
            "local_asn": "4210000001",
            "log_nbr_state_changes": "true",
            "router_id": "172.16.0.1"
        }
    },
    "BGP_GLOBALS_AF": {
        "default|ipv4_unicast": {
            "max_ebgp_paths": "2"
        },
        "default|ipv6_unicast": {
            "max_ebgp_paths": "2"
        },
        "default|l2vpn_evpn": {
            "advertise-all-vni": "true"
        }
    },
    "BGP_NEIGHBOR": {
        "default|Ethernet72": {
            "peer_group_name": "PEERS"
        }
    },
    "BGP_PEER_GROUP": {
        "default|PEERS": {
            "bfd": "true",
            "capability_ext_nexthop": "true",
            "ebgp_multihop": "true",
            "holdtime": "9",
            "keepalive": "3",
            "min_adv_interval": "5",
            "peer_type": "external"
        }
    },
    "BGP_PEER_GROUP_AF": {
        "default|PEERS|ipv4_unicast": {
            "admin_status": "up",
            "route_map_in": [
                "ALLOW"
            ],
            "route_map_out": [
                "ALLOW"
            ]
        },
        "default|PEERS|ipv6_unicast": {
            "admin_status": "up",
            "route_map_in": [
                "ALLOW"
            ],
            "route_map_out": [
                "ALLOW"
            ]
        },
        "default|PEERS|l2vpn_evpn": {
            "admin_status": "up",
            "route_map_in": [
                "ALLOW"
            ],
            "route_map_out": [
                "ALLOW"
            ],
            "unchanged_nexthop": "true"
        }
    },
    "ROUTE_MAP": {
        "ALLOW|1": {
            "route_operation": "permit"
        }
    },
    "ROUTE_MAP_SET": {
        "ALLOW": {}
    }
}
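
In this config, a leaf-list shows up as a JSON array (for example `route_map_in`). As a rough aid for narrowing the reproduction, the sketch below lists every array-valued field in a config_db-style dictionary. The helper name `list_valued_fields` and the trimmed `SAMPLE` are illustrative assumptions, not part of SONiC.

```python
import json


def list_valued_fields(config):
    """Return (table, key, field) for every field whose value is a JSON array."""
    hits = []
    for table, entries in config.items():
        for key, fields in entries.items():
            for field, value in fields.items():
                if isinstance(value, list):
                    hits.append((table, key, field))
    return hits


# Trimmed copy of the config above, kept small for illustration.
SAMPLE = json.loads("""
{
    "BGP_PEER_GROUP_AF": {
        "default|PEERS|ipv4_unicast": {
            "admin_status": "up",
            "route_map_in": ["ALLOW"],
            "route_map_out": ["ALLOW"]
        }
    },
    "ROUTE_MAP": {
        "ALLOW|1": {"route_operation": "permit"}
    }
}
""")

for hit in list_valued_fields(SAMPLE):
    print(hit)
```

Only the two route-map fields are reported for this sample, which matches the fields named in the error.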

Which release branch to backport (provide reason below if selected)

  • 202411

Tested branch (Please provide the tested image version)

master as of 20241206

Description for the changelog

[libyang bug] workaround leaf-list via uses bug in BGP route-map

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Signed-off-by: Brad House (@bradh352)

@bradh352 bradh352 requested a review from qiluo-msft as a code owner December 6, 2024 22:23
@bradh352 bradh352 force-pushed the libyang-bug-workaround branch from 9369531 to 1b3faeb Compare December 11, 2024 21:58
@bradh352
Collaborator Author

rebased to force rebuild to see if general sonic CI tests are working yet

@bradh352
Collaborator Author

@qiluo-msft @lguohan please review

@bradh352 bradh352 force-pushed the libyang-bug-workaround branch from 1b3faeb to 806a745 Compare December 15, 2024 19:43
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bradh352
Collaborator Author

rebased to try to force rebuild since last build hung

@bradh352 bradh352 force-pushed the libyang-bug-workaround branch from 806a745 to aca9109 Compare December 19, 2024 12:40
@bradh352
Collaborator Author

rebased again since last build failed in unrelated test

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@lguohan lguohan added the YANG YANG model related changes label Dec 19, 2024
bluecmd pushed a commit to kamelnetworks/sonic-buildimage that referenced this pull request Dec 24, 2024
…et#21078)

There is a bug in libyang when attempting to use table `BGP_PEER_GROUP_AF`
and likely `BGP_NEIGHBOR_AF` in respect to `route_map_in` and `route_map_out`.

The issue is with the use of `leaf-list` specifically when it is pulled in
via a `uses` clause `bgpcmn:sonic-bgp-cmn-af`.

The error message resembles that of if a child is specified that isn't
recognized at all:
```
All Keys are not parsed in BGP_PEER_GROUP_AF
dict_keys(['default|PEERS|ipv4_unicast'])
exceptionList:["'route_map_in'"]
```

Moving the leaf-list to the parent rather than being imported
through the `uses` clause works around this issue.

Signed-off-by: Brad House (@bradh352)
bradh352 added a commit to bradh352/sonic-buildimage that referenced this pull request Dec 24, 2024
bluecmd pushed a commit to kamelnetworks/sonic-buildimage that referenced this pull request Dec 28, 2024
bradh352 added a commit to bradh352/sonic-buildimage that referenced this pull request Jan 17, 2025
@ganglyu
Contributor

ganglyu commented Feb 5, 2025

would you please add unit test?

@bradh352
Collaborator Author

bradh352 commented Feb 6, 2025

@ganglyu

> would you please add unit test?

I'll come back to this issue once I know if there's any appetite to fix this the right way by upgrading to libyang3 https://lists.sonicfoundation.dev/g/sonic-dev/message/938

I'm offering to do the heavy lifting on it, but if I can't get guarantees on it ever actually getting reviewed and merged (in a timely manner), then I may need to drop it.

@ganglyu
Contributor

ganglyu commented Feb 6, 2025

@bradh352

I don't think we can upgrade libyang. SONiC relies on a feature that requires backlinks from libyang, which were removed in versions after 1.0.73

@bradh352
Collaborator Author

bradh352 commented Feb 7, 2025

@ganglyu

> I don't think we can upgrade libyang. SONiC relies on a feature that requires backlinks from libyang, which were removed in versions after 1.0.73

I agree that I do not see a replacement for what the leaf backlinks member previously provided. It may be there of course, but not really documented. From what I can tell, we can just use lysc_module_dfs_full() to scan the entire tree, use lysc_node_lref_target() and lysc_path() to generate the paths and make our own backlink structure and build a cache.... or still use that function but only store results for our desired base path.

How fast would this be? Not sure, but I can't imagine it being that bad just walking the entire tree once ... granted this would be happening in Python not C code. Also supposedly libyang2/3 are much faster than libyang1, so we may see a performance increase still even if we have to add some additional overhead.
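
The single-pass idea above can be sketched in plain Python. This is only an illustration under stated assumptions: the nested-dict schema is a hypothetical stand-in for a compiled schema tree, the field names (`path`, `leafref`, `children`) and the `sonic-example` module are made up, and nothing here calls libyang or libyang-python.

```python
from collections import defaultdict

# Hypothetical toy schema tree; not a libyang structure.
ROOT = "/sonic-example:sonic-example"
SCHEMA = {
    "path": ROOT,
    "children": [
        {"path": ROOT + "/ROUTE_MAP/name"},
        {
            "path": ROOT + "/BGP_PEER_GROUP_AF",
            "children": [
                {"path": ROOT + "/BGP_PEER_GROUP_AF/route_map_in",
                 "leafref": ROOT + "/ROUTE_MAP/name"},
                {"path": ROOT + "/BGP_PEER_GROUP_AF/route_map_out",
                 "leafref": ROOT + "/ROUTE_MAP/name"},
            ],
        },
    ],
}


def build_backlinks(root, base_path=None):
    """Walk the tree once and map each leafref target to the nodes referencing it.

    If base_path is given, only targets under that path are stored.
    """
    backlinks = defaultdict(list)

    def visit(node):
        target = node.get("leafref")
        if target is not None and (base_path is None or target.startswith(base_path)):
            backlinks[target].append(node["path"])
        for child in node.get("children", []):
            visit(child)

    visit(root)
    return dict(backlinks)


# Build the cache once, then answer "who references this node?" by lookup.
CACHE = build_backlinks(SCHEMA)
print(CACHE[ROOT + "/ROUTE_MAP/name"])
```

The cost is one visit per schema node up front, after which each backlink query is a dictionary lookup. Passing `base_path` corresponds to the second variant mentioned above, where only results for a desired base path are stored.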

Contributor

@wen587 wen587 left a comment

lgtm

@ganglyu
Contributor

ganglyu commented Feb 7, 2025

@bradh352
You are welcome to explore and contribute.
LinkedIn tried this for several months but stopped. Their PR might still be useful for you:
#9545

@bradh352
Collaborator Author

bradh352 commented Feb 7, 2025

@ganglyu

> @bradh352 You are welcome to explore and contribute. LinkedIn tried this for several months but stopped. Their PR might still be useful for you: #9545

Thanks for the link, I hadn't seen that ticket.

Months?? That seems ... strange. Granted, they were trying to do it with libyang1 so that could have been their roadblock.

I will have to send some patches to libyang-python upstream to support calling into the library functions to accomplish this, but they accepted my memory leak patch pretty quickly already so I'm not foreseeing any issues.

Is there a team I need to be in communication with to review my PRs when ready? I sent an email to the sonic-dev mailing list about this project and it was met with silence.

@ganglyu
Contributor

ganglyu commented Feb 7, 2025

> @ganglyu
>
> Thanks for the link, I hadn't seen that ticket.
>
> Months?? That seems ... strange. Granted, they were trying to do it with libyang1 so that could have been their roadblock.
>
> I will have to send some patches to libyang-python upstream to support calling into the library functions to accomplish this, but they accepted my memory leak patch pretty quickly already so I'm not foreseeing any issues.
>
> Is there a team I need to be in communication with to review my PRs when ready? I sent an email to the sonic-dev mailing list about this project and it was met with silence.

You can request Qiluo and me, and we will add the necessary reviewers.

@bradh352
Collaborator Author

bradh352 commented Feb 7, 2025

@ganglyu

> request Qiluo and me, and we will add the necessary reviewers.

I'll take you up on that. You should have a standalone PR today, which is step 1: pulling libyang3 and libyang3-py3 into sonic-buildimage. I'm working on splitting those commits out into a separate branch in my fork, since that migration alone is fairly sizeable and keeping it out of the same PR makes the other changes easier to review.

@bradh352
Collaborator Author

bradh352 commented Feb 8, 2025

@ganglyu please review #21679 which is the first in a set of PRs for what I was mentioning :)

@bradh352
Collaborator Author

bradh352 commented Mar 3, 2025

@ganglyu interestingly, this bug is still present after upgrading to libyang3.

After further research, it appears this is due to a bug in sonic-yang-mgmt. It tries to translate the config_db.json format into a valid yang data format, and in doing so it uses the raw parsed yang schema to determine how the data needs to be formatted. Because that is a raw format (rather than the compiled schema), it has to merge things like 'uses' clauses itself, and it only merges leaf nodes and not leaf-list nodes, hence the issue.

It shouldn't be a big deal to fix this oversight, but honestly the overall approach it takes probably needs to be rethought at some point.
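
The oversight described above can be illustrated with a toy model. This is a minimal sketch, not the actual sonic-yang-mgmt code: the dictionary layout for the raw schema, the `vrf_name` leaf, the placement of `admin_status` in the grouping, and the helper names are all assumptions, and nested `uses` are not handled.

```python
# Hypothetical, simplified shape of a raw (uncompiled) parsed schema.
GROUPINGS = {
    "sonic-bgp-cmn-af": {
        "leaf": [{"name": "admin_status"}],
        "leaf-list": [{"name": "route_map_in"}, {"name": "route_map_out"}],
    }
}

PEER_GROUP_AF_NODE = {
    "leaf": [{"name": "vrf_name"}],
    "uses": [{"name": "bgpcmn:sonic-bgp-cmn-af"}],
}


def known_children(node, groupings, merge_kinds):
    """Collect child names: direct leaf/leaf-list children plus the
    statement kinds in merge_kinds pulled from each used grouping."""
    names = {c["name"] for kind in ("leaf", "leaf-list") for c in node.get(kind, [])}
    for use in node.get("uses", []):
        grouping = groupings[use["name"].split(":")[-1]]  # drop the module prefix
        for kind in merge_kinds:
            names |= {c["name"] for c in grouping.get(kind, [])}
    return names


def unparsed_fields(entry, node, groupings, merge_kinds):
    """Return the config fields that the schema walk does not recognize."""
    return sorted(set(entry) - known_children(node, groupings, merge_kinds))


ENTRY = {"admin_status": "up", "route_map_in": ["ALLOW"], "route_map_out": ["ALLOW"]}

# Merging only leaf nodes leaves the route-map fields unrecognized.
LEAF_ONLY = unparsed_fields(ENTRY, PEER_GROUP_AF_NODE, GROUPINGS, ("leaf",))
# Merging leaf-list nodes as well recognizes every field.
FIXED = unparsed_fields(ENTRY, PEER_GROUP_AF_NODE, GROUPINGS, ("leaf", "leaf-list"))
print(LEAF_ONLY, FIXED)
```

In this model, a leaf-list declared directly on the parent is always seen, which is consistent with the workaround in this PR having an effect even though the schema content is unchanged.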

sonic-mgmt-common also does this conversion, but in a different way, and in Golang. As of the pending libyang3 PRs, at least that is occurring on the compiled schema format so it doesn't have to do things like uses clause evaluation.

Considering how complex this process is, it would probably be beneficial to have this conversion done in one shared place. That said, since one implementation is in Python and the other is in Golang, that is easier said than done.

@ganglyu
Contributor

ganglyu commented Mar 3, 2025

@bradh352
I'm not familiar with sonic-yang-mgmt, could you please discuss this issue in the UMF work group meeting?

@bradh352
Collaborator Author

bradh352 commented Mar 5, 2025

Closing, proper fix in #21907. ganglyu, you previously approved this PR; I'm going to close it since an actual fix is now available. Please review that PR and approve.


Labels

YANG YANG model related changes

6 participants