
Feature/connector #277

Merged
jbaublitz merged 2 commits into jbaublitz:main from Bben01:feature/connector
Jun 27, 2025

Conversation

@Bben01
Contributor

@Bben01 Bben01 commented May 29, 2025

Close #275

So, a very early stage of the PR, but a working connector.

A few things to notice:

  1. I commented out the payload parsing because I currently use the Nlmsghdr to get the response from the socket, but with the connector protocol a valid response comes with an nl_type of Done, so the payload parser treated it as an error.
  2. In CnMsg, I flattened the struct cb_id from the kernel, but I don't really know whether I should do that or not:
struct cb_id {
	__u32 idx;
	__u32 val;
};

struct cn_msg {
	struct cb_id id;

	__u32 seq;
	__u32 ack;

	__u16 len;		/* Length of the following data */
	__u16 flags;
	__u8 data[];
};

To:

#[derive(
    Builder, Getters, Clone, Debug, PartialEq, Eq, Size, ToBytes, FromBytesWithInput, Header,
)]
#[neli(from_bytes_bound = "P: Size + FromBytesWithInput<Input = usize>")]
#[builder(build_fn(skip))]
#[builder(pattern = "owned")]
pub struct CnMsg<P> {
    /// Index of the connector (idx)
    #[getset(get = "pub")]
    idx: u32,
    /// Value (val)
    #[getset(get = "pub")]
    val: u32,
    /// Sequence number
    #[getset(get = "pub")]
    seq: u32,
    /// Acknowledgement number
    #[getset(get = "pub")]
    ack: u32,
    /// Length of the payload
    #[builder(setter(skip))]
    #[getset(get = "pub")]
    len: u16,
    /// Flags
    #[getset(get = "pub")]
    flags: u16,
    /// Payload of netlink message
    #[neli(size = "len as usize")]
    #[neli(input = "(len as usize)")]
    #[getset(get = "pub")]
    pub(crate) payload: P,
}
  3. Currently I only support ProcEvents from the connector, but it can actually carry different payloads, based on the idx and val.
  4. I need to check that idx and val are correct; otherwise I need to return an error.
  5. I don't really know if the parsing of the response is good enough or should be refactored, but from what I can understand, it cannot be automatically derived (the C code contains a number + union instead of a more idiomatic Rust enum).
  6. I still need to define all the consts, but I will do it after the PR in libc lands (Add constants from linux/cn_proc.h and linux/connector.h rust-lang/libc#4434).
  7. Better docs + example

@Bben01 Bben01 mentioned this pull request May 29, 2025
@Bben01 Bben01 force-pushed the feature/connector branch 11 times, most recently from b3230d0 to dfbbf3f Compare May 30, 2025 14:03
@jbaublitz
Owner

First of all, thank you so much for your work on this! I really appreciate the time you've put into this.

Close #275

So, a very early stage of the PR, but a working connector.

A few things to notice:

1. I commented the payload parsing because currently I use the `Nlmsghdr` to get the response from the socket, but with the connector protocol, a valid response comes with `nl_type` of Done, so the Payload parsed it as an error

I've recently touched this code and I'll have to review it, but I would expect a payload of Empty if there is an error code of 0.

2. In CnMsg, I flatten the struct `cb_id` from the kernel, but I don't really know if I should do it or not
struct cb_id {
	__u32 idx;
	__u32 val;
};

struct cn_msg {
	struct cb_id id;

	__u32 seq;
	__u32 ack;

	__u16 len;		/* Length of the following data */
	__u16 flags;
	__u8 data[];
};

To:

#[derive(
    Builder, Getters, Clone, Debug, PartialEq, Eq, Size, ToBytes, FromBytesWithInput, Header,
)]
#[neli(from_bytes_bound = "P: Size + FromBytesWithInput<Input = usize>")]
#[builder(build_fn(skip))]
#[builder(pattern = "owned")]
pub struct CnMsg<P> {
    /// Index of the connector (idx)
    #[getset(get = "pub")]
    idx: u32,
    /// Value (val)
    #[getset(get = "pub")]
    val: u32,
    /// Sequence number
    #[getset(get = "pub")]
    seq: u32,
    /// Acknowledgement number
    #[getset(get = "pub")]
    ack: u32,
    /// Length of the payload
    #[builder(setter(skip))]
    #[getset(get = "pub")]
    len: u16,
    /// Flags
    #[getset(get = "pub")]
    flags: u16,
    /// Payload of netlink message
    #[neli(size = "len as usize")]
    #[neli(input = "(len as usize)")]
    #[getset(get = "pub")]
    pub(crate) payload: P,
}

I calculated the alignment and my only concern here would be on architectures other than x86. For x86, this looks reasonable.

3. Currently I only support ProcEvents from the connector, but it can actually have different payloads, based on the idx and val.

4. I need to check that idx and val are correct, otherwise I need to return an error

5. I don't really know if the parsing of the response is good enough or should be refactored, but from what I can understand, it cannot be automatically derived (the C code contains a number + union instead of a more idiomatic enum in Rust)

Can you give me a little bit more detail here? There may still be a way to make this work.

6. I still need to define all the consts, but I will do it after the PR in libc lands ([Add constants from linux/cn_proc.h and linux/connector.h rust-lang/libc#4434](https://github.com/rust-lang/libc/pull/4434))

7. Better docs + example
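
On the alignment question for the flattened CnMsg, one rough way to check a given target is to compare a nested and a flattened `#[repr(C)]` mirror of the fixed-size part of `struct cn_msg`. This is only a sketch: the struct names below are hypothetical, it needs `offset_of!` (Rust 1.77 or later), it only tells you about the target it is compiled for, and it says nothing about endianness or about how the derive macros in this crate actually serialize fields.

```rust
use std::mem::{align_of, offset_of, size_of};

/// Mirror of the kernel's `struct cb_id` (hypothetical Rust name).
#[allow(dead_code)]
#[repr(C)]
struct CbId {
    idx: u32,
    val: u32,
}

/// Fixed-size part of `struct cn_msg` with `cb_id` kept nested.
#[allow(dead_code)]
#[repr(C)]
struct CnMsgNested {
    id: CbId,
    seq: u32,
    ack: u32,
    len: u16,
    flags: u16,
}

/// The same header with `cb_id` flattened into two `u32` fields.
#[allow(dead_code)]
#[repr(C)]
struct CnMsgFlat {
    idx: u32,
    val: u32,
    seq: u32,
    ack: u32,
    len: u16,
    flags: u16,
}

/// Returns true when both variants have the same size, alignment,
/// and field offsets on the current compilation target.
fn layouts_match() -> bool {
    size_of::<CnMsgNested>() == size_of::<CnMsgFlat>()
        && align_of::<CnMsgNested>() == align_of::<CnMsgFlat>()
        && offset_of!(CnMsgNested, seq) == offset_of!(CnMsgFlat, seq)
        && offset_of!(CnMsgNested, ack) == offset_of!(CnMsgFlat, ack)
        && offset_of!(CnMsgNested, len) == offset_of!(CnMsgFlat, len)
        && offset_of!(CnMsgNested, flags) == offset_of!(CnMsgFlat, flags)
}
```

Calling `layouts_match()` in a unit test per target (for example under QEMU) would flag any target where flattening changes the header layout.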

@Bben01
Contributor Author

Bben01 commented May 31, 2025

I've recently touched this code and I'll have to review it, but I would expect a payload of Empty if there is an error code of 0.

I don't understand what you mean. What I am saying is that the Nlmsghdr.nl_type of the response in the connector protocol will have a value of 3 (Nlmsg::Done), so when the payload is parsed, on this line (let ty_const: u16 = input_type.into();), the value will be Nlmsg::Done, and that will then go to the first branch in the next if, so it won't parse the payload correctly.
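
To make the problem concrete, here is a simplified sketch of type-based dispatch. It is not the code in src/nl.rs; `PayloadKind`, `classify_strict` and `classify` are hypothetical names, and only the numeric message types are taken from linux/netlink.h. A strict dispatcher sends every message of type 3 down the control path, which is what goes wrong for connector messages, while a variant that lets the caller say "Done carries a typed payload" would not.

```rust
/// Standard netlink control message types from linux/netlink.h.
const NLMSG_NOOP: u16 = 1;
const NLMSG_ERROR: u16 = 2;
const NLMSG_DONE: u16 = 3;
const NLMSG_OVERRUN: u16 = 4;

/// How a parser would treat the bytes that follow the netlink header.
#[derive(Debug, PartialEq, Eq)]
enum PayloadKind {
    /// Handled as a control message (ack, error, end of dump, ...).
    Control,
    /// Handled as the protocol-specific payload type.
    Typed,
}

/// Hypothetical strict dispatch: every control type takes the control path.
fn classify_strict(nl_type: u16) -> PayloadKind {
    match nl_type {
        NLMSG_NOOP | NLMSG_ERROR | NLMSG_DONE | NLMSG_OVERRUN => PayloadKind::Control,
        _ => PayloadKind::Typed,
    }
}

/// Hypothetical relaxed dispatch: the caller states whether messages of
/// type Done carry a typed payload, as they do for the connector protocol.
fn classify(nl_type: u16, done_carries_payload: bool) -> PayloadKind {
    if nl_type == NLMSG_DONE && done_carries_payload {
        PayloadKind::Typed
    } else {
        classify_strict(nl_type)
    }
}
```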

I calculated the alignment and my only concern here would be on architectures other than x86. For x86, this looks reasonable.

So, should I change it or not?

Can you give me a little bit more detail here? There may still be a way to make this work.

The connector protocol is used as a generic kernel to user space communication, each type has its own payload.

Currently, these are the different types that the kernel supports:

pub const CN_IDX_PROC: c_uint = 0x1;
pub const CN_VAL_PROC: c_uint = 0x1;
pub const CN_IDX_CIFS: c_uint = 0x2;
pub const CN_VAL_CIFS: c_uint = 0x1;
pub const CN_W1_IDX: c_uint = 0x3;
pub const CN_W1_VAL: c_uint = 0x1;
pub const CN_IDX_V86D: c_uint = 0x4;
pub const CN_VAL_V86D_UVESAFB: c_uint = 0x1;
pub const CN_IDX_BB: c_uint = 0x5;
pub const CN_DST_IDX: c_uint = 0x6;
pub const CN_DST_VAL: c_uint = 0x1;
pub const CN_IDX_DM: c_uint = 0x7;
pub const CN_VAL_DM_USERSPACE_LOG: c_uint = 0x1;
pub const CN_IDX_DRBD: c_uint = 0x8;
pub const CN_VAL_DRBD: c_uint = 0x1;
pub const CN_KVP_IDX: c_uint = 0x9;
pub const CN_KVP_VAL: c_uint = 0x1;
pub const CN_VSS_IDX: c_uint = 0xA;
pub const CN_VSS_VAL: c_uint = 0x1;

I've only worked with proc connectors, so I do not know how the others work, but I suspect they won't be too different.

I could maybe have another option that won't parse the payload if the type is unknown, and instead will keep the raw bytes.
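
A rough sketch of that fallback idea, assuming dispatch on (idx, val). The enum and function names are hypothetical, and the proc branch only reads the first two native-endian `u32` fields of `struct proc_event` (`what` and `cpu`) instead of decoding the whole event:

```rust
const CN_IDX_PROC: u32 = 0x1;
const CN_VAL_PROC: u32 = 0x1;

/// Hypothetical payload type with a raw fallback for unknown connectors.
#[derive(Debug, PartialEq, Eq)]
enum CnPayload {
    /// Start of a proc connector event: event kind and CPU number.
    Proc { what: u32, cpu: u32 },
    /// Unknown (idx, val) pair, or a body too short to decode: keep the bytes.
    Raw { idx: u32, val: u32, bytes: Vec<u8> },
}

/// Reads a native-endian u32 at `at`, or returns None if out of bounds.
fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let s = buf.get(at..at + 4)?;
    Some(u32::from_ne_bytes([s[0], s[1], s[2], s[3]]))
}

/// Decodes known connectors and keeps the raw bytes for everything else.
fn decode_payload(idx: u32, val: u32, body: &[u8]) -> CnPayload {
    if (idx, val) == (CN_IDX_PROC, CN_VAL_PROC) {
        if let (Some(what), Some(cpu)) = (read_u32(body, 0), read_u32(body, 4)) {
            return CnPayload::Proc { what, cpu };
        }
    }
    CnPayload::Raw { idx, val, bytes: body.to_vec() }
}
```

With this shape, adding a new connector later only means adding a variant and a branch; callers that already handle `Raw` keep working.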

@jbaublitz
Owner

I've recently touched this code and I'll have to review it, but I would expect a payload of Empty if there is an error code of 0.

I don't understand what you mean. What I am saying is that the Nlmsghdr.nl_type of the response in the connector protocol will have a value of 3 (Nlmsg::Done), so when the payload is parsed, on this line (let ty_const: u16 = input_type.into();), the value will be Nlmsg::Done, and that will then go to the first branch in the next if, so it won't parse the payload correctly.

Ah, now I understand what you mean. Can you outline what payload you expect in the Done packet? I'm happy to change the parsing a bit to be a bit more flexible.

I calculated the alignment and my only concern here would be on architectures other than x86. For x86, this looks reasonable.

So, should I change it or not?

I'm currently asking some colleagues more knowledgeable on architecture than I am about this. In the Fedora space, @decathorpe and @sbrivio-rh opened an issue about s390x so they may have thoughts here.

Can you give me a little bit more detail here? There may still be a way to make this work.

The connector protocol is used as a generic kernel to user space communication, each type has its own payload.

Currently, these are the different types that the kernel supports:

pub const CN_IDX_PROC: c_uint = 0x1;
pub const CN_VAL_PROC: c_uint = 0x1;
pub const CN_IDX_CIFS: c_uint = 0x2;
pub const CN_VAL_CIFS: c_uint = 0x1;
pub const CN_W1_IDX: c_uint = 0x3;
pub const CN_W1_VAL: c_uint = 0x1;
pub const CN_IDX_V86D: c_uint = 0x4;
pub const CN_VAL_V86D_UVESAFB: c_uint = 0x1;
pub const CN_IDX_BB: c_uint = 0x5;
pub const CN_DST_IDX: c_uint = 0x6;
pub const CN_DST_VAL: c_uint = 0x1;
pub const CN_IDX_DM: c_uint = 0x7;
pub const CN_VAL_DM_USERSPACE_LOG: c_uint = 0x1;
pub const CN_IDX_DRBD: c_uint = 0x8;
pub const CN_VAL_DRBD: c_uint = 0x1;
pub const CN_KVP_IDX: c_uint = 0x9;
pub const CN_KVP_VAL: c_uint = 0x1;
pub const CN_VSS_IDX: c_uint = 0xA;
pub const CN_VSS_VAL: c_uint = 0x1;

I've only worked with proc connectors, so I do not know how the others work, but I suspect they won't be too different.

I could maybe have another option that won't parse the payload if the type is unknown, and instead will keep the raw bytes.

A pattern that may be helpful to look at for this is the attributes. I also kept the bytes and allowed parsing after the fact. That seems to have worked pretty well so far. Unfortunately, it's not quite as intuitive as just encoding it in a type parameter, but it's flexible and generally works well.

@Bben01
Contributor Author

Bben01 commented Jun 2, 2025

Can you outline what payload you expect in the Done packet?

You can look at the example I added, in the recv call (CnMsg<ProcEventHeader>)

A pattern that may be helpful to look at for this is the attributes

Could you clarify this point? I think I didn't understand fully what you meant

@jbaublitz
Owner

Can you outline what payload you expect in the Done packet?

You can look at the example I added, in the recv call

Will do.

A pattern that may be helpful to look at for this is the attributes

Could you clarify this point? I think I didn't understand fully what you meant

The generic netlink attributes (Nlattr) and routing attributes (Rtattr) follow a similar pattern to what you're describing here. You may be able to find some inspiration in the handling code for those data structures if you're stuck.
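
As a rough illustration of that "keep the bytes, parse after the fact" pattern, here is a minimal sketch. It is modelled only loosely on how the attribute types behave; `RawPayload`, `FromRaw` and `get_as` are hypothetical names, not the crate's API.

```rust
/// Hypothetical container that stores a payload without decoding it.
struct RawPayload {
    bytes: Vec<u8>,
}

/// Hypothetical trait for types that can be decoded from stored bytes.
trait FromRaw: Sized {
    fn from_raw(bytes: &[u8]) -> Option<Self>;
}

impl RawPayload {
    /// Decodes the stored bytes as `T` on demand; the buffer is left untouched,
    /// so the caller can retry with a different type.
    fn get_as<T: FromRaw>(&self) -> Option<T> {
        T::from_raw(&self.bytes)
    }
}

impl FromRaw for u32 {
    /// Accepts exactly four bytes in native byte order.
    fn from_raw(bytes: &[u8]) -> Option<Self> {
        match bytes {
            [a, b, c, d] => Some(u32::from_ne_bytes([*a, *b, *c, *d])),
            _ => None,
        }
    }
}

impl FromRaw for String {
    /// Accepts UTF-8 text with an optional trailing NUL byte.
    fn from_raw(bytes: &[u8]) -> Option<Self> {
        let trimmed = bytes.strip_suffix(&[0u8]).unwrap_or(bytes);
        std::str::from_utf8(trimmed).ok().map(str::to_owned)
    }
}
```

The trade-off is the one mentioned above: the expected type is chosen at the call site (`payload.get_as::<u32>()`) rather than encoded in a type parameter, which is less intuitive but more flexible.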

@Bben01 Bben01 marked this pull request as ready for review June 3, 2025 17:41
@Bben01
Contributor Author

Bben01 commented Jun 3, 2025

I think I finished everything

About the other protocols: I think the best way to handle them is to let users define the payloads in their own crates. They will then be able to use them with CnMsg by specifying the type they expect to receive.

If someone wants to add another connector to this crate in the future, they will only need to add the payload parsing for that connector (this will be a non-breaking change).

I pointed that out in the module docs
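
For readers who want to see the shape of this, here is a standalone sketch of a connector message that is generic over a payload defined elsewhere. It does not use the crate's derive macros; `ConnectorPayload`, `CnMsgSketch`, `parse_cn_msg` and `Heartbeat` are all hypothetical names, and the header is read in native byte order at the offsets given by `struct cn_msg`.

```rust
/// Hypothetical trait a downstream crate would implement for its own payload.
trait ConnectorPayload: Sized {
    fn parse(body: &[u8]) -> Option<Self>;
}

/// Simplified stand-in for a connector message generic over its payload.
#[allow(dead_code)]
#[derive(Debug, PartialEq, Eq)]
struct CnMsgSketch<P> {
    idx: u32,
    val: u32,
    seq: u32,
    ack: u32,
    flags: u16,
    payload: P,
}

/// Reads a native-endian u32 at `at`, or returns None if out of bounds.
fn read_u32(buf: &[u8], at: usize) -> Option<u32> {
    let s = buf.get(at..at + 4)?;
    Some(u32::from_ne_bytes([s[0], s[1], s[2], s[3]]))
}

/// Reads a native-endian u16 at `at`, or returns None if out of bounds.
fn read_u16(buf: &[u8], at: usize) -> Option<u16> {
    let s = buf.get(at..at + 2)?;
    Some(u16::from_ne_bytes([s[0], s[1]]))
}

/// Parses the 20-byte header, then hands exactly `len` body bytes to `P`.
fn parse_cn_msg<P: ConnectorPayload>(buf: &[u8]) -> Option<CnMsgSketch<P>> {
    let idx = read_u32(buf, 0)?;
    let val = read_u32(buf, 4)?;
    let seq = read_u32(buf, 8)?;
    let ack = read_u32(buf, 12)?;
    let len = read_u16(buf, 16)? as usize;
    let flags = read_u16(buf, 18)?;
    let body = buf.get(20..20 + len)?;
    let payload = P::parse(body)?;
    Some(CnMsgSketch { idx, val, seq, ack, flags, payload })
}

/// Example of a payload type defined outside the library (made up for this sketch).
#[derive(Debug, PartialEq, Eq)]
struct Heartbeat(u8);

impl ConnectorPayload for Heartbeat {
    fn parse(body: &[u8]) -> Option<Self> {
        match body {
            [b] => Some(Heartbeat(*b)),
            _ => None,
        }
    }
}
```

A caller would then write `parse_cn_msg::<Heartbeat>(&buf)`, choosing the payload type at the call site in the same way as `CnMsg<ProcEventHeader>` in the example.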

I added a few static parsing tests. I cannot write any real-world tests because proc connectors are only available in the root PID namespace (so, not in Docker).

About the

Here are the remaining blockers:

  • Wait for the next libc release (the PR that I sent was recently merged and should land in a few weeks), then use the latest crates.io version instead of the git dependency
  • Decide how you want to fix the NlPayload parsing (I reduced the change in src/nl.rs to a few lines, but this makes tests in other modules fail, so we need to find a better solution)
  • The alignment/endianness question
  • Your CR

@Bben01 Bben01 force-pushed the feature/connector branch from 5b6ec02 to bcc9898 Compare June 3, 2025 18:04
@Bben01 Bben01 changed the title Draft: Feature/connector Feature/connector Jun 4, 2025
@jbaublitz jbaublitz added this to the neli-0.7.1 milestone Jun 11, 2025
@jbaublitz jbaublitz self-requested a review June 11, 2025 20:48
@jbaublitz
Owner

Thank you so much for your thorough work on this. I am assigning a target of 0.7.1 to this PR. I will try to take a look soon, but currently I'm working on getting the release out and sorting out #273.

@Bben01
Contributor Author

Bben01 commented Jun 12, 2025

You're welcome

I think you should look at the (small) change to the payload parsing in nl.rs before releasing v0.7.0, because the fix could potentially be breaking.

@jbaublitz
Owner

Do you mean the change in this PR?

@Bben01
Contributor Author

Bben01 commented Jun 12, 2025

Yes

@jbaublitz
Owner

jbaublitz commented Jun 17, 2025

@Bben01 I'm a little bit torn on this change because, in the case of the DONE packet, it will require users to specify exactly which type they expect, which may differ from the payload of the intermediate packets in the case of extended ACKs. I believe this will break calls like recv_all, which require the payload to be the same for each packet received. I would personally prefer a solution that is flexible enough to handle future cases as well. Perhaps I can provide you with an API for reading the payload of the DONE packet as the data structure of your choosing and, for the time being, leave it as an unparsed buffer. That would leave flexibility both for your connector work and for the extended ACK work that I just addressed in #262. What are your thoughts?

@decathorpe

I'm currently asking some colleagues more knowledgeable on architecture than I am about this. In the Fedora space, decathorpe and sbrivio-rh opened an issue about s390x so they may have thoughts here.

Sorry, my experience with s390x is limited to debugging endianness issues in various Fedora packages too :) If you need hardware access (over SSH) to an actual s390x machine for debugging and / or development, there are ways to request that through Fedora infrastructure.

@Bben01
Contributor Author

Bben01 commented Jun 19, 2025

@jbaublitz I've looked into that a bit and saw that the flags will contain NlmF::MULTI when the message is part of a multipart message.

This means that we could potentially use that to parse the payload differently.

Currently we have no way to access the (parsed) flags from the payload, so I resorted to manually parsing them again from the buffer (for the POC; maybe there is a better way of doing it).

I don't know if there is a better solution, but in the end there is a finite number of netlink protocols, so parsing their payloads should always be possible from the payload code directly. That said, I am not knowledgeable enough on this topic.
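
For reference, re-reading the flags from the raw buffer can be sketched like this. It is a standalone illustration rather than the POC code; `is_multipart` is a hypothetical helper, and the offset follows the `struct nlmsghdr` layout (a `u32` length, a `u16` type, then the `u16` flags), read in native byte order.

```rust
/// `NLM_F_MULTI` from linux/netlink.h: the message is part of a multipart reply.
const NLM_F_MULTI: u16 = 0x2;

/// Byte offset of `nlmsg_flags` inside `struct nlmsghdr`
/// (after the u32 length and the u16 type).
const FLAGS_OFFSET: usize = 6;

/// Returns Some(true) if the header at the start of `buf` has NLM_F_MULTI set,
/// Some(false) if it does not, and None if `buf` is too short to hold the flags.
fn is_multipart(buf: &[u8]) -> Option<bool> {
    let raw = buf.get(FLAGS_OFFSET..FLAGS_OFFSET + 2)?;
    let flags = u16::from_ne_bytes([raw[0], raw[1]]);
    Some((flags & NLM_F_MULTI) != 0)
}
```

A Done message without this flag would then be a candidate for typed payload parsing, while a Done message that terminates a multipart dump would keep the existing behaviour.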

@jbaublitz
Owner

Fedora

@decathorpe Thanks for the offer. Currently QEMU s390x seems to be working better than I was expecting. Very slow but doable for what I need it for.

@jbaublitz
Owner

@Bben01 Overall, I am happy with the code, but I will provide a more detailed code review after I release 0.7.0. I tested your example code and the example code for DUMP directives and, while I will probably implement it differently in the 0.7.0 version, I see why you made the choice you did, and I will try to preserve it in a way that doesn't require reparsing. I believe this format is slightly different due to the subscription model. That seems to be why each packet is of type DONE and contains a payload of the expected type. Unfortunately, what I don't know is how this maps to other subscription-like netlink subsystems. I'm going to do some reading and see if I can find a similar protocol, but honestly I would have expected an API like this to be exposed as a multicast group. That is the typical (and generally blessed) way of subscribing to a notification stream. I do support that in this library, but unfortunately that does not seem to be the way it's implemented for this protocol in the kernel.

Owner

@jbaublitz jbaublitz left a comment


I'm ready to accept the code. Can you just clean up the git history into preferably one or a few commits with a linear history based on main? I won't merge anything else until I merge this.

@Bben01 Bben01 force-pushed the feature/connector branch from aaa5e52 to 35d5b4b Compare June 27, 2025 05:40
@Bben01
Contributor Author

Bben01 commented Jun 27, 2025

Thanks, I cleaned up the git history, and fixed the new clippy lints (two separate commits)

(Also, if possible, could you release a v0.7.1 when the PR is merged? I would like to use it in my crate)

@jbaublitz
Owner

Thanks so much for this contribution! I believe the clippy errors are unrelated to your code, so I'm going to merge.

@jbaublitz jbaublitz merged commit 7d87ef7 into jbaublitz:main Jun 27, 2025
45 of 47 checks passed

Development

Successfully merging this pull request may close these issues.

Connector protocol

3 participants