-
Notifications
You must be signed in to change notification settings - Fork 2.1k
sys: provide over-the-air update mechanism #8902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I'm not sure if "secure" is the right term based on the current implementation. The proposal just verifies the integrity of the received image. Verification of integrity is only one aspect in the context of security. Why should the RIOT node trust the public key? The current proposal does not allow for downgrade of a firmware image. Is this useful? |
|
Did you look at other implementations and proposals, such as the standardization work being done by the IETF SUIT WG (https://datatracker.ietf.org/group/suit/about/)? |
Apart from the integrity check, through checking the signature, updates are authenticated, as only the owner of the public key can issue a valid update request. There are many threat vectors that are not covered by this simple mechanism, but IMO it provides enough security to actually be usable.
Why should any node trust any key?
That is a feature that can be added later, if needed. |
Yes, I've been watching both the MCUboot development and SUIT's draft. |
Also, downgrading is always possible by sending a newly signed update with an old version but higher image version. |
Do you want to say that only the owner of the private key that belongs to the public key can issue a valid signature? Still, as long as I cannot verify that the public key belongs to the right entity, the signature is useless.
You mean lines 81-89 in sys/include/firmware.h discuss the covered threat vectors? IMO I would not consider this as usable security.
Why do you use signatures?
No, the questions was whether this limitation leads to problems when deployed. |
The term "version" seems semantically overloaded: version of firmware and some kind of signing version (which is more or less a timestampt). Anyhow, your approach does not allow for the following scenario: A user gets a valid firmware image from a third party. The user updates the IoT device with a new version and then decides that he liked the previous version more. Downgrading to the previous version is not possible without interacting with the third party. |
|
Btw, looking into the description of the commits, I'm wondering why most of them are submitted under the PR caption "sys: provide secure over-the-air update mechanism". Many of the commits are independent of OTA, e.g., "unittests/tests-nanocoap: add coap_get_uri()" test or "nanocoap: add server-side block1 support", Just wondering. |
The approach considers the owner of the private key to be the user, and that user can thus downgrade. Downgrading by just making the bootloader boot the previous image is currently not suported, but as said, easily added (one flash page erase call). @waehlisch I think there's a misunderstanding here. This PR does not try to present a 100% secure firmware upgrade mechanism that fits every use case any hypothetical concept can. Many aspects of remote upgrading, e.g., policies on when to initiate a necessary reboot or multiple stakeholders, have been left out, in order to reduce complexity, and also to get something usable in code, and gain experience in actually implementing over-the-air upgrades. To get where this PR is now already taught a couple of lessons that any high-level discussions just ignore. So instead of bashing what the mechanism presented here cannot do, I'd rather concentrate on what it can do. I've removed the security claims from the PR title and description and will adapt the documentation for now. Better to provide certain security features and not claim to do so than claim them and not deliver on expectations. |
As stated, this PR depends on two CoAP PR's for blockwise transfer support. The commit list shows those commits, too. |
|
@waehlisch @kaspar030 let's not get over-excited about the term "security" here.
Long story short: this base clearly has value. Let's work together on improving and extending it (either in this PR or in subsequent PRs). |
|
@kaspar030 it is not about bashing, the discussion is about getting things clear.
As far as I understand, anyone can update the software on the IoT device because there is no authentification of the client who sends the update. The only thing that is needed is a valid firmware image (i.e., higher version number and signed by the private key that relates to the public key stored on the node/previous version) and connectivity. |
You mean, this would be solved by using DTLS? |
The way COSE+SUIT solve this is by requiring a correctly signed manifest to ensure that the manifest is generated by an authorized party and to require a manifest version number to be strictly monotonic to prevent replay attacks with an old manifest by an attacker. From what I read from the comments in the code, an identical construction is also in place here. |
| @@ -1,5 +1,6 @@ | |||
| /* | |||
| * Copyright (C) 2017 Inria | |||
| * Copyright (C) 2018 Kaspar Schleiser <[email protected]> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the original author is @kYc0o, I would leave the 2017 Inria copyright on top and put your personal under. I saw that in other places.
aabadie
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A first pass a code review. I skipped the nanocoap parts. Finally, I have quite a few comments.
| { | ||
| rom (rx) : ORIGIN = _rom_start_addr + _boot_offset, LENGTH = _rom_length - _boot_offset | ||
| ram (rwx) : ORIGIN = _ram_start_addr, LENGTH = _ram_length | ||
| rom (rx) : ORIGIN = _rom_start_addr + _rom_offset, LENGTH = _rom_length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing the _boot_offset (_rom_offset now) in the ROM length ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will break current support for pre-flashed bootloaders, you might end flashing a larger image than the space available. @kaspar030 what's your use case here?
| @@ -0,0 +1,61 @@ | |||
| /* | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about simply calling the parent directory bootloader ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you suggesting to change the name "riotboot"? there might be other bootloaders coming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think the bootloader should be at the root of RIOT. Even if other bootlaoders come, we'll have a very hard time to integrate them into the base code, at best, they'll come as packages. Any strong reason/motivation to keep it in dist? It's not even possible to compile it/use it outside RIOT...
| firmware_jump_to_slot(slot); | ||
| } | ||
|
|
||
| /* serious trouble! */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope noone ever finds out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If all the conditions to boot an image are not met, the bootloader boots nothing, which indeed, is serious trouble...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification, that I think should replace the 'serious trouble' comment for clarity. But then, in this case, why is the endless while loop required ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is intended to provide a defined error behaviour.
| extern "C" { | ||
| #endif | ||
|
|
||
| off_t fsize(const char *filename); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undocumented functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
internal functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kaspar030 you might however prefix with _ isn't it? Though would look strange in a header... I'd say if it's internal we might change its place. Isn't doxygen complaining btw?
dist/tools/firmware/genkeys.c
Outdated
| return 1; | ||
| } | ||
|
|
||
| crypto_sign_keypair(pk,sk); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing space after ,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
| */ | ||
| void firmware_dump_slot_addrs(void); | ||
|
|
||
| extern const unsigned firmware_num_slots; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not documented (I'll check later what is it for)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
sys/net/application_layer/ota/coap.c
Outdated
| * @{ | ||
| * | ||
| * @file | ||
| * @brief Firmware update cia CoAP implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/cia/via
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sys/net/application_layer/ota/coap.c
Outdated
| coap_block1_t block1; | ||
| int blockwise = coap_get_block1(pkt, &block1); | ||
|
|
||
| LOG_INFO("ota: received bytes %u-%u", \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\ is not needed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
sys/net/application_layer/ota/coap.c
Outdated
| } | ||
| else { | ||
| if (block1.offset == _state.offset) { | ||
| res = firmware_update_putbytes(&_state, block1.offset, pkt->payload, pkt->payload_len); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is a bit long and could be split
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
sys/firmware/firmware.c
Outdated
| CPU_FLASH_BASE + SLOT0_SIZE + SLOT1_SIZE | ||
| }; | ||
|
|
||
| const unsigned firmware_num_slots = sizeof(_firmware_slot_start)/sizeof(unsigned); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, it's here. Do you really need to retrieve (using extern) it in sys/firmware.h ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, riotboot needs it in its main().
c99c070 to
99468d8
Compare
examples/ota/README.md
Outdated
|
|
||
| - flash image and bootloader | ||
|
|
||
| $ "BOARD=samr21-xpro APP_VER=$(data +%s) make -j4 riotboot/flash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
date ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, fixed
99468d8 to
5f79ae7
Compare
sys/include/firmware.h
Outdated
| * @param[in] pk NaCL public signing key to use | ||
| * | ||
| */ | ||
| int firmware_validate_metadata_signature(firmware_metadata_t *metadata, const unsigned char *pk); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to know what the return value means but unfortunately it's not documented.
- add comment to clarify firmware_current_slot() 1 loop start - use metadata->start_addr in firmware_get_image_startaddr()
558ccb6 to
714ba81
Compare
|
Hi all! |
|
@danpetry Oh cool, glad it's not abandoned. So one can just take an element from the diagram and try to make a PR out of it? |
|
@OYTIS some PRs are already being prepared, and I'm not sure what the state is of all work packages. I recommend introducing yourself on the issue and coordinating with the others - and maybe proposing something that you can help with? |
|
For the reference, I have been playing with this last night to get this working on an EFM32 (SLSTK3402a and SLSTK3401a). It did not work out of the box, but here are the changes I made. I disabled signatures since they don't work, and I was interested in the general principle, because I need it. Some general things I noticed:
|
|
Hi @basilfx ! Thanks for testing it! In general, the discussion I think needs to take place now in #9342, since I think this PR got a bit outdated. Otherwise, I have some comments:
This is actually not very good, and means that the clocking initialisation should be verified. I had similar issues at the very beginning of the implementation of the bootloader, which showed me that clocking must be carefully configured to allow re-initialisations. It's not very wanted to "#ifdef" parts in
@cladmi is working on that and #9514 and #8838 should help to get several files which are then passed to the flasher. We should then adapt JLink and openocd to flash them at the right place on the ROM. #9351 is also necessary to ease the task.
I agree with you on this, and I think it will be done in this way to only replace the needed slot. |
Thanks for your feedback. When integrating it, to make it testable, I would maybe represent it as a
So for JLINK you would need to do a "flash a big image with bootloader, metatada, firmware" instead of flashing 2 times ? |
So you mean that there is
No, JLink doesn't need one big image file. It can, however, flash multiple images in one 'session'. Currently, multiple 'sessions' are needed, and this sometimes fails, because JLink isn't ready for the next 'session'. |
I was more meaning that a CPU defines If it could be split between
That would maybe mean a specific target for If there is a way to know when JLink is ready again, we could add an option to the flasher that waits until it puts the board in a stable state. I would also love this for |
|
After having slept on it, and also re-check your code, I would say what you did is ok for Correct me if I am wrong @kYc0o, what we did not want is However, having a I am just not sure for the |
|
TL;DR; for me it's ok to constraint the bootloader, if there is no issues with rebooting to do an update. |
Well, this is the issue. Why is not possible to re-init the clock as many times as we want? And, why is it possible to do for Atmel, ST and Nordic Cortex-Ms? The goal is to have a system which is not affected by its "previous" state when executing jumps to the first member of the vector table (
I think this would bring much more complexity and I'm convinced there should be other ways to do it. The case of EFM might be a bit special since it uses a lot of code from the vendor, which might not be optimal. For the rest of the CPUs we provide most of the code, especially the clock_init, which is the problem here. So maybe we need to do an exception here but I'd prefer to not make a painful distinction between bootloader and applications, both IMHO should safely initialise the CPU.
Normal reboot is not affected, it makes a full reset of registers and clocks. |
I definitely think it is fixable. I haven't spend more than a few minutes to discover the workaround to disable it worked :-) That said, I do think it a minimal case would be desired: why spend a lot of time (and energy) in the bootloader to setup the CPU clocks, when you want to boot a slot as fast as possible? When the EFM32 boots, it is in a mode that works. I would consider it an optimization. |
|
@kaspar030 closing this in favor of #11818 |
Contribution description
This is a shot at providing a
secureauthenticated over-the-air firmware update mechanism based on tweetnacl and CoAP.Please see the included documentation in "sys/include/firmware.h" for more details including a description of the architecture.
This PR includes work from @kYc0o.
Currently only configured for and tested on samr21-xpro.
Issues/PRs references
Waiting for
#8772and#8788for the necessary CoAP infrastructure.Supersedes #7389, #7396, #7457.