auto-dispose, reconnects, readcdr, static analysis, refactoring #512
Merged
Conversation
This was referenced May 13, 2020
Thanks @MarcelJordense
* Add a flag to indicate signed integral values and one to indicate floating-point values
* Set these flags in the output of idlc
* Use them when printing sample contents to the trace

By encoding the information as flags in reserved bits the actual serialization and deserialization is unaffected.

Signed-off-by: Erik Boasson <[email protected]>
Scanning all instances was never good for anything: the RHC is organised as hash table on instance id (which is an alias for "instance handle") and it was always designed to do this with a fast lookup. Signed-off-by: Erik Boasson <[email protected]>
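The lookup the commit restores can be pictured with a minimal chained hash table keyed on the instance id. This is only an illustrative sketch — the real RHC uses its own (concurrent) hash table, and all names here are invented:

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal fixed-size hash table keyed on an instance handle, illustrating
   the fast lookup the commit restores (not the actual RHC data structure). */
#define NBUCKETS 64   /* power of two so masking works as a cheap modulo */

typedef struct node { uint64_t iid; void *inst; struct node *next; } node_t;
typedef struct table { node_t *bucket[NBUCKETS]; } table_t;

static size_t hash_iid (uint64_t iid)
{
  return (size_t) (iid * 0x9e3779b97f4a7c15u) & (NBUCKETS - 1);
}

void table_insert (table_t *t, node_t *n)
{
  size_t b = hash_iid (n->iid);
  n->next = t->bucket[b];
  t->bucket[b] = n;
}

/* only a single bucket chain is walked instead of scanning all instances */
void *table_lookup (const table_t *t, uint64_t iid)
{
  for (const node_t *n = t->bucket[hash_iid (iid)]; n; n = n->next)
    if (n->iid == iid)
      return n->inst;
  return NULL;
}
```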
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Do not pass a dangling pointer to update_conditions_locked after dropping an instance. The dangling pointer did not actually get dereferenced because of the state changes caused by dropping the samples, but that is cutting a bit fine. Signed-off-by: Erik Boasson <[email protected]>
Deadline registration, renewal and deregistration was somewhat spread through the code and relied on the "isdisposed" flag as a proxy for whether it was registered or not. This consolidates the deadline handling code in a final step of updating the instance and uses a separate flag to track whether the instance is currently registered in the deadline administration or not. This also makes it possible to trivially change the rules for when deadline notifications are required, and so allows for, e.g., adding a mode in which instances in the "no writers" state do not trigger any deadline missed notifications, or just once (both of which seem useful modes). Signed-off-by: Erik Boasson <[email protected]>
This changes the behaviour of auto-dispose writers: instead of always disposing when the writer disposes the data, it now only disposes the data when the instance would otherwise go to the "no writers" state. This only affects the behaviour when there are multiple writers for the same instance. In case the writers use different values for the auto-dispose setting, it now tracks whether an instance has ever been touched by a writer with auto-dispose enabled, and auto-disposes the instance when the last writer leaves if that is the case. This way, if an instance is registered by one auto-dispose and one non-auto-dispose writer, the order of unregistering does not matter. Signed-off-by: Erik Boasson <[email protected]>
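The rule described above can be sketched as a small state machine. This is a hypothetical model, not the Cyclone DDS implementation — `inst_t`, `on_register`, and `on_unregister` are illustrative names:

```c
#include <stdbool.h>

/* Sketch of the changed auto-dispose rule: an instance remembers whether
   any writer that ever touched it had auto-dispose enabled; when the last
   writer unregisters, the instance is disposed only if that flag is set. */
typedef struct inst {
  unsigned wrcount;        /* currently registered writers */
  bool autodispose_seen;   /* ever touched by an auto-dispose writer */
  bool disposed;
} inst_t;

void on_register (inst_t *i, bool writer_autodispose)
{
  i->wrcount++;
  if (writer_autodispose)
    i->autodispose_seen = true;
}

/* returns true if the instance ends up disposed */
bool on_unregister (inst_t *i)
{
  if (i->wrcount > 0 && --i->wrcount == 0 && i->autodispose_seen)
    i->disposed = true;
  return i->disposed;
}
```

Because only the "ever seen" flag matters, the order in which an auto-dispose and a non-auto-dispose writer unregister makes no difference.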
The entire point of this test program is to exercise the RHC while checking its internal state. The likelihood of (at least someone) forgetting to enable the "expensive" checks has proven to be significant. Signed-off-by: Erik Boasson <[email protected]>
Use of an auto-dispose writer meant the NO_WRITERS case did not actually get tested. The behaviour of the implementation was to generate deadline missed notifications for such instances, but the test expected otherwise. There is a disagreement between different DDS implementations on the desirability of generating deadline missed notifications for NOT_ALIVE instances. Deadline notifications on DISPOSED instances seem silly, as it means end-of-life. Deadline notifications on a NO_WRITERS instance are certainly valuable for applications that don't pay attention to the number of writers (otherwise one has to monitor both liveliness changed and deadline missed notifications to be sure to get some notification). Different usage patterns definitely affect what is desirable and I doubt one-size-fits-all is the right approach. This commit changes the test and retains the behaviour, and if it errs, it at least errs on the side of caution. Signed-off-by: Erik Boasson <[email protected]>
Disposing an instance would only add an invalid sample if the instance is empty, but it should also do so when the latest sample is read. Otherwise reading all NOT_READ samples gives nothing or nonsensical output. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
The standard defines GUIDs as an array of 16 uint8_t and so they are presented on the network and in built-in topic samples. Internally they are arrays of 4 uint32_t, requiring byte-order conversion. A keyhash is also an array of 16 uint8_t, and used almost exclusively on the network. The exception is the generation of built-in topic samples, which relies on the "from_keyhash" function. One would expect the keyhash here to contain the GUID in the external representation, and this commit adds the byte-order conversions to conform to the expectation. Signed-off-by: Erik Boasson <[email protected]>
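The byte-order conversion the commit adds amounts to treating each of the four internal `uint32_t`s as big-endian when producing the 16-byte external form. A minimal sketch (not the actual Cyclone DDS code; the type and function names are made up):

```c
#include <stdint.h>

/* Internally a GUID is four native-endian uint32_t's; on the network and
   in built-in topic samples it is 16 bytes in big-endian order.  The
   conversion is a 32-bit byte-order swap per word on little-endian hosts. */
typedef struct guid { uint32_t u[4]; } guid_t;

void guid_to_keyhash (uint8_t kh[16], const guid_t *g)
{
  for (int i = 0; i < 4; i++) {
    kh[4*i+0] = (uint8_t) (g->u[i] >> 24);
    kh[4*i+1] = (uint8_t) (g->u[i] >> 16);
    kh[4*i+2] = (uint8_t) (g->u[i] >> 8);
    kh[4*i+3] = (uint8_t)  g->u[i];
  }
}

void keyhash_to_guid (guid_t *g, const uint8_t kh[16])
{
  for (int i = 0; i < 4; i++)
    g->u[i] = ((uint32_t) kh[4*i+0] << 24) | ((uint32_t) kh[4*i+1] << 16)
            | ((uint32_t) kh[4*i+2] << 8)  |  (uint32_t) kh[4*i+3];
}
```

Written this way, the pair of functions is endianness-independent: the external representation is always big-endian regardless of the host.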
One cannot create writers for built-in topics, therefore one generally does not create samples for them. However, one can lookup an instance handle from a sample with just a key value in it, and so the function is needed. Signed-off-by: Erik Boasson <[email protected]>
A few failures to signal DATA_AVAILABLE (as well as some where it was signalled unnecessarily) were discovered while refactoring the RHC despite the tests all passing. Clearly the tests were inadequate. The enormous amount of boilerplate in the tests prompted a small rewrite to a programmable listener invocation tester that one simply feeds a noise-like one-liner in a string. This trades the boilerplate for somewhat inscrutable code. Signed-off-by: Erik Boasson <[email protected]>
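To make the "noise-like one-liner" idea concrete, here is a toy interpreter in the same spirit — the real DSL in the PR is richer and different; the op characters here are invented for illustration only:

```c
/* Toy string-driven listener tester: each character is an operation.
   'w' = write (raises DATA_AVAILABLE), 'd' = expect DATA_AVAILABLE was
   raised since the last check, '-' = expect it was not.  Returns true
   if every expectation in the program holds. */
#include <stdbool.h>

bool run_oneliner (const char *prog)
{
  bool da = false;  /* DATA_AVAILABLE pending? */
  for (const char *p = prog; *p; p++) {
    switch (*p) {
      case 'w': da = true; break;
      case 'd': if (!da) return false; da = false; break;
      case '-': if (da) return false; break;
      default: return false;  /* unknown op: reject the program */
    }
  }
  return true;
}
```

The trade-off the commit message mentions is visible even at this scale: the test cases become one-liners, but reading them requires knowing the op encoding.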
Coverity has difficulty observing that dds_entity_pin / ddsrt_mutex_lock / dds_entity_unlock is correct. It is perhaps a bit confusing, so change it. Signed-off-by: Erik Boasson <[email protected]>
The security plugins currently use the standardized representations of octet sequences, unlike the DDSI stack's internal representation. A shallow copy is therefore not simply a memcpy. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Sending a heartbeat to all matched readers for the P2P builtin participant volatile secure writer unlocks the writer before pushing each individual message out, and so determining the time of the next heartbeat event before writing and updating it afterwards means the state may have changed. While this appears benign, it is better to do the update atomically. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
The memory allocation in deserializing property lists within the crypto code should not trust the deserialized length and try to allocate that much memory but should first verify that the length is consistent with the number of bytes remaining in the input. (Noted by Coverity as use of tainted data.) Signed-off-by: Erik Boasson <[email protected]>
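The principle is to check a deserialized length field against the bytes actually remaining before allocating. A minimal sketch of that check — the struct and function names are invented, and for simplicity the length is read in native endianness rather than CDR:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct rdbuf { const uint8_t *p; size_t remain; } rdbuf_t;

/* Returns a malloc'd copy of the next length-prefixed octet sequence, or
   NULL if the length field is inconsistent with the remaining input. */
uint8_t *read_octseq (rdbuf_t *b, uint32_t *len)
{
  uint32_t n;
  if (b->remain < sizeof (n))
    return NULL;
  memcpy (&n, b->p, sizeof (n));
  b->p += sizeof (n); b->remain -= sizeof (n);
  if (n > b->remain)          /* the crucial check: tainted length vs. input left */
    return NULL;
  uint8_t *copy = malloc (n ? n : 1);
  if (copy == NULL)
    return NULL;
  memcpy (copy, b->p, n);
  b->p += n; b->remain -= n;
  *len = n;
  return copy;
}
```

Without the `n > b->remain` check, a hostile length of, say, 4 GB would be passed straight to the allocator — exactly the tainted-data pattern Coverity flags.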
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
This leaves the argument pointer in the destination unchanged, rather than resetting it to an irrelevant value. Signed-off-by: Erik Boasson <[email protected]>
The code for executing one-liner tests might be more generally useful. Signed-off-by: Erik Boasson <[email protected]>
In particular, this means instances published by a transient-local writer will go back to ALIVE following a disconnect and reconnect. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
The dds_wait_for_acks function follows the DCPS specification and allows waiting for all matching readers to have acknowledged all data written prior to that point. This commit leaves the API unchanged but extends the implementation to make it possible to wait until a specific reader has acknowledged everything, as this is a useful device in testing with deliberate one-way disconnections using dds_domain_set_deafmute. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
When defining a new topic, the serializer instructions generated by the IDL compiler (usually residing in constant memory) are typically copied into memory managed by the Cyclone implementation. For this it needs to compute the size of the serializer program, which the IDL compiler doesn't provide. It does this by effectively dry-running the program. (Note that it doesn't validate the program.) All but the JSR operations move the program counter forward, but the JSR operation can cause it to go backward instead and allows implementing recursive types (the IDL compiler doesn't support them, but one might decide to work around that limitation). When dry-running the program, following a backwards jump can cause a non-terminating loop. The jump could potentially be to an unexplored address, and so ignoring all backwards jumps potentially means it skips part of the program. As this is not a validator and the program can always be arranged so that following a backwards jump is not relevant to computing the size correctly, this is a reasonable approximation. Signed-off-by: Erik Boasson <[email protected]>
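The dry run described above can be modelled in miniature. The opcode encoding here is invented for the sketch (the real serializer VM is different); the point is only the rule "follow forward jumps, skip backward ones, track the furthest instruction reached":

```c
#include <stddef.h>

/* Toy serializer-program walker: scans to find the program's extent,
   following forward jumps but stepping over backward ones (which exist
   only for recursive types and never extend the program). */
enum op { OP_FIELD, OP_JSR, OP_RTS };   /* RTS terminates the program */

typedef struct insn { enum op op; int arg; } insn_t;

/* returns the number of instructions reachable by this straight-line scan */
size_t prog_size (const insn_t *prog)
{
  size_t pc = 0, max = 0;
  for (;;) {
    if (pc + 1 > max)
      max = pc + 1;
    switch (prog[pc].op) {
      case OP_RTS:
        return max;
      case OP_JSR:
        if (prog[pc].arg > 0)
          pc += (size_t) prog[pc].arg;  /* forward jump: follow it */
        else
          pc++;                          /* backward jump: ignore, avoid looping */
        break;
      default:
        pc++;
        break;
    }
  }
}
```

Following the backward jump instead would revisit already-counted instructions forever, which is exactly the non-terminating loop the commit guards against.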
* Compute the time at which the handshake must have completed from the initial timeout specification, rather than using it as a timeout for the individual steps of the handshake
* If the handshake fails because an expected message is not present, print this, including whether the timeout occurred because the message queue was empty or because the expected message could not be found in a non-empty queue
* Replace the 0.5s per-step timeout with a single 2s timeout

Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
* Increase the matching timeout to 5s (there are some hints the failures on Travis are timing related)
* Replace the relative timeout in the waitset by a timestamp so that it gives up after the specified timeout regardless of the number of events that occur

Signed-off-by: Erik Boasson <[email protected]>
This is a terribly large PR that does not at all follow the general rule that one should not mix unrelated things in a single PR. It is rather bad not to set a good example, but I have nonetheless decided to do it this way.
I welcome anyone commenting on it. I'd recommend looking at individual commits ... In particular I'd be much obliged if @MarcelJordense could find some time to look at d2ca446, 9579461, cacd39a and 50fa943; and if @dennis-adlink would find time to look at b0a19ef and 816d016, because I think they know that code well enough to do a meaningful review without taking an inordinate amount of time.
What follows is a summary:
Part the First
In which #499 and #505 are moved over to the security branch. They are very nearly identical: a few types have been given a better name, a few unsigneds got changed to uint32_ts, and a complete mess-up in one of the intermediate commits in #499 (such that it wouldn't even compile) was sorted out. For those who like to read diffs of diffs, comparing them is easy enough ... In short, from #499:

* dds_read_instance (and similar)
* dds_readcdr (this affects the RHC plugin interface)

and from #505:

* NOT_ALIVE_NO_WRITERS state
* dds_lookup_instance on a built-in topic works instead of crashing because of a missing implementation of the from_sample operation …

(This part ends at "Rework listener tests" 24ccb39)
Part the Second
In which the security branch went looking for cover. It was there that a few mouldy spots were discovered. A number of them could have developed into real problems, so it is a good thing that this happened. That's:
Part the Third
In which the reworked listener tests were subjected to a bit of further rework: it makes vastly more sense to have an init/step/fini set of functions so that this little monster can be paired with custom code. No doubt there is more work to be done (e.g., turning the individual operations into nice functions so they can be used in more contexts; some of them might come in useful).
This then led to a few more things:
* Instances go back to ALIVE when a sample is rejected (too old, doesn't pass the filter), which means at least transient-local data will always go back to ALIVE after a reconnect.
* The dds_merge_listeners function changed to copying each listener from the source to the destination if and only if the destination is clear (for that listener) and the source is set (for that listener). Before, it'd copy it if the destination was clear, regardless of whether the source was set. The subtle difference lies in the listener arguments, and I think this makes more sense. (Need to add set/get stuff that includes the argument …)
* DDS_STATUS_ID_MAX got added to the interface, which makes life easier for anyone wanting to range over all status IDs.
* dds_wait_for_acks is now capable of waiting for acknowledgements from a specific reader, though this is not currently exposed in the interface. It's worth considering, but there needs to be a compelling reason to do so, I'd say.

Part the Fourth
In which Clang’s static analyzer in its latest incarnation picked up a few dead stores (all eliminated) and GCC 10 complained about a few things more. This does not yet eliminate all warnings from GCC 10.