Restore named_sub map entries ordering when reading from a goto binary.#7534
Conversation
Codecov ReportBase: 78.49% // Head: 78.49% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## develop #7534 +/- ##
===========================================
- Coverage 78.49% 78.49% -0.01%
===========================================
Files 1663 1667 +4
Lines 191336 191437 +101
===========================================
+ Hits 150182 150261 +79
- Misses 41154 41176 +22
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
We need to collect some numbers, perhaps on the GOTO binaries produced by Kani for those tend to be pretty large. I'd suggest to measure the time that |
|
We do not wish to guarantee an ordering of the named_sub_map entries. |
The current implementation of The alternatives to sorting on file read, I can think of would include -
Option 1 is just moving the requirement for the collection to be ordered around. Option 2 potentially moves the performance costs around and doesn't deal with any further hidden requirements on order in code using |
d08274e to
b905c4f
Compare
|
Could you please add an |
|
Which ever solution we come to for this issue, it would be good to have a unit test on reading unordered named_subs. That would mitigate the risk of any future optimisation efforts reintroducing the same issue. |
I can provide a goto binary that does not satisfy the implicit ordering requirement. Would that be ok ? |
This is indeed better |
If so we’d need a way to build such a binary: as GOTO binary versions evolve we’ll still need to be able to run this test. |
A unit test could potentially be written to cover |
b905c4f to
c52c0f6
Compare
|
Update: I implemented @tautschnig suggestion by adding an I also renamed Last I changed to using The only place that still assumes an ordering for the The |
thomasspriggs
left a comment
There was a problem hiding this comment.
I am approving as I believe the main files affected by this PR, for which I am code owner are the SMT backend and that adding in the missing include of <algorithm> is a perfectly reasonable change to make for these.
I would prefer to see a test of some kind included before this PR is merged. I have the feeling that better performance characteristics for loading may be achieved using an emplace_hint style interface. But without the test we can't be certain that potential optimisations do not break your fix.
martin-cs
left a comment
There was a problem hiding this comment.
A belated approve of adding the headers. I'm not quite sure I understand why they are needed but they are not offensive.
|
|
Thanks of the explanation @thomasspriggs . |
The
named_subtis a map fromirep_idtkeys toireptvalues implemented using a forward list. Entries must be stored in increasing key order in the list, otherwise lookups will fail. Theirep_idtordering is defined by how string interning numbers strings in CBMC.When a goto binary is produced by a different tool the
named_subordering in the binary file may not necessarily match the one expected by CBMC. We useirept::named_subt::addinstead ofirept::named_subt::emplace_afterto build thenamed_subwhen deserialising so that so that proper ordering is guaranteed.If the performance impact is considered acceptable, this PR would make the binary format independent of the internal numbering of strings used in CBMC, which in turn will allow us to produce goto binaries directly from Kani instead of using the JSON + symtab2gb path.
I measured the performance impact on
read_bin_goto_objecton a collection of goto binaries in ranging in size from a hundreds of kb to 8Mb. Overall reordering adds a measurable 2-5% overhead onread_bin_goto_object.For kb-sized goto binaries the overhead is ~1 millisecond
For the 8Mb binary the baseline is:
with reordering we see an extra ~8ms spent on loading the binary.