Skip to content

Add handling for Flush to server, fix locking.#109

Merged
robshakir merged 3 commits intomainfrom
flush-2
Feb 2, 2022
Merged

Add handling for Flush to server, fix locking.#109
robshakir merged 3 commits intomainfrom
flush-2

Conversation

@robshakir
Copy link
Copy Markdown
Member

  * (M) rib/rib.go
  * (M) rib/rib_test.go
    - Ensure that the DeleteXXX functions handle locking correctly, such
      that they always hold a lock when accessing the map.
    - Add lockless deleteXXX functions that can be called internally
      when the client has a lock on the entire RIB. This is needed in
      order to ensure that inter-client locks can be held (e.g.,
      when removing all entries with Flush).
  * (M) server/server.go
  * (M) server/server_test.go
    - Add support for Flush removing entries from the server RIB.
    - Add test coverage for Flush.

  * (M) rib/rib.go
  * (M) rib/rib_test.go
    - Ensure that the DeleteXXX functions handle locking correctly, such
      that they always hold a lock when accessing the map.
    - Add lockless deleteXXX functions that can be called internally
      when the client has a lock on the entire RIB. This is needed in
      order to ensure that inter-client locks can be held (e.g.,
      when removing all entries with Flush).
  * (M) server/server.go
  * (M) server/server_test.go
    - Add support for Flush removing entries from the server RIB.
    - Add test coverage for Flush.
Copy link
Copy Markdown
Contributor

@sthesayi sthesayi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would admit that I haven't done a deep review of the use of locks. If you want a deep review please let me know.

@robshakir
Copy link
Copy Markdown
Member Author

Thanks @sthesayi -- I think this wasn't so much a bug in the locking, just an assumption that wasn't quite right.

Essentially, Flush says to remove all entries - and to do that we want some way to be able to make sure that someone else doesn't add something during the Flush being run. The DeleteXXX methods that we have today had fine-grained locks that were used to retrieve the current entry (added a read-lock which was missing) and then remove it (write-lock). This isn't the right behaviour for Flush.

To that end, I refactored Flush to use a delete-but-I-already-hold-the-lock method - this meant that we could take the RIB lock at the beginning of the flush call, and then use that lock throughout (avoiding the deadlock of running both).

I don't think there's anything different about the lock design here that sounds too concerning -- but I'm more than happy to talk it through. This RIB-level lock is the simplest implementation - and seems to be working fairly well thus far, but I'm sure at some point there might be a performance requirement to think through it more.

Let me know if you're concerned about anything above, and we can review it further/redesign.

@sthesayi
Copy link
Copy Markdown
Contributor

sthesayi commented Feb 1, 2022

This explanation is great and I gathered the same during the review. I agree that the RIB level lock is the simplest and safe one to think of and this can be revisited when a performance issue shows up. I would presume a Flush is a relatively rare operation, such as executing during startup, etc edge cases. In all other normal cases, individual deletes are more likely.

LGTM

  * (M) client/client.go
  * (M) client/client_test.go
    - Add support for the Flush RPC to be issued by a client.
  * (M) server/server.go
    - Add support for the FakeServer injecting an election ID to allow
      for testing of SINGLE_PRIMARY cases outside of setting up the
      server.
* (M) client/client.go
    - Modify return value to allow the client to receive the gRPC status
      error.
  * (M) compliance/compliance.go
  * (A) compliance/flush.go
    - Add new test cases for the Flush RPC.
  * (M) fluent/fluent.go
    - Add Flush fluent API.
@robshakir robshakir merged commit a9b2cba into main Feb 2, 2022
@robshakir robshakir deleted the flush-2 branch February 2, 2022 03:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants