Conversation

@bharathv (Contributor) commented Nov 1, 2019

Generally, when an HBaseClient tries to create a cluster
Connection, it fetches a bunch of metadata from ZooKeeper
(active master address, cluster ID, meta locations, etc.)
before it creates the underlying transport. However, exposing
ZK to all the clients is a DDoS risk, and ZK connections have
in the past caused issues by not timing out on blocking
RPCs (more context in the JIRA).

This patch attempts to remove this ZK dependency by making
the client fetch all the meta information directly from a list
of client-configured HMaster endpoints. The patch adds a
new AsyncRegistry implementation that encapsulates the logic
of fetching this meta information from the provided master
endpoints. New RPCs are added to the HMasters to serve
this information.

Meta HRL caching:

One critical piece of metadata clients need in order to query
tables is the meta HRegionLocations. These are fetched from ZK
by default. Since this patch moves away from ZK, it adds an
in-memory cache of these locations on both active and standby
HMasters. ZK listeners are registered to keep the cache up to date.
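The caching idea can be sketched as follows. This is a minimal illustration only, not the patch's actual MetaRegionLocationCache code; the class name, callback name, and replicaId keying are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch: an in-memory map of meta replica locations kept fresh by
// listener callbacks. The names (onLocationChanged, replicaId keying) are
// illustrative, not the patch's real API.
public class MetaLocationCacheSketch {

    // replicaId -> "host:port" of the server hosting that meta replica
    private final Map<Integer, String> cache = new ConcurrentHashMap<>();

    // A ZK listener would invoke this when the znode for a replica changes.
    public void onLocationChanged(int replicaId, String hostAndPort) {
        if (hostAndPort == null) {
            cache.remove(replicaId); // znode deleted: drop the stale entry
        } else {
            cache.put(replicaId, hostAndPort);
        }
    }

    // Snapshot served to clients via the new master RPC.
    public List<String> getMetaLocations() {
        return new ArrayList<>(cache.values());
    }
}
```

Because both active and standby masters register the same listeners, either can answer the client's location query from this cache.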

New client configs:

  • 'hbase.client.asyncregistry.masteraddrs' should be set to a
    comma-separated list of HMaster host:port addresses.
  • Should be used in conjunction with 'hbase.client.registry.impl'
    set to the HMasterAsyncRegistry class.
  • Testing is still a WIP. Error paths are missing (e.g. a master
    being inaccessible).
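For illustration, the two settings above would be wired into a client-side hbase-site.xml roughly like this. The host names and the registry class's package are assumptions for the example; the patch may place HMasterAsyncRegistry elsewhere.

```xml
<configuration>
  <!-- Comma-separated HMaster host:port endpoints for the client registry -->
  <property>
    <name>hbase.client.asyncregistry.masteraddrs</name>
    <value>master1.example.com:16000,master2.example.com:16000</value>
  </property>
  <!-- Switch the client registry implementation away from the ZK default -->
  <property>
    <name>hbase.client.registry.impl</name>
    <value>org.apache.hadoop.hbase.client.HMasterAsyncRegistry</value>
  </property>
</configuration>
```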

@bharathv (Contributor, Author) commented Nov 1, 2019

@apurtell / @wchevreuil FYI. Would appreciate any initial feedback.

@Apache-HBase commented

💔 -1 overall

Vote Subsystem Runtime Comment
💙 reexec 0m 44s Docker mode activated.
_ Prechecks _
💚 dupname 0m 1s No case conflicting files found.
💙 prototool 0m 0s prototool was not available.
💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
💚 @author 0m 0s The patch does not contain any @author tags.
💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ master Compile Tests _
💙 mvndep 0m 39s Maven dependency ordering for branch
💚 mvninstall 6m 40s master passed
💚 compile 2m 10s master passed
💚 checkstyle 2m 18s master passed
💚 shadedjars 4m 59s branch has no errors when building our shaded downstream artifacts.
💚 javadoc 1m 17s master passed
💙 spotbugs 4m 57s Used deprecated FindBugs config; considering switching to SpotBugs.
💚 findbugs 8m 46s master passed
_ Patch Compile Tests _
💙 mvndep 0m 16s Maven dependency ordering for patch
💚 mvninstall 5m 20s the patch passed
💚 compile 2m 10s the patch passed
💚 cc 2m 10s the patch passed
💚 javac 2m 10s the patch passed
💔 checkstyle 0m 41s hbase-client: The patch generated 145 new + 206 unchanged - 0 fixed = 351 total (was 206)
💔 checkstyle 1m 35s hbase-server: The patch generated 320 new + 322 unchanged - 0 fixed = 642 total (was 322)
💚 whitespace 0m 0s The patch has no whitespace issues.
💚 shadedjars 4m 42s patch has no errors when building our shaded downstream artifacts.
💚 hadoopcheck 15m 56s Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.
💚 hbaseprotoc 2m 3s the patch passed
💚 javadoc 1m 15s the patch passed
💔 findbugs 4m 24s hbase-server generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
_ Other Tests _
💚 unit 0m 44s hbase-protocol-shaded in the patch passed.
💚 unit 1m 53s hbase-client in the patch passed.
💔 unit 162m 37s hbase-server in the patch failed.
💚 asflicense 1m 25s The patch does not generate ASF License warnings.
244m 19s
Reason Tests
FindBugs module:hbase-server
Dead store to isActiveMaster in org.apache.hadoop.hbase.master.MasterRpcServices.isActive(RpcController, MasterProtos$IsActiveRequest) At MasterRpcServices.java:org.apache.hadoop.hbase.master.MasterRpcServices.isActive(RpcController, MasterProtos$IsActiveRequest) At MasterRpcServices.java:[line 1006]
Return value of org.apache.hadoop.hbase.zookeeper.ZKWatcher.getZNodePaths() ignored, but method has no side effect At MetaRegionLocationCache.java:but method has no side effect At MetaRegionLocationCache.java:[line 83]
Failed junit tests hadoop.hbase.client.TestHMasterAsyncRegistry
Subsystem Report/Notes
Docker Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/artifact/out/Dockerfile
GITHUB PR #781
JIRA Issue HBASE-18095
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc prototool
uname Linux 43004a648844 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-781/out/precommit/personality/provided.sh
git revision master / ea5c572
Default Java 1.8.0_181
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/artifact/out/diff-checkstyle-hbase-client.txt
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/artifact/out/diff-checkstyle-hbase-server.txt
findbugs https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/artifact/out/new-findbugs-hbase-server.html
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/artifact/out/patch-unit-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/testReport/
Max. process+thread count 4919 (vs. ulimit of 10000)
modules C: hbase-protocol-shaded hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/1/console
versions git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11
Powered by Apache Yetus 0.11.0 https://yetus.apache.org

This message was automatically generated.

@wchevreuil (Contributor) left a comment:

Thanks for working on this @bharathv! Overall it looks good, just minor comments. Seems the newly added test failed the QA bot run; can you review that together with the checkstyle and findbugs messages?

}

public static HBaseProtos.RegionLocation toRegionLocation(HRegionLocation loc) {
if (loc == null) return null;
Contributor:
just curious, did you face NPE conditions while testing this?

Contributor (Author):
Not really, but I think it could be null with my patch in error cases where HMaster is not accessible (referring to MetaRegionLocationCache#updateMetaLocation()).

Member:
No, this is a code smell. Better the caller should filter null locations, or that they don't happen at all (see my comments in MetaRegionLocationCache re: storing null values in the cache).

This method is also called by ReopenTableRegionsProcedure, so I am curious if there's a test that can exercise the null value.

// Tracks any other connections created with custom client config. Used for testing clients with custom
// configurations. Tracked here so that they can be cleaned up on close() / restart.
private List<AsyncClusterConnection> customConnections = Collections.synchronizedList(new ArrayList<>());

Contributor:
I didn't quite follow why you needed to define these extra connections. Are you using both types of registry implementation on a single util instance?

Contributor (Author):
I did this because I want to use different server and client side configs. The default Connection object built (in initConnection()) is using the server side conf. We cannot use the same conf for this patch, because we need the service side meta/master tracking to happen using ZK and just the client side to use HMasterAsyncRegistry. Is there a cleaner way to do this?

Contributor:
because we need the service side meta/master tracking to happen using ZK and just the client side to use HMasterAsyncRegistry

Oh, didn't know that was the intention. I thought it was just fine to let even server-side processes acting as clients use the master registry. Is the concern here related to RPC overload?

Contributor:
I have the same question. Only the masters need to do things differently. All other clients, including embedded server side clients, should use the new service? So we get a nice common interposition point between all clients and the source of truth. Except the masters themselves, of course, which need to go directly to the source of truth.

Contributor (Author):
@apurtell Agree, that makes the whole design simpler. That is my end goal.

Comment on lines 65 to 66
conf.set(HMasterAsyncRegistry.CONF_KEY, HostAndPort.fromParts(masterHostName, masterPort).toString());
conf.set(AsyncRegistryFactory.REGISTRY_IMPL_CONF_KEY, HMasterAsyncRegistry.class.getName());
Contributor:
Could be set on TEST_UTIL config before starting the cluster, so that its async connection impl already uses HMasterAsyncRegistry?

Contributor (Author):
Same comment as above, the reason I didn't do that is because that enables this registry on Region servers too. This is purely a client side config. Let me know if you know of a cleaner way to do this.

Contributor:
This is purely a client side config

Yeah, that confused me. Am just wondering now what the impacts would be of having it also set on the server side. Because that may become a common config mistake; if it's critical, maybe it's worth starting to think about splitting the connection creation path between client and server side?

zk.getZNodePaths().getMetaReplicaIdFromZnode(input);
fail("Exception not hit getMetaReplicaIdFromZnode(): " + input);
} catch (NumberFormatException e) {
// Expected
Contributor:
nit: maybe we can use assertEquals with the NFE's error message, if available?

Contributor (Author):
I think that will be tricky and probably needs more plumbing; not worth it, I guess.

Contributor:
I think it's fine as it is, or if you prefer you can use JUnit's expected-exception functionality:

@Test(expected = NumberFormatException.class)
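Both styles boil down to the same check. A self-contained sketch of the try/fail/catch pattern under discussion, using `Integer.parseInt` as a stand-in for the real `getMetaReplicaIdFromZnode()`:

```java
// Self-contained sketch of the pattern under discussion. parseReplicaId is a
// stand-in for getMetaReplicaIdFromZnode(); the real method lives in ZNodePaths.
public class ReplicaIdParseCheck {

    // Parses the numeric replica suffix; throws NumberFormatException on junk.
    static int parseReplicaId(String suffix) {
        return Integer.parseInt(suffix);
    }

    // try/fail/catch style: report failure if the expected exception never fires.
    static boolean hitsNumberFormatException(String input) {
        try {
            parseReplicaId(input);
            return false; // the real test calls fail(...) here
        } catch (NumberFormatException e) {
            return true;  // expected
        }
    }
}
```

The annotation style is shorter, but it drops the ability to assert on the exception message, which is one reason the explicit catch block is sometimes preferred.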

final class AsyncRegistryFactory {

static final String REGISTRY_IMPL_CONF_KEY = "hbase.client.registry.impl";
public static final String REGISTRY_IMPL_CONF_KEY = "hbase.client.registry.impl";
Contributor:
IMHO, unless some source code requires this to be public, we can keep this unchanged. For the test case, we can directly use "hbase.client.registry.impl". Should be fine?

Contributor (Author):
Replied in another comment.

Contributor:
I think we should keep it as a public constant, as per @bharathv explanation on the other comment.

@apurtell (Contributor), Nov 4, 2019:
So the idea here is the clients will instantiate one type of REGISTRY_IMPL, which contacts masters via RPC, and the masters will instantiate another type of REGISTRY_IMPL that talks to zookeeper directly? How is the distinction managed? Maybe I'll answer my own question upon further review...

Contributor:
Same concern here. My understanding now is that both server and client will rely on the same config property, which seems error prone to me. That's why I mentioned in my previous comment that maybe we should think about splitting the connection creation path between server- and client-based callers.

Contributor (Author):
Agree with the comments. Like @apurtell mentioned in another comment, I will work towards making this the default registry implementation for both clients and internal service connections (except the Master, which obviously goes to ZK, the source of truth). There is nothing in the implementation that prevents this. It was only done because the HBaseMiniCluster used in tests picks random ports (for running concurrent tests), and the clients don't know beforehand what the correct master port to put in the config would be.

So if you see the pattern in the tests: we wait for the mini cluster to be up, get the running master and its port, and then create a new Connection object based on that config. Once I figure out a way to force the mini-cluster to use certain known ports (without affecting test concurrency, of course), we can get rid of the whole custom-config / split-config business. I'm looking into it. Hope this clarifies the intention.

@InterfaceAudience.Private
public class HMasterAsyncRegistry implements AsyncRegistry {
private static final Logger LOG = LoggerFactory.getLogger(ZKAsyncRegistry.class);
public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
Contributor:
Same here, good to keep it private unless source code needs it?

Contributor (Author):
You probably already noticed they are referred to in tests. Since these are "static final", they are immutable and read-only. If we make it private and copy the string into tests, it becomes very difficult to refactor if one wants to change the config key (you'd have to grep + replace all the occurrences). I could be wrong, but I see this pattern elsewhere in this project: (git grep "public static final" | egrep -v "Test|generated")

Contributor:
Yeah, let's keep this constant public.

private static final Logger LOG = LoggerFactory.getLogger(MetaRegionLocationCache.class);

// Maximum number of times we retry when ZK operation times out. Should this be configurable?
private static final int MAX_ZK_META_FETCH_RETRIES = 10;
Contributor:
I think it's better to make it configurable, and maybe keep 10 (not sure about the right value) as the default in the absence of a site config.

Contributor (Author):
Don't think it really needs to be configured. If it really comes to that point, something is pretty badly screwed up. Let me think more about this and get back.

Contributor:
Oh, so control of the number of retries is not supposed to be given to the user?

@bharathv (Contributor, Author) left a comment:
Thanks for taking a look. Addressed/answered some comments/questions. Added more test coverage. Please let me know what you think.

@bharathv (Contributor, Author) commented Nov 4, 2019

I configured the checkstyle plugin and fixed most of the checkstyle issues and messy indents. Still figuring out the import-order mess; for some reason it is never happy, whichever way I fix the imports.

@Apache-HBase commented

💔 -1 overall

Vote Subsystem Runtime Comment
💙 reexec 0m 35s Docker mode activated.
_ Prechecks _
💚 dupname 0m 0s No case conflicting files found.
💙 prototool 0m 1s prototool was not available.
💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
💚 @author 0m 0s The patch does not contain any @author tags.
💚 test4tests 0m 0s The patch appears to include 8 new or modified test files.
_ master Compile Tests _
💙 mvndep 0m 35s Maven dependency ordering for branch
💚 mvninstall 5m 31s master passed
💚 compile 2m 3s master passed
💚 checkstyle 2m 22s master passed
💚 shadedjars 4m 38s branch has no errors when building our shaded downstream artifacts.
💚 javadoc 1m 16s master passed
💙 spotbugs 4m 2s Used deprecated FindBugs config; considering switching to SpotBugs.
💚 findbugs 7m 39s master passed
_ Patch Compile Tests _
💙 mvndep 0m 14s Maven dependency ordering for patch
💚 mvninstall 4m 54s the patch passed
💚 compile 2m 2s the patch passed
💚 cc 2m 2s the patch passed
💚 javac 2m 2s the patch passed
💔 checkstyle 0m 38s hbase-client: The patch generated 10 new + 206 unchanged - 0 fixed = 216 total (was 206)
💔 checkstyle 1m 26s hbase-server: The patch generated 27 new + 322 unchanged - 0 fixed = 349 total (was 322)
💚 whitespace 0m 0s The patch has no whitespace issues.
💚 shadedjars 4m 49s patch has no errors when building our shaded downstream artifacts.
💚 hadoopcheck 15m 41s Patch does not cause any errors with Hadoop 2.8.5 2.9.2 or 3.1.2.
💚 hbaseprotoc 2m 1s the patch passed
💚 javadoc 1m 12s the patch passed
💚 findbugs 8m 15s the patch passed
_ Other Tests _
💚 unit 0m 42s hbase-protocol-shaded in the patch passed.
💚 unit 1m 54s hbase-client in the patch passed.
💚 unit 161m 39s hbase-server in the patch passed.
💚 asflicense 1m 32s The patch does not generate ASF License warnings.
238m 53s
Subsystem Report/Notes
Docker Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/2/artifact/out/Dockerfile
GITHUB PR #781
JIRA Issue HBASE-18095
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc prototool
uname Linux 7587f3deda10 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-781/out/precommit/personality/provided.sh
git revision master / 90007b7
Default Java 1.8.0_181
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/2/artifact/out/diff-checkstyle-hbase-client.txt
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/2/artifact/out/diff-checkstyle-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/2/testReport/
Max. process+thread count 5005 (vs. ulimit of 10000)
modules C: hbase-protocol-shaded hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-781/2/console
versions git=2.11.0 maven=2018-06-17T18:33:14Z) findbugs=3.1.11
Powered by Apache Yetus 0.11.0 https://yetus.apache.org

This message was automatically generated.

parseHortPorts();
// Passing the default cluster ID means that the token based authentication does not work for
// certain client implementations.
// TODO(bharathv): Figure out a way to fetch the CLUSTER ID using a non authenticated way.
Contributor:
One thought from the JIRA discussion was to add a servlet which does not require authentication. So, a separate instance of the infoserver stack, simply serving the cluster ID. A bit heavyweight, but an alternative like a separate instance of the RPC stack with auth requirements disabled serving only one iface/method isn't any less heavy.

}

@VisibleForTesting
AsyncRegistry getRegistry() { return registry; }
@apurtell (Contributor), Nov 4, 2019:
Hmm. Fine for master and branch-2, but note a headache here for branch-1 (if a backport is desired). The precursor implementation in branch-1 is ClusterRegistry. AsyncRegistry was a big refactor via HBASE-16835. Not suggesting this needs be different, and some simple substitutions may get you most of the way there. Just want to point this out.

@InterfaceAudience.Private
public class HMasterAsyncRegistry implements AsyncRegistry {
private static final Logger LOG = LoggerFactory.getLogger(HMasterAsyncRegistry.class);
public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
Contributor:
Nit: This doesn't need "asyncregistry" in the key name. We can anticipate multiple consumers of "hbase.client.master.addrs" (suggestion): HMasterAsyncRegistry, HMasterClusterRegistry (in branch-1), something else...

Contributor:
Or even "hbase.master.addrs"

Contributor:
Can you make this CONF_KEY more specific/meaningful?

Member:
While we're nit-picking: I prefer to see all public statics grouped separately from private statics, with the public interface coming earlier in the file.

masterServers.add(ServerName.valueOf(hostPort, ServerName.NON_STARTCODE));
}
Preconditions.checkArgument(!masterServers.isEmpty(), String.format("%s is empty", CONF_KEY));
// Randomize so that not every client sends requests in the same order.
Contributor:
Good
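The parse-and-randomize step can be sketched independently of the HBase types (plain strings stand in for ServerName; the method name and error handling are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MasterEndpointsSketch {

    // Parse a comma-separated "host:port,host:port" config value and shuffle
    // it so that not every client sends requests to the same master first.
    public static List<String> parseAndShuffle(String confValue) {
        if (confValue == null || confValue.trim().isEmpty()) {
            throw new IllegalArgumentException("master address list is empty");
        }
        List<String> endpoints = new ArrayList<>();
        for (String hostPort : confValue.split(",")) {
            endpoints.add(hostPort.trim());
        }
        Collections.shuffle(endpoints); // randomize per-client request order
        return endpoints;
    }
}
```

Shuffling once at registry construction spreads the first-contact load across masters without any coordination between clients.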

* master found.
*/
private String getClusterIdHelper() throws MasterNotRunningException {
// Loop through all the masters serially. We could be hitting some standby masters which cannot
Contributor:
It's worth asking: Why do we need an active master to tell us what the cluster ID is? All the masters can look at the file in HDFS regardless of role. Standbys aren't special here, they can only serve one cluster, just like the active. If we fix this so any master can respond to this query then it's a small improvement in overall availability.

Contributor (Author):
That's a good point. I'm looking into it.

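The serial fallback being discussed, trying each configured master until one answers, can be sketched with the RPC abstracted away. `Fetcher` here is a made-up interface standing in for the GetClusterId RPC, not the patch's actual code:

```java
import java.util.List;

public class ClusterIdFetchSketch {

    // Stand-in for the per-master GetClusterId RPC.
    public interface Fetcher {
        String fetchClusterId(String masterHostPort) throws Exception;
    }

    // Loop through the masters serially; return the first successful answer.
    // Per the review discussion, standbys could answer too if the cluster ID
    // is read from HDFS rather than requiring the active role.
    public static String getClusterId(List<String> masters, Fetcher rpc) {
        for (String master : masters) {
            try {
                return rpc.fetchClusterId(master);
            } catch (Exception e) {
                // Master unreachable or not ready; fall through to the next one.
            }
        }
        throw new IllegalStateException("no master returned a cluster ID");
    }
}
```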
throws ServiceException {
GetClusterIdResponse.Builder response = GetClusterIdResponse.newBuilder();
try {
master.checkInitialized();
Contributor:
See my above comment. Initialize just this bit of info from HDFS if needed. Does not require active role.

Contributor (Author):
Ack.

* helper should only be used if one wants to test a custom client-side configuration that differs from the conf used to
* spawn the mini-cluster.
*/
public AsyncClusterConnection getCustomConnection(Configuration conf) throws IOException {
Contributor

This is confusing.
"Custom"?

Contributor Author

Will try to get rid of this.

hostAndPorts.add(HostAndPort.fromParts(masterHostName, masterPort));
final String config = Joiner.on(",").join(hostAndPorts);
conf.set(HMasterAsyncRegistry.CONF_KEY, config);
conf.set(AsyncRegistryFactory.REGISTRY_IMPL_CONF_KEY, HMasterAsyncRegistry.class.getName());
Contributor

Ok, I see where this is set for testing, but not where it's established as a default for clients.

Contributor

Should somehow distinguish between master and not-master roles and choose the right one, allowing for custom config override.

Contributor Author

@bharathv bharathv left a comment

Just flushing out some design comments.

final class AsyncRegistryFactory {

static final String REGISTRY_IMPL_CONF_KEY = "hbase.client.registry.impl";
public static final String REGISTRY_IMPL_CONF_KEY = "hbase.client.registry.impl";
Contributor Author

Agree with the comments. Like @apurtell mentioned in another comment, I will work towards making this the default registry implementation for both clients and internal service connections (except the Master, which obviously goes to ZK, the source of truth). There is nothing in the implementation that prevents this. It was only done because the HBaseMiniCluster used in tests picks random ports (for running concurrent tests) and the clients don't know beforehand which master port to use in the config.

So if you see the pattern in tests, we wait for the mini cluster to be up, get the running master and its port, and then create a new Connection object based on that config. Once I figure out a way to force the mini-cluster to use certain known ports (without affecting the test concurrency, of course), we can get rid of the whole custom-config / split-config business. I'm looking into it. Hope this clarifies the intention.
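The test pattern described above — collect each running master's host and port, then join them into the comma-separated config value — can be sketched in plain Java. The patch itself uses Guava's `Joiner`; `String.join` below is an illustrative stdlib substitution, and `CONF_KEY` mirrors the `hbase.client.asyncregistry.masteraddrs` key named in the PR description.

```java
// Sketch of how a test helper might assemble the comma-separated master
// address list described above. CONF_KEY mirrors the patch's config key;
// String.join stands in for Guava's Joiner used in the actual diff.
import java.util.ArrayList;
import java.util.List;

public class MasterAddrConfig {
    // Mirrors 'hbase.client.asyncregistry.masteraddrs' from the patch.
    static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";

    // Joins parallel host/port lists into the value the registry expects.
    static String buildMasterAddrs(List<String> hosts, List<Integer> ports) {
        List<String> hostPorts = new ArrayList<>();
        for (int i = 0; i < hosts.size(); i++) {
            hostPorts.add(hosts.get(i) + ":" + ports.get(i));
        }
        return String.join(",", hostPorts);
    }

    public static void main(String[] args) {
        String addrs = buildMasterAddrs(
            List.of("m1.example.com", "m2.example.com"), List.of(16000, 16000));
        System.out.println(CONF_KEY + " = " + addrs);
    }
}
```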



// Tracks any other connections created with custom client config. Used for testing clients with custom
// configurations. Tracked here so that they can be cleaned up on close() / restart.
private List<AsyncClusterConnection> customConnections = Collections.synchronizedList(new ArrayList<>());

Contributor Author

@apurtell Agree, that makes the whole design simpler. That is my end goal.


public class HMasterAsyncRegistry implements AsyncRegistry {
private static final Logger LOG = LoggerFactory.getLogger(HMasterAsyncRegistry.class);
public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
private static final String DEFAULT_HOST_PORT = "localhost:" + HConstants.DEFAULT_MASTER_PORT;
Contributor

Is this "localhost" being default value for DEFAULT_HOST_PORT useful?

import org.slf4j.LoggerFactory;

/**
* Fetches the meta information directly from HMaster by making relevant RPCs. HMaster RPC end
Contributor

nit: endpoint should be one word.

@InterfaceAudience.Private
public class HMasterAsyncRegistry implements AsyncRegistry {
private static final Logger LOG = LoggerFactory.getLogger(HMasterAsyncRegistry.class);
public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
Contributor

Can you make this CONF_KEY more specific/meaningful?

public HMasterAsyncRegistry(Configuration config) {
masterServers = new ArrayList<>();
conf = config;
parseHortPorts();
Contributor

typo here?
HortPort?

Contributor Author

Done.

// certain client implementations.
// TODO(bharathv): Figure out a way to fetch the CLUSTER ID using a non authenticated way.
rpcClient = RpcClientFactory.createClient(conf, HConstants.CLUSTER_ID_DEFAULT);
rpcTimeout = (int) Math.min(Integer.MAX_VALUE,
Contributor

So you do toNanos here, is int good enough?

Member

Looks like int is used throughout our RPC call stack. It's not well documented, but I think this timeout is supposed to be milliseconds, not nanoseconds. I followed method calls and member variables back to HBaseRpcControllerImpl#callTimeout.
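The clamping idiom the thread discusses — an `int` of milliseconds on the RPC stack, fed from a `long` (e.g. a `TimeUnit` conversion) — can be shown in isolation. A minimal sketch, not the patch's actual code:

```java
// Sketch of the clamping pattern discussed above: the RPC layer stores its
// timeout as an int of milliseconds, so a long value (e.g. from a TimeUnit
// conversion) is clamped to Integer.MAX_VALUE before the narrowing cast.
import java.util.concurrent.TimeUnit;

public class TimeoutClamp {
    // Clamps a millisecond value into the int range the RPC stack expects.
    static int clampToIntMillis(long millis) {
        return (int) Math.min(Integer.MAX_VALUE, millis);
    }

    public static void main(String[] args) {
        long thirtySeconds = TimeUnit.SECONDS.toMillis(30);
        System.out.println(clampToIntMillis(thirtySeconds));  // fits: 30000
        System.out.println(clampToIntMillis(Long.MAX_VALUE)); // clamped to max int
    }
}
```

Using milliseconds (not nanoseconds) keeps the int range comfortable: `Integer.MAX_VALUE` milliseconds is roughly 24 days, whereas the same value in nanoseconds is about two seconds.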

}
if (result == null || result.isEmpty()) {
throw new MetaRegionsNotAvailableException(String.format(
"Meta locations not found. Probed masters: %s", conf.get(CONF_KEY, DEFAULT_HOST_PORT)));
Contributor

nit:"Meta location not found"

Contributor Author

It was meant to be for multiple replicas of meta region..

}

/**
* Picks the first master entry from 'masterHortPorts' to fetch the meta region locations.
Contributor

typo:
masterHortPorts

Contributor Author

Oops, nice catch.

Member

Might as well make this an {@link #methodName}.

But actually the method definition in the interface has a javadoc, so having one here doesn't add much.

GetClusterIdResponse resp =
stub.getClusterId(rpcController, GetClusterIdRequest.getDefaultInstance());
return resp.getClusterId();
} catch (IOException e) {
Contributor

so, what's the point separating IOException and ServiceException here?
can you combine them?
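The combination the reviewer suggests is Java's multi-catch. A self-contained sketch follows; note that `ServiceException` in the patch comes from the protobuf runtime, so a hypothetical stand-in class is defined here to keep the example compilable without HBase dependencies.

```java
// Illustrates the multi-catch the reviewer suggests above. ServiceException in
// the patch comes from the protobuf runtime; a hypothetical stand-in class is
// used here so the sketch stays self-contained.
import java.io.IOException;

public class MultiCatchSketch {
    // Hypothetical stand-in for the protobuf ServiceException.
    static class ServiceException extends Exception {
        ServiceException(String msg) { super(msg); }
    }

    static String fetchClusterId(boolean failWithIo) {
        try {
            if (failWithIo) {
                throw new IOException("connection reset");
            }
            throw new ServiceException("master not initialized");
        } catch (IOException | ServiceException e) {
            // One handler covers both failure modes, as suggested in review.
            return "fetch failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchClusterId(true));
        System.out.println(fetchClusterId(false));
    }
}
```

Multi-catch only works when neither exception type is a subclass of the other; otherwise catching the supertype alone suffices.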

populateInitialMetaLocations();
}

private void populateInitialMetaLocations() {
Contributor

should we do exponential backoff retry here?

Member

+1. Should follow our existing RPC retry patterns, configurable for operators to tweak, &c. See RetryCounterFactory.

Contributor Author

Ah this is cool. Done.
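The capped exponential backoff the reviewers recommend can be sketched in plain Java. HBase's `RetryCounterFactory` encapsulates this pattern with operator-configurable knobs; the loop below only illustrates the idea and is not that class's API.

```java
// Plain-Java sketch of capped exponential backoff, as suggested above.
// HBase's RetryCounterFactory provides a configurable version of this;
// the loop here just illustrates the mechanics.
import java.util.concurrent.Callable;

public class BackoffSketch {
    static <T> T retryWithBackoff(Callable<T> op, int maxAttempts,
            long baseSleepMs, long maxSleepMs) throws Exception {
        long sleepMs = baseSleepMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMs);
                    // Double the sleep each attempt, up to a configured cap.
                    sleepMs = Math.min(sleepMs * 2, maxSleepMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        String result = retryWithBackoff(() -> {
            if (++calls[0] < 3) {
                throw new RuntimeException("transient failure");
            }
            return "meta locations fetched";
        }, 5, 1, 8);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```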

}
RegionState state = null;
int retries = 0;
while (retries++ < MAX_ZK_META_FETCH_RETRIES) {
Contributor

same as above : should we do exponential backoff retry here?

Member

nod

Contributor Author

Done.

Member

@ndimiduk ndimiduk left a comment

This is great!

DELETED
};

public MetaRegionLocationCache(ZKWatcher zkWatcher) {
Member

Can be package-private.

Contributor Author

Done.



updateMetaLocation(path, ZNodeOpType.INIT);
}
break;
} catch (KeeperException.OperationTimeoutException e) {
Member

Is OperationTimeoutException the only subclass that should be retried? What about ConnectionLossException, SessionExpiredException, or SessionMovedException? Should this class attempt to gracefully ride over these as well, or is that the responsibility of a higher power?

Contributor Author

Switched the code to retry for all KeeperExceptions.

}
}
if (state == null) {
cachedMetaLocations.put(replicaId, null);
Member

What's the intention behind storing null? Probably you want to delete the entry instead. If there's a reason to know that the received state was null, better to have an explicit enum variant, UNKNOWN for example. I don't really see how that's useful to a client, but it's more informative than null.

Contributor Author

That's a fair point. I've updated the code to delete the entry.
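Deleting the entry rather than storing null also matters mechanically: `ConcurrentHashMap`, the usual backing for a cache like this, rejects null values outright with a `NullPointerException`, so `remove()` is the only workable way to clear a location. A minimal sketch:

```java
// Shows why the suggestion above to delete rather than store null matters:
// ConcurrentHashMap forbids null values, so put(key, null) throws NPE and
// remove(key) is the correct way to clear a cached location.
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class NullValueSketch {
    static boolean putNullThrows(ConcurrentMap<Integer, String> map) {
        try {
            map.put(1, null); // ConcurrentHashMap forbids null values
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<Integer, String> cache = new ConcurrentHashMap<>();
        cache.put(0, "server-a:16020"); // replicaId -> illustrative location
        System.out.println(putNullThrows(cache));
        cache.remove(0); // deleting the entry is the safe way to clear it
        System.out.println(cache.containsKey(0));
    }
}
```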

stub.getClusterId(rpcController, GetClusterIdRequest.getDefaultInstance());
return resp.getClusterId();
} catch (IOException e) {
LOG.warn("Error fetching cluster ID from master: {}", sname, e);
Member

Same comment re: log level and using e.getMessage().

"Meta locations not found. Probed masters: %s", conf.get(CONF_KEY, DEFAULT_HOST_PORT)));
}
List<HRegionLocation> deserializedResult = new ArrayList<>();
result.stream().forEach(
Member

result.forEach().
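The nit above is that `Iterable` already defines `forEach`, so routing through a stream first adds an object allocation for no benefit. Both forms below behave identically for a side-effecting loop:

```java
// Demonstrates the review nit above: list.forEach(...) is equivalent to
// list.stream().forEach(...) for side-effecting iteration, minus the
// intermediate Stream object.
import java.util.ArrayList;
import java.util.List;

public class ForEachSketch {
    static List<String> collectUpper(List<String> input) {
        List<String> out = new ArrayList<>();
        input.forEach(s -> out.add(s.toUpperCase()));             // preferred
        // input.stream().forEach(s -> out.add(s.toUpperCase())); // redundant stream
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectUpper(List.of("meta", "region")));
    }
}
```

(If ordering under parallel streams mattered, `forEachOrdered` would be the streams-side answer; for a plain list traversal it doesn't.)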

returns(GetClusterStatusResponse);

/** Returns whether this master is active or not. Served on both active/standby masters.*/
rpc IsActive(IsActiveRequest) returns(IsActiveResponse);
Member

How is the active master identified? Is it possible that the standby masters maintain awareness of the active? If so, it should be possible to remove this boolean query RPC and instead have any master reply with the current master if it is unable to service the request.

Contributor Author

#812 - You are already a reviewer on that :-)


import static org.apache.hadoop.hbase.client.RegionReplicaTestHelper.testLocator;

import org.apache.commons.io.IOUtils;
Member

unused import.

* Tests basic create, put, scan operations using the connection.
*/
@Test
public void testCustomConnectionBasicOps() throws Exception {
Member

Can we not parameterize the various TestFromClientSide* classes to use both the new and the old methods?

Contributor Author

#807 is a blocker to parameterize this. Will do it once that is committed.

Contributor Author

@bharathv bharathv left a comment

For reviewers' convenience, I created a separate pull request #830 that implements just the master's meta location tracking. I addressed comments from @ndimiduk and @xcangCRM in that pull request. Publishing them here so that it is easy to track.



DEFAULT_HBASE_RPC_TIMEOUT)));
}

private void parseHortPorts() {
Contributor Author

Done.


private AssignmentManager assignmentManager;

// Cache of meta locations indexed by replicas
private MetaRegionLocationCache metaRegionLocationCache;
Contributor Author

Done.


if (!isValidMetaZNode(path)) {
return;
}
LOG.info("Meta znode for path {}: {}", path, opType.name());
Contributor Author

Switched to debug. Didn't fully get what you meant in the last statement, but the intention of this logging is to collate timestamps in case there are issues like clients seeing stale meta locations.

returns(GetClusterStatusResponse);

/** Returns whether this master is active or not. Served on both active/standby masters.*/
rpc IsActive(IsActiveRequest) returns(IsActiveResponse);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#812 - You are already a reviewer on that :-)

.get();
util.getAdmin().move(regionInfo.getEncodedNameAsBytes(), newServerName);
if (regionInfo.isMetaRegion()) {
// Invalidate the meta cache forcefully to avoid test races. Otherwise there might be a
Contributor Author

Undone.


@ndimiduk
Member

@bharathv this PR can be closed in favor of the individual subtasks, yeah?

@bharathv
Contributor Author

@ndimiduk Yes.

Thanks to all the reviewers for the initial feedback. Those who are interested, please follow the PRs for subtasks of HBASE-18095.

@bharathv bharathv closed this Nov 21, 2019
@bharathv bharathv deleted the HBASE-18095 branch November 22, 2019 01:04
