HBASE-18095: Zookeeper-less client connection implementation #781
@apurtell / @wchevreuil FYI. Appreciate if you can give any initial feedback.
💔 -1 overall
This message was automatically generated.
wchevreuil
left a comment
Thanks for working on this @bharathv! Overall, looks good, just minor comments. It seems the newly added test failed the qabot run; can you review that together with the checkstyle and findbugs messages?
hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRegistry.java
hbase-client/src/main/java/org/apache/hadoop/hbase/client/HMasterAsyncRegistry.java
  }

  public static HBaseProtos.RegionLocation toRegionLocation(HRegionLocation loc) {
    if (loc == null) return null;
just curious, did you face NPE conditions while testing this?
Not really, but I think it could be null with my patch in error cases where HMaster is not accessible (referring to MetaRegionLocationCache#updateMetaLocation()).
No, this is a code smell. It would be better for the caller to filter null locations, or for them not to happen at all (see my comments in MetaRegionLocationCache re: storing null values in the cache).
This method is also called by ReopenTableRegionsProcedure, so I am curious if there's a test that can exercise the null value.
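The suggestion above, that the caller filters out null locations so the converter never has to return null, might look roughly like the following sketch. `Location`, `toWire` and `convertAll` are hypothetical stand-ins for illustration only; they are not HBase classes or methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

final class NullFilteringSketch {
  // Hypothetical stand-in for HRegionLocation; not the HBase class.
  static final class Location {
    final String host;
    final int port;

    Location(String host, int port) {
      this.host = host;
      this.port = port;
    }
  }

  // The converter rejects null instead of silently passing it through.
  static String toWire(Location loc) {
    Objects.requireNonNull(loc, "loc must not be null");
    return loc.host + ":" + loc.port;
  }

  // The caller drops missing entries before converting.
  static List<String> convertAll(List<Location> locations) {
    return locations.stream()
        .filter(Objects::nonNull)
        .map(NullFilteringSketch::toWire)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Location> cached =
        Arrays.asList(new Location("rs1", 16020), null, new Location("rs2", 16020));
    System.out.println(convertAll(cached));
  }
}
```

With this shape, a null reaching the converter fails loudly at the call site rather than propagating into the RPC response.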
  // Tracks any other connections created with custom client config. Used for testing clients with custom
  // configurations. Tracked here so that they can be cleaned up on close() / restart.
  private List<AsyncClusterConnection> customConnections = Collections.synchronizedList(new ArrayList<>());
I didn't quite follow why you needed to define these extra connections. Are you using both types of registry implementation on a single util instance?
I did this because I want to use different server and client side configs. The default Connection object built (in initConnection()) is using the server side conf. We cannot use the same conf for this patch, because we need the service side meta/master tracking to happen using ZK and just the client side to use HMasterAsyncRegistry. Is there a cleaner way to do this?
because we need the service side meta/master tracking to happen using ZK and just the client side to use HMasterAsyncRegistry
Oh, didn't know that was the intention. I thought it was just fine to leave even server side processes acting as client to use the master registry. Is the concern here related to an rpc overload?
I have the same question. Only the masters need to do things differently. All other clients, including embedded server side clients, should use the new service? So we get a nice common interposition point between all clients and the source of truth. Except the masters themselves, of course, which need to go directly to the source of truth.
@apurtell Agree, that makes the whole design simpler. That is my end goal.
    conf.set(HMasterAsyncRegistry.CONF_KEY, HostAndPort.fromParts(masterHostName, masterPort).toString());
    conf.set(AsyncRegistryFactory.REGISTRY_IMPL_CONF_KEY, HMasterAsyncRegistry.class.getName());
Could be set on TEST_UTIL config before starting the cluster, so that its async connection impl already uses HMasterAsyncRegistry?
Same comment as above, the reason I didn't do that is because that enables this registry on Region servers too. This is purely a client side config. Let me know if you know of a cleaner way to do this.
This is purely a client side config
Yeah, that confused me. I'm just wondering now what the impact would be of having it also set on the server side. Because that may become a common config mistake, if it's critical, maybe it's worth starting to think about splitting the connection creation path between client and server side?
        zk.getZNodePaths().getMetaReplicaIdFromZnode(input);
        fail("Exception not hit getMetaReplicaIdFromZnode(): " + input);
      } catch (NumberFormatException e) {
        // Expected
nit: maybe we can use assertEquals with the NFE's error message, if available?
I think that will be tricky and probably needs more plumbing and not worth it I guess.
I think it's fine as it is, or if you prefer you can use JUnit's expected attribute on the test annotation:
@Test(expected = NumberFormatException.class)
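For readers comparing the two styles discussed in this thread (try / fail / catch versus declaring the expected exception), here is a dependency-free sketch of a helper that returns the thrown exception so its message can also be inspected. `expectThrows` is a hypothetical helper written for this illustration, and `Integer.parseInt` merely stands in for `getMetaReplicaIdFromZnode`; JUnit 4.13 and later offer `Assert.assertThrows` for the same purpose.

```java
import java.util.concurrent.Callable;

final class ExpectThrowsSketch {
  // Runs the body and returns the exception if it has the expected type.
  // Fails with AssertionError if nothing, or something else, is thrown.
  static <T extends Throwable> T expectThrows(Class<T> type, Callable<?> body) {
    try {
      body.call();
    } catch (Throwable t) {
      if (type.isInstance(t)) {
        return type.cast(t);
      }
      throw new AssertionError("Unexpected exception type: " + t, t);
    }
    throw new AssertionError("Expected " + type.getSimpleName() + " but nothing was thrown");
  }

  public static void main(String[] args) {
    // Integer.parseInt is only a stand-in for the znode parsing under test.
    NumberFormatException e =
        expectThrows(NumberFormatException.class, () -> Integer.parseInt("meta-region-server-x"));
    System.out.println(e.getMessage());
  }
}
```

Returning the exception lets a test assert on the message when that is stable, which the annotation-only form cannot do.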
final class AsyncRegistryFactory {

-  static final String REGISTRY_IMPL_CONF_KEY = "hbase.client.registry.impl";
+  public static final String REGISTRY_IMPL_CONF_KEY = "hbase.client.registry.impl";
IMHO unless some source code requires this to be public, we can keep this unchanged. For test case, we can directly use "hbase.client.registry.impl". Should be fine?
Replied in another comment.
I think we should keep it as a public constant, as per @bharathv explanation on the other comment.
So the idea here is the clients will instantiate one type of REGISTRY_IMPL, which contacts masters via RPC, and the masters will instantiate another type of REGISTRY_IMPL that talks to zookeeper directly? How is the distinction managed? Maybe I'll answer my own question upon further review...
Same concern here. My understanding now is that both server and client will rely on the same config property, which seems error prone to me. That's why I mentioned in my previous comment that maybe we should think about splitting the connection creation path between server-based and client-based callers.
Agree with the comments. Like @apurtell mentioned in another comment, I will work towards making this the default registry implementation for both clients and internal service connections (except Master, which obviously goes to ZK, the source of truth). There is nothing in the implementation that prevents this. It was only done because the HBaseMiniCluster used in tests picks random ports (for running concurrent tests) and the clients don't know beforehand what the correct master port to use in the config would be.
So if you see the pattern in tests, we wait for the mini cluster to be up, get the running master and its port, and then create a new Connection object based on that config. Once I figure out a way to force the mini-cluster to use certain known ports (without affecting the test concurrency, of course), we can get rid of the whole custom-config / split-config business. I'm looking into it. Hope it clarifies the intention.
@InterfaceAudience.Private
public class HMasterAsyncRegistry implements AsyncRegistry {
  private static final Logger LOG = LoggerFactory.getLogger(ZKAsyncRegistry.class);
  public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
Same here, good to keep it private unless source code needs it?
You probably already noticed, they are referred to in tests. Since these are "static final", they are immutable and read only. If we end up making it private and make copies of the string in tests, it is very difficult to refactor if one wants to change the config key (you'll have to grep + replace all the occurrences). I could be wrong but I see this pattern elsewhere in this project. (git grep "public static final" | egrep -v "Test|generated")
Yeah, let's keep this constant public.
hbase-client/src/main/java/org/apache/hadoop/hbase/client/HMasterAsyncRegistry.java
  private static final Logger LOG = LoggerFactory.getLogger(MetaRegionLocationCache.class);

  // Maximum number of times we retry when ZK operation times out. Should this be configurable?
  private static final int MAX_ZK_META_FETCH_RETRIES = 10;
I think it's better to make it configurable and maybe keep 10 (not sure about the right value) as the default in the absence of a site config.
Don't think it really needs to be configured. If it really comes to that point, something is pretty badly screwed up. Let me think more about this and get back.
Oh, so control of the number of retries is not supposed to be given to the user?
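To make the trade-off in this thread concrete, a configurable retry bound with a default of 10 could be sketched as below. This is only an illustration: `java.util.Properties` stands in for the HBase `Configuration` object, and the key name `example.meta.fetch.retries` is invented for the example, not an existing HBase setting.

```java
import java.util.Properties;
import java.util.function.Supplier;

final class RetryConfigSketch {
  // Hypothetical key name used only in this sketch.
  static final String RETRIES_KEY = "example.meta.fetch.retries";
  static final int DEFAULT_RETRIES = 10;

  // Reads the retry bound, falling back to the default when the key is absent.
  static int maxRetries(Properties conf) {
    return Integer.parseInt(conf.getProperty(RETRIES_KEY, String.valueOf(DEFAULT_RETRIES)));
  }

  // Calls op up to maxAttempts times and rethrows the last failure as the cause.
  static <T> T callWithRetries(Supplier<T> op, int maxAttempts) {
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.get();
      } catch (RuntimeException e) {
        last = e;
      }
    }
    throw new IllegalStateException("Gave up after " + maxAttempts + " attempts", last);
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(maxRetries(conf));
    conf.setProperty(RETRIES_KEY, "3");
    System.out.println(maxRetries(conf));
  }
}
```

Keeping the default in code means operators only touch the setting in the unusual case the author describes, where something is already badly wrong.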
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestHMasterAsyncRegistryRPCs.java
Generally, when an HBase client creates a cluster Connection, it fetches a bunch of metadata from ZooKeeper (such as the active master address, cluster ID, and meta locations) before it creates the underlying transport. However, exposing ZK to all the clients is a DDoS risk, and ZK connections have caused issues in the past by not timing out on blocking RPCs (more context in the JIRA). This patch attempts to remove this ZK dependency by making the client fetch all the meta information directly from a list of client-configured HMaster endpoints. The patch adds a new AsyncRegistry implementation that encapsulates the logic of fetching this meta information from the provided master endpoints. New RPCs are added to the HMasters to help fetch this information.

Meta HRL caching:
----------------
One critical piece of metadata needed by clients to query tables is the meta HRegionLocations. These are fetched from ZK by default. Since this patch moves away from ZK, it adds an in-memory cache of these locations on both Active and Standby HMasters. ZK listeners are registered to keep the cache up to date.

New client configs:
------------------
- 'hbase.client.asyncregistry.masteraddrs' should be set to a comma-separated list of HMaster host:port addresses.
- It should be used in conjunction with 'hbase.client.registry.impl' set to the HMasterAsyncRegistry class.
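As an illustration of the value format described for 'hbase.client.asyncregistry.masteraddrs', the sketch below parses a comma-separated host:port list. It is not the patch's parsing code: `MasterAddrsSketch` and `HostPort` are hypothetical names, the host names are made up, and IPv6 literals are deliberately not handled.

```java
import java.util.ArrayList;
import java.util.List;

final class MasterAddrsSketch {
  // Hypothetical value type for one parsed entry.
  static final class HostPort {
    final String host;
    final int port;

    HostPort(String host, int port) {
      this.host = host;
      this.port = port;
    }

    @Override
    public String toString() {
      return host + ":" + port;
    }
  }

  // Parses "host1:port1,host2:port2" and rejects malformed or empty input.
  static List<HostPort> parse(String value) {
    List<HostPort> result = new ArrayList<>();
    for (String entry : value.split(",")) {
      String trimmed = entry.trim();
      if (trimmed.isEmpty()) {
        continue;
      }
      int colon = trimmed.lastIndexOf(':');
      if (colon <= 0 || colon == trimmed.length() - 1) {
        throw new IllegalArgumentException("Expected host:port, got: " + trimmed);
      }
      int port = Integer.parseInt(trimmed.substring(colon + 1));
      result.add(new HostPort(trimmed.substring(0, colon), port));
    }
    if (result.isEmpty()) {
      throw new IllegalArgumentException("No master addresses configured");
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(parse("master1.example.com:16000, master2.example.com:16000"));
  }
}
```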
bharathv
left a comment
Thanks for taking a look. Addressed/answered some comments/questions. Added more test coverage. Please let me know what you think.
I configured the checkstyle plugin and fixed most of the checkstyle issues and messy indents. Still figuring out the import order mess. For some reason it is never happy, whatever way I fix the imports.
💔 -1 overall
This message was automatically generated.
    parseHortPorts();
    // Passing the default cluster ID means that the token based authentication does not work for
    // certain client implementations.
    // TODO(bharathv): Figure out a way to fetch the CLUSTER ID using a non authenticated way.
One thought from the JIRA discussion was to add a servlet which does not require authentication. So, a separate instance of the infoserver stack, simply serving the cluster ID. A bit heavyweight, but an alternative like a separate instance of the RPC stack with auth requirements disabled serving only one iface/method isn't any less heavy.
  }

  @VisibleForTesting
  AsyncRegistry getRegistry() { return registry; }
Hmm. Fine for master and branch-2, but note a headache here for branch-1 (if a backport is desired). The precursor implementation in branch-1 is ClusterRegistry. AsyncRegistry was a big refactor via HBASE-16835. Not suggesting this needs be different, and some simple substitutions may get you most of the way there. Just want to point this out.
@InterfaceAudience.Private
public class HMasterAsyncRegistry implements AsyncRegistry {
  private static final Logger LOG = LoggerFactory.getLogger(HMasterAsyncRegistry.class);
  public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
Nit: This doesn't need "asyncregistry" in the key name. We can anticipate multiple consumers of "hbase.client.master.addrs" (suggestion): HMasterAsyncRegistry, HMasterClusterRegistry (in branch-1), something else...
Or even "hbase.master.addrs"
Can you make this CONF_KEY more specific/meaningful?
While we're nit-picking, I prefer to see all public static grouped separate from private static, and the public interface coming earlier in the file.
      masterServers.add(ServerName.valueOf(hostPort, ServerName.NON_STARTCODE));
    }
    Preconditions.checkArgument(!masterServers.isEmpty(), String.format("%s is empty", CONF_KEY));
    // Randomize so that not every client sends requests in the same order.
Good
   * master found.
   */
  private String getClusterIdHelper() throws MasterNotRunningException {
    // Loop through all the masters serially. We could be hitting some standby masters which cannot
It's worth asking: Why do we need an active master to tell us what the cluster ID is? All the masters can look at the file in HDFS regardless of role. Standbys aren't special here, they can only serve one cluster, just like the active. If we fix this so any master can respond to this query then it's a small improvement in overall availability.
That's a good point. I'm looking into it.
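The pattern under discussion (shuffle the configured masters, then probe them one by one until one answers) can be sketched as follows. This is an illustration under assumptions, not the patch's code: `firstSuccessful` is a hypothetical helper, the RPC is modelled as a plain `Function`, and failures are modelled as unchecked exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.function.Function;

final class MasterProbeSketch {
  // Tries each master in random order and returns the first successful answer.
  static <T> T firstSuccessful(List<String> masters, Function<String, T> rpc, Random random) {
    List<String> order = new ArrayList<>(masters);
    // Randomize so that not every client probes the masters in the same order.
    Collections.shuffle(order, random);
    RuntimeException last = null;
    for (String master : order) {
      try {
        return rpc.apply(master);
      } catch (RuntimeException e) {
        last = e; // e.g. a standby that cannot answer yet; try the next one
      }
    }
    throw new IllegalStateException("No master answered; probed " + order, last);
  }

  public static void main(String[] args) {
    List<String> masters = Arrays.asList("standby1:16000", "active:16000", "standby2:16000");
    String clusterId = firstSuccessful(masters, m -> {
      if (m.startsWith("standby")) {
        throw new IllegalStateException(m + " is not initialized");
      }
      return "cluster-1";
    }, new Random());
    System.out.println(clusterId);
  }
}
```

If any master could serve the cluster ID, as the reviewer proposes, the loop would normally return on its first probe.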
      throws ServiceException {
    GetClusterIdResponse.Builder response = GetClusterIdResponse.newBuilder();
    try {
      master.checkInitialized();
See my above comment. Initialize just this bit of info from HDFS if needed. Does not require active role.
Ack.
   * helper should only used if one wants to test a custom client side configuration that differs from the conf used to
   * spawn the mini-cluster.
   */
  public AsyncClusterConnection getCustomConnection(Configuration conf) throws IOException {
This is confusing.
"Custom"?
Will try to get rid of this.
    hostAndPorts.add(HostAndPort.fromParts(masterHostName, masterPort));
    final String config = Joiner.on(",").join(hostAndPorts);
    conf.set(HMasterAsyncRegistry.CONF_KEY, config);
    conf.set(AsyncRegistryFactory.REGISTRY_IMPL_CONF_KEY, HMasterAsyncRegistry.class.getName());
Ok, I see where this is set for testing, but not where it's established as a default for clients.
Should somehow distinguish between master and not-master roles and choose the right one, allowing for custom config override.
bharathv
left a comment
Just flushing out some design comments.
public class HMasterAsyncRegistry implements AsyncRegistry {
  private static final Logger LOG = LoggerFactory.getLogger(HMasterAsyncRegistry.class);
  public static final String CONF_KEY = "hbase.client.asyncregistry.masteraddrs";
  private static final String DEFAULT_HOST_PORT = "localhost:" + HConstants.DEFAULT_MASTER_PORT;
Is this "localhost" being default value for DEFAULT_HOST_PORT useful?
import org.slf4j.LoggerFactory;

/**
 * Fetches the meta information directly from HMaster by making relevant RPCs. HMaster RPC end
nit: endpoint should be one word.
  public HMasterAsyncRegistry(Configuration config) {
    masterServers = new ArrayList<>();
    conf = config;
    parseHortPorts();
typo here?
HortPort?
Done.
| // certain client implementations. | ||
| // TODO(bharathv): Figure out a way to fetch the CLUSTER ID using a non authenticated way. | ||
| rpcClient = RpcClientFactory.createClient(conf, HConstants.CLUSTER_ID_DEFAULT); | ||
| rpcTimeout = (int) Math.min(Integer.MAX_VALUE, |
So you do toNanos here, is int good enough?
Looks like int is used throughout our RPC call stack. It's not well documented, but I think this timeout is supposed to be milliseconds, not nanoseconds. I followed method calls and member variables back to HBaseRpcControllerImpl#callTimeout.
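To illustrate the unit concern raised above, here is a minimal sketch. It assumes a 60,000 ms timeout purely as an illustrative value, and `RpcTimeouts` is a hypothetical helper that is not part of the patch. Converting to nanoseconds before clamping to `int` saturates at `Integer.MAX_VALUE`, whereas clamping the millisecond value leaves it intact.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper, not part of the patch: compares the two clamping strategies. */
final class RpcTimeouts {
  private RpcTimeouts() {}

  /** Clamps a millisecond timeout into the non-negative int range, keeping the unit. */
  static int clampMillis(long timeoutMs) {
    return (int) Math.max(0L, Math.min(Integer.MAX_VALUE, timeoutMs));
  }

  /** Converts to nanoseconds first, then clamps; saturates for anything above ~2.1 s. */
  static int clampNanos(long timeoutMs) {
    return (int) Math.min(Integer.MAX_VALUE, TimeUnit.MILLISECONDS.toNanos(timeoutMs));
  }
}
```

With these helpers, `clampMillis(60_000)` stays at 60000, while `clampNanos(60_000)` collapses to `Integer.MAX_VALUE`, which is likely not what a millisecond-based call stack expects.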
| } | ||
| if (result == null || result.isEmpty()) { | ||
| throw new MetaRegionsNotAvailableException(String.format( | ||
| "Meta locations not found. Probed masters: %s", conf.get(CONF_KEY, DEFAULT_HOST_PORT))); |
nit:"Meta location not found"
It was meant to cover multiple replicas of the meta region.
| } | ||
|
|
||
| /** | ||
| * Picks the first master entry from 'masterHortPorts' to fetch the meta region locations. |
typo:
masterHortPorts
Oops, nice catch.
Might as well make this an {@link #methodName}.
But actually the method definition in the interface has a javadoc, so having one here doesn't add much.
| GetClusterIdResponse resp = | ||
| stub.getClusterId(rpcController, GetClusterIdRequest.getDefaultInstance()); | ||
| return resp.getClusterId(); | ||
| } catch (IOException e) { |
so, what's the point separating IOException and ServiceException here?
can you combine them?
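A minimal sketch of what combining the two handlers could look like, using Java's multi-catch. `FakeServiceException`, `Stub`, and `ClusterIdFetcher` are stand-ins defined here only so the example compiles on its own; they are not the classes from the patch.

```java
import java.io.IOException;

/** Stand-in for the protobuf ServiceException; defined only to keep the sketch self-contained. */
class FakeServiceException extends Exception {
  FakeServiceException(String msg) {
    super(msg);
  }
}

/** Hypothetical fetcher showing one handler for both exception types. */
final class ClusterIdFetcher {
  private ClusterIdFetcher() {}

  /** Stand-in for the blocking master stub. */
  interface Stub {
    String getClusterId() throws IOException, FakeServiceException;
  }

  /** Returns the cluster ID, or null if the call failed with either exception type. */
  static String fetchOrNull(Stub stub) {
    try {
      return stub.getClusterId();
    } catch (IOException | FakeServiceException e) {
      // Both failure modes are handled identically, so a single multi-catch suffices.
      System.err.println("Error fetching cluster ID: " + e.getMessage());
      return null;
    }
  }
}
```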
| populateInitialMetaLocations(); | ||
| } | ||
|
|
||
| private void populateInitialMetaLocations() { |
should we do exponential backoff retry here?
+1. Should follow our existing RPC retry patterns, configurable for operators to tweak, &c. See RetryCounterFactory.
Ah this is cool. Done.
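For readers unfamiliar with the pattern discussed in this thread, a rough sketch of exponential backoff follows. It deliberately does not use `RetryCounterFactory`; the class `BackoffRetry` and its parameters are hypothetical and only illustrate the idea of doubling the sleep between attempts up to a cap.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of HBase: retries a call with capped exponential backoff. */
final class BackoffRetry {
  private BackoffRetry() {}

  /** Sleep before retrying after the given 0-based attempt, capped at maxSleepMs. */
  static long sleepMillis(int attempt, long baseSleepMs, long maxSleepMs) {
    // Cap the shift so the multiplier itself cannot overflow a long.
    long factor = 1L << Math.min(attempt, 30);
    // Compare via division so the product is only computed when it fits under the cap.
    if (baseSleepMs > maxSleepMs / factor) {
      return maxSleepMs;
    }
    return baseSleepMs * factor;
  }

  /** Runs op up to maxAttempts times, sleeping between failures; rethrows the last failure. */
  static <T> T call(Callable<T> op, int maxAttempts, long baseSleepMs, long maxSleepMs)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    Exception last = null;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return op.call();
      } catch (InterruptedException e) {
        // Never retry through an interrupt; restore the flag and bail out.
        Thread.currentThread().interrupt();
        throw e;
      } catch (Exception e) {
        last = e;
        if (attempt + 1 < maxAttempts) {
          Thread.sleep(sleepMillis(attempt, baseSleepMs, maxSleepMs));
        }
      }
    }
    throw last;
  }
}
```

In a real change the attempt count and sleep bounds would come from configuration so operators can tune them, as suggested in the review.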
| } | ||
| RegionState state = null; | ||
| int retries = 0; | ||
| while (retries++ < MAX_ZK_META_FETCH_RETRIES) { |
same as above : should we do exponential backoff retry here?
nod
Done.
ndimiduk
left a comment
This is great!
| DELETED | ||
| }; | ||
|
|
||
| public MetaRegionLocationCache(ZKWatcher zkWatcher) { |
Can be package-private.
Done.
| updateMetaLocation(path, ZNodeOpType.INIT); | ||
| } | ||
| break; | ||
| } catch (KeeperException.OperationTimeoutException e) { |
Is OperationTimeoutException the only subclass that should be retried? What about ConnectionLossException, SessionExpiredException, or SessionMovedException? Should this class attempt to gracefully ride over these as well, or is that the responsibility of a higher power?
Switched the code to retry for all KeeperExceptions.
| } | ||
| } | ||
| if (state == null) { | ||
| cachedMetaLocations.put(replicaId, null); |
What's the intention behind storing null? Probably you want to delete the entry instead. If there's a reason to know that the received state was null, better to have an explicit enum variant, UNKNOWN for example. I don't really see how that's useful to a client, but it's more informative than null.
That's a fair point. I've updated the code to delete the entry.
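A small sketch of the "delete instead of storing null" approach. It assumes the cache is a map keyed by replica ID and uses a `String` as a stand-in for the location type; the class `ReplicaLocationCache` is hypothetical and the actual type of `cachedMetaLocations` is not shown in this conversation. Note that `ConcurrentHashMap` rejects null values outright, which is one more reason to remove the entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache, not the patch's class: maps replica ID to a location string. */
final class ReplicaLocationCache {
  // ConcurrentHashMap throws NullPointerException on null keys or values.
  private final Map<Integer, String> locations = new ConcurrentHashMap<>();

  /** Stores the location, or removes the entry when no location is known. */
  void update(int replicaId, String location) {
    if (location == null) {
      locations.remove(replicaId);
    } else {
      locations.put(replicaId, location);
    }
  }

  /** Returns the cached location, or null if the replica has no entry. */
  String get(int replicaId) {
    return locations.get(replicaId);
  }

  int size() {
    return locations.size();
  }
}
```

Callers then see a missing key rather than a null value, so "no entry" has a single meaning.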
| stub.getClusterId(rpcController, GetClusterIdRequest.getDefaultInstance()); | ||
| return resp.getClusterId(); | ||
| } catch (IOException e) { | ||
| LOG.warn("Error fetching cluster ID from master: {}", sname, e); |
Same comment re: log level and using e.getMessage().
| "Meta locations not found. Probed masters: %s", conf.get(CONF_KEY, DEFAULT_HOST_PORT))); | ||
| } | ||
| List<HRegionLocation> deserializedResult = new ArrayList<>(); | ||
| result.stream().forEach( |
result.forEach().
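A short sketch of the suggestion, using a hypothetical `ForEachSketch` class and string lengths as a stand-in for the deserialization step. `Iterable.forEach` avoids creating a stream when the only operation is a side-effecting loop; when the goal is to build a new list, `map` plus `collect` is an alternative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical example, not from the patch. */
final class ForEachSketch {
  private ForEachSketch() {}

  /** Iterates directly on the list; no intermediate stream is needed. */
  static List<Integer> lengthsViaForEach(List<String> in) {
    List<Integer> out = new ArrayList<>();
    in.forEach(s -> out.add(s.length()));
    return out;
  }

  /** Stream variant that expresses the conversion as a mapping instead of a side effect. */
  static List<Integer> lengthsViaMap(List<String> in) {
    return in.stream().map(String::length).collect(Collectors.toList());
  }
}
```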
| returns(GetClusterStatusResponse); | ||
|
|
||
| /** Returns whether this master is active or not. Served on both active/standby masters.*/ | ||
| rpc IsActive(IsActiveRequest) returns(IsActiveResponse); |
How is the active master identified? Is it possible that the standby masters maintain awareness of the active? If so, it should be possible to remove this boolean query RPC and instead have any master reply with the current master if it is unable to service the request.
#812 - You are already a reviewer on that :-)
|
|
||
| import static org.apache.hadoop.hbase.client.RegionReplicaTestHelper.testLocator; | ||
|
|
||
| import org.apache.commons.io.IOUtils; |
unused import.
| * Tests basic create, put, scan operations using the connection. | ||
| */ | ||
| @Test | ||
| public void testCustomConnectionBasicOps() throws Exception { |
Can we not parameterize the various TestFromClientSide* classes to use both the new and the old methods?
#807 is a blocker to parameterize this. Will do it once that is committed.
bharathv
left a comment
| DEFAULT_HBASE_RPC_TIMEOUT))); | ||
| } | ||
|
|
||
| private void parseHortPorts() { |
Done.
| private AssignmentManager assignmentManager; | ||
|
|
||
| // Cache of meta locations indexed by replicas | ||
| private MetaRegionLocationCache metaRegionLocationCache; |
Done.
| if (!isValidMetaZNode(path)) { | ||
| return; | ||
| } | ||
| LOG.info("Meta znode for path {}: {}", path, opType.name()); |
Switched to debug. Didn't fully get what you meant in the last statement, but the intention for this logging here is to collate the timestamps in case there are issues like meta locations seen by clients are stale etc.
| .get(); | ||
| util.getAdmin().move(regionInfo.getEncodedNameAsBytes(), newServerName); | ||
| if (regionInfo.isMetaRegion()) { | ||
| // Invalidate the meta cache forcefully to avoid test races. Otherwise there might be a |
Undone.
|
@bharathv this PR can be closed in favor of the individual subtasks, yeah? |
|
@ndimiduk Yes. Thanks to all the reviewers for the initial feedback. Those who are interested, please follow the PRs for subtasks of HBASE-18095. |
Generally, when an HBase client creates a cluster
Connection, it fetches a set of metadata from ZooKeeper
(such as the active master address, cluster ID, and meta
locations) before it creates the underlying transport.
However, exposing ZK to all clients is a DDoS risk, and ZK
connections have in the past caused issues by not timing out
on blocking RPCs (more context in the JIRA).
This patch attempts to remove this ZK dependency by making
the client fetch all the meta information directly from a list
of client-configured HMaster endpoints. The patch adds a
new AsyncRegistry implementation that encapsulates the logic
of fetching this meta information from the provided master
endpoints. New RPCs are added to the HMasters to help fetch
this information.
Meta HRL caching:
One critical piece of metadata that clients need to query tables
is the set of meta HRegionLocations. These are fetched from ZK by
default. Since this patch moves away from ZK, it adds an in-memory
cache of these locations on both active and standby HMasters. ZK
listeners are registered to keep the cache up to date.
New client configs:
list of comma separated HMaster host:port addresses.
set to HMasterAsyncRegistry class.
not accessible etc).
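As an illustration of the comma separated HMaster host:port list mentioned under the new client configs, here is a rough parsing sketch. `MasterAddrParser` is a hypothetical class, the default port is passed in by the caller rather than assumed, and IPv6 literals are not handled; the actual parsing in the patch may differ.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser, not the patch's code: turns "h1:p1,h2" into unresolved addresses. */
final class MasterAddrParser {
  private MasterAddrParser() {}

  /** Parses a comma separated list; entries without a port fall back to defaultPort. */
  static List<InetSocketAddress> parse(String csv, int defaultPort) {
    List<InetSocketAddress> out = new ArrayList<>();
    for (String entry : csv.split(",")) {
      String hostPort = entry.trim();
      if (hostPort.isEmpty()) {
        continue; // tolerate stray commas and surrounding whitespace
      }
      int idx = hostPort.lastIndexOf(':');
      String host = idx < 0 ? hostPort : hostPort.substring(0, idx);
      int port = idx < 0 ? defaultPort : Integer.parseInt(hostPort.substring(idx + 1));
      // Unresolved on purpose: no DNS lookup happens while parsing configuration.
      out.add(InetSocketAddress.createUnresolved(host, port));
    }
    return out;
  }
}
```

For example, `parse("m1:16000, m2 ,", 16000)` yields two addresses, with `m2` picking up the supplied default port.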