Commit 683ce0c

HBASE-23549 Document steps to disable MOB for a column family (#928)
Signed-off-by: Peter Somogyi <psomogyi@apache.org>
Signed-off-by: Josh Elser <elserj@apache.org>
(cherry picked from commit 17e180e)
1 parent b0145a4 commit 683ce0c

1 file changed

Lines changed: 157 additions & 0 deletions

src/main/asciidoc/_chapters/hbase_mob.adoc

@@ -311,3 +311,160 @@ $> hdfs dfs -find /hbase -name \
d41d8cd98f00b204e9800998ecf8427e19700118ffd9c244fe69488bbc9f2c77d24a3e6a
/hbase/mobdir/data/default/some_table/372c1b27e3dc0b56c3a031926e5efbe9/foo/d41d8cd98f00b204e9800998ecf8427e19700118ffd9c244fe69488bbc9f2c77d24a3e6a
----

==== Moving a column family out of MOB

If you want to disable MOB on a column family, you must first instruct HBase to migrate the data
out of the MOB system. If you turn the feature off without doing so, HBase will return the internal
MOB metadata to applications instead of the actual values, because it will no longer know that the
references need to be resolved.

The following procedure safely migrates the underlying data without requiring a cluster outage.
Clients will see a number of retries when configuration settings are applied and regions are
reloaded.

.Procedure: Stop MOB maintenance, change MOB threshold, rewrite data via compaction
. Ensure the MOB compaction chore in the Master is off by setting
`hbase.mob.file.compaction.chore.period` to `0`. Applying this configuration change requires a
rolling restart of the HBase Masters. That involves at least one fail-over of the active master,
which may cause retries for clients doing HBase administrative operations.
. Ensure no MOB compactions are issued for the table via the HBase shell for the duration of this
migration.
. Use the HBase shell to change the MOB size threshold for the column family you are migrating to a
value that is larger than the largest cell present in the column family. For example, given a table
named 'some_table' and a column family named 'foo', we can pick one gigabyte as an arbitrary
"bigger than what we store" value:
+
----
hbase(main):011:0> alter 'some_table', {NAME => 'foo', MOB_THRESHOLD => '1000000000'}
Updating all regions with the new schema...
9/25 regions updated.
25/25 regions updated.
Done.
0 row(s) in 3.4940 seconds
----
+
Note that if you are still ingesting data you must ensure this threshold is larger than any cell
value you might write; `Integer.MAX_VALUE` (2147483647) would be a safe choice.

. Perform a major compaction on the table. Specifically, you are performing a "normal" compaction
and not a MOB compaction.
+
----
hbase(main):012:0> major_compact 'some_table'
0 row(s) in 0.2600 seconds
----

. Monitor for the end of the major compaction. Since compaction is handled asynchronously, you'll
need to use the shell to first see the compaction start and then see it end.
+
HBase should first say that a "MAJOR" compaction is happening.
+
----
hbase(main):015:0> @hbase.admin(@formatter).instance_eval do
hbase(main):016:1* p @admin.get_compaction_state('some_table').to_string
hbase(main):017:2* end
"MAJOR"
----
+
When the compaction has finished, the result should print out "NONE".
+
----
hbase(main):015:0> @hbase.admin(@formatter).instance_eval do
hbase(main):016:1* p @admin.get_compaction_state('some_table').to_string
hbase(main):017:2* end
"NONE"
----
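+
The start-then-end check above can also be scripted. The following is a minimal sketch, not part of
HBase: `wait_for_compaction` is a hypothetical helper, and it assumes you supply your own command
that prints the bare compaction state (`NONE`, `MINOR`, `MAJOR`, or `MAJOR_AND_MINOR`) for the
table, for example a wrapper around the shell snippet shown above.
+
```shell
#!/bin/sh
# Hypothetical helper: poll a user-supplied command until a major compaction
# has been seen to start and then finish.
# Usage: wait_for_compaction INTERVAL_SECONDS MAX_POLLS STATE_COMMAND [ARGS...]
# STATE_COMMAND is an assumption: it must print the bare state, e.g. MAJOR or NONE.
wait_for_compaction() {
  interval="$1"; max_polls="$2"; shift 2
  started=0; polls=0
  while [ "$polls" -lt "$max_polls" ]; do
    state="$("$@")"
    case "$state" in
      # The compaction we asked for is running.
      MAJOR|MAJOR_AND_MINOR) started=1 ;;
      # NONE only counts as "finished" after we have seen it running.
      NONE) if [ "$started" -eq 1 ]; then return 0; fi ;;
    esac
    polls=$((polls + 1))
    sleep "$interval"
  done
  # Gave up: the compaction never started or never finished within MAX_POLLS.
  return 1
}
```
+
Requiring a running state before accepting `NONE` avoids mistaking "not started yet" for
"finished". The poll limit matters because a very short compaction can complete between two polls,
in which case the helper times out and you should check the state by hand.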
. Run the _mobrefs_ utility to ensure there are no MOB cells. Specifically, the tool will launch a
Hadoop MapReduce job that will show a job counter of 0 input records when we've successfully
rewritten all of the data.
+
----
$> HADOOP_CLASSPATH=/etc/hbase/conf:$(hbase mapredcp) yarn jar \
    /some/path/to/hbase-shaded-mapreduce.jar mobrefs mobrefs-report-output some_table foo
...
19/12/10 11:38:47 INFO impl.YarnClientImpl: Submitted application application_1575695902338_0004
19/12/10 11:38:47 INFO mapreduce.Job: The url to track the job: https://rm-2.example.com:8090/proxy/application_1575695902338_0004/
19/12/10 11:38:47 INFO mapreduce.Job: Running job: job_1575695902338_0004
19/12/10 11:38:57 INFO mapreduce.Job: Job job_1575695902338_0004 running in uber mode : false
19/12/10 11:38:57 INFO mapreduce.Job: map 0% reduce 0%
19/12/10 11:39:07 INFO mapreduce.Job: map 7% reduce 0%
19/12/10 11:39:17 INFO mapreduce.Job: map 13% reduce 0%
19/12/10 11:39:19 INFO mapreduce.Job: map 33% reduce 0%
19/12/10 11:39:21 INFO mapreduce.Job: map 40% reduce 0%
19/12/10 11:39:22 INFO mapreduce.Job: map 47% reduce 0%
19/12/10 11:39:23 INFO mapreduce.Job: map 60% reduce 0%
19/12/10 11:39:24 INFO mapreduce.Job: map 73% reduce 0%
19/12/10 11:39:27 INFO mapreduce.Job: map 100% reduce 0%
19/12/10 11:39:35 INFO mapreduce.Job: map 100% reduce 100%
19/12/10 11:39:35 INFO mapreduce.Job: Job job_1575695902338_0004 completed successfully
19/12/10 11:39:35 INFO mapreduce.Job: Counters: 54
...
    Map-Reduce Framework
        Map input records=0
...
19/12/09 22:41:28 INFO mapreduce.MobRefReporter: Finished creating report for 'some_table', family='foo'
----
+
If the data has not been successfully migrated out, this report will show both a non-zero number
of input records and a count of MOB cells.
+
----
$> HADOOP_CLASSPATH=/etc/hbase/conf:$(hbase mapredcp) yarn jar \
    /some/path/to/hbase-shaded-mapreduce.jar mobrefs mobrefs-report-output some_table foo
...
19/12/10 11:44:18 INFO impl.YarnClientImpl: Submitted application application_1575695902338_0005
19/12/10 11:44:18 INFO mapreduce.Job: The url to track the job: https://busbey-2.gce.cloudera.com:8090/proxy/application_1575695902338_0005/
19/12/10 11:44:18 INFO mapreduce.Job: Running job: job_1575695902338_0005
19/12/10 11:44:26 INFO mapreduce.Job: Job job_1575695902338_0005 running in uber mode : false
19/12/10 11:44:26 INFO mapreduce.Job: map 0% reduce 0%
19/12/10 11:44:36 INFO mapreduce.Job: map 7% reduce 0%
19/12/10 11:44:45 INFO mapreduce.Job: map 13% reduce 0%
19/12/10 11:44:47 INFO mapreduce.Job: map 27% reduce 0%
19/12/10 11:44:48 INFO mapreduce.Job: map 33% reduce 0%
19/12/10 11:44:50 INFO mapreduce.Job: map 40% reduce 0%
19/12/10 11:44:51 INFO mapreduce.Job: map 53% reduce 0%
19/12/10 11:44:52 INFO mapreduce.Job: map 73% reduce 0%
19/12/10 11:44:54 INFO mapreduce.Job: map 100% reduce 0%
19/12/10 11:44:59 INFO mapreduce.Job: map 100% reduce 100%
19/12/10 11:45:00 INFO mapreduce.Job: Job job_1575695902338_0005 completed successfully
19/12/10 11:45:00 INFO mapreduce.Job: Counters: 54
...
    Map-Reduce Framework
        Map input records=1
...
    MOB
        NUM_CELLS=1
...
19/12/10 11:45:00 INFO mapreduce.MobRefReporter: Finished creating report for 'some_table', family='foo'
----
+
If this happens, verify that MOB compactions are disabled, verify that you have picked
a sufficiently large MOB threshold, and redo the major compaction step.
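+
Reading the two counters by eye is easy to get wrong in a long job log. The following is a minimal
sketch of an automated check, assuming the job output was saved to a file (the console log normally
goes to stderr, so something like `2>&1 | tee mobrefs.log` is assumed). The function names are
hypothetical; only the counter names `Map input records` and `NUM_CELLS` come from the output above.
+
```shell
#!/bin/sh
# Print the value of the first counter line of the form NAME=<number>
# in the given log file, or nothing if the counter is absent.
counter_value() {
  name="$1"; log="$2"
  sed -n "s/^[[:space:]]*${name}=\([0-9][0-9]*\)[[:space:]]*$/\1/p" "$log" | head -n 1
}

# Hypothetical helper: succeed only when the saved mobrefs log shows zero map
# input records and no non-zero NUM_CELLS counter.
# Returns 2 when the "Map input records" counter is missing (incomplete log).
mob_migration_complete() {
  log="$1"
  records="$(counter_value 'Map input records' "$log")"
  cells="$(counter_value 'NUM_CELLS' "$log")"
  [ -n "$records" ] || return 2
  # NUM_CELLS does not appear in the zero-record output above, so default it to 0.
  [ "$records" -eq 0 ] && [ "${cells:-0}" -eq 0 ]
}
```
+
For example, `mob_migration_complete mobrefs.log && echo "no MOB cells found"` prints the message
only for a log like the first one above. Treat a non-zero result as a prompt to inspect the report,
not as a diagnosis.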
. When the _mobrefs_ report shows that no more data is stored in the MOB system, you can safely
alter the column family configuration so that the MOB feature is disabled.
+
----
hbase(main):017:0> alter 'some_table', {NAME => 'foo', IS_MOB => 'false'}
Updating all regions with the new schema...
8/25 regions updated.
25/25 regions updated.
Done.
0 row(s) in 2.9370 seconds
----
. After the column family no longer shows the MOB feature enabled, it is safe to start MOB
maintenance chores again. You can allow the default to be used for
`hbase.mob.file.compaction.chore.period` by removing it from your configuration files, or restore
it to whatever custom value you had prior to starting this process.
. Once the MOB feature is disabled for the column family, no internal HBase process will look for
data in the MOB storage area specific to this column family. Data written there before the
compaction process rewrote the values into HBase's data area will still be present.
You can check for this residual data directly in HDFS as an HBase superuser.
+
----
$> hdfs dfs -count /hbase/mobdir/data/default/some_table
4 54 9063269081 /hbase/mobdir/data/default/some_table
----
+
This leftover data may be reclaimed. You should sideline it, verify your application’s view
of the table, and then delete it.
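+
The columns printed by `hdfs dfs -count` are directory count, file count, content size in bytes,
and path. The following is a minimal sketch for reading that line and preparing the sideline step.
Both function names are hypothetical, the `/hbase` root directory and `default` namespace are taken
from the example above, and the sideline destination is an assumption you must choose yourself.
+
```shell
#!/bin/sh
# Summarize `hdfs dfs -count` output read from stdin
# (columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME).
summarize_count() {
  awk '{ printf "%s: %d dirs, %d files, %.2f GiB\n", $4, $1, $2, $3 / (1024 * 1024 * 1024) }'
}

# Print (do not run) the command that would move a table's MOB directory to a
# sideline location. Assumes the /hbase root dir and the default namespace.
sideline_command() {
  table="$1"; sideline_root="$2"
  printf 'hdfs dfs -mv /hbase/mobdir/data/default/%s %s/%s\n' \
    "$table" "$sideline_root" "$table"
}
```
+
Piping the sample line above through `summarize_count` reports 4 directories, 54 files, and about
8.44 GiB. `sideline_command some_table /some/sideline/dir` only prints the `hdfs dfs -mv` command so
that it can be reviewed before an HBase superuser runs it.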
