Merged

Changes from 6 commits
131 changes: 131 additions & 0 deletions bigtable/autoscaler/README.rst
@@ -0,0 +1,131 @@
.. This file is automatically generated. Do not edit this file directly.

Google Cloud Bigtable Python Samples
===============================================================================

This directory contains samples for Google Cloud Bigtable. `Google Cloud Bigtable`_ is Google's NoSQL Big Data database service. It's the same database that powers many core Google services, including Search, Analytics, Maps, and Gmail.


This sample demonstrates using `Stackdriver Monitoring`_
to scale Cloud Bigtable based on CPU usage.

.. _Stackdriver Monitoring: https://cloud.google.com/monitoring/docs


.. _Google Cloud Bigtable: https://cloud.google.com/bigtable/docs

Setup
-------------------------------------------------------------------------------


Authentication
++++++++++++++

Authentication is typically done through `Application Default Credentials`_,
which means you do not have to change the code to authenticate as long as
your environment has credentials. You have a few options for setting up
authentication:

#. When running locally, use the `Google Cloud SDK`_

   .. code-block:: bash

       gcloud auth application-default login


#. When running on App Engine or Compute Engine, credentials are already
   set up. However, you may need to configure your Compute Engine instance
   with `additional scopes`_.

#. You can create a `Service Account key file`_. This file can be used to
   authenticate to Google Cloud Platform services from any environment. To use
   the file, set the ``GOOGLE_APPLICATION_CREDENTIALS`` environment variable to
   the path to the key file, for example:

   .. code-block:: bash

       export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json

.. _Application Default Credentials: https://cloud.google.com/docs/authentication#getting_credentials_for_server-centric_flow
.. _additional scopes: https://cloud.google.com/compute/docs/authentication#using
.. _Service Account key file: https://developers.google.com/identity/protocols/OAuth2ServiceAccount#creatinganaccount

Install Dependencies
++++++++++++++++++++

#. Install `pip`_ and `virtualenv`_ if you do not already have them.

#. Create a virtualenv. Samples are compatible with Python 2.7 and 3.4+.

   .. code-block:: bash

       $ virtualenv env
       $ source env/bin/activate

#. Install the dependencies needed to run the samples.

   .. code-block:: bash

       $ pip install -r requirements.txt

.. _pip: https://pip.pypa.io/
.. _virtualenv: https://virtualenv.pypa.io/

Samples
-------------------------------------------------------------------------------

Autoscaling example
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



To run this sample:

.. code-block:: bash

    $ python autoscaler.py

    usage: autoscaler.py [-h] [--high_cpu_threshold HIGH_CPU_THRESHOLD]
                         [--low_cpu_threshold LOW_CPU_THRESHOLD]
                         [--short_sleep SHORT_SLEEP] [--long_sleep LONG_SLEEP]
                         bigtable_instance

    Scales Bigtable clusters based on CPU usage.

    positional arguments:
      bigtable_instance     ID of the Cloud Bigtable instance to connect to.

    optional arguments:
      -h, --help            show this help message and exit
      --high_cpu_threshold HIGH_CPU_THRESHOLD
                            If Bigtable CPU usage is above this threshold,
                            scale up
      --low_cpu_threshold LOW_CPU_THRESHOLD
                            If Bigtable CPU usage is below this threshold,
                            scale down
      --short_sleep SHORT_SLEEP
                            How long to sleep in seconds between checking
                            metrics after no scale operation
      --long_sleep LONG_SLEEP
                            How long to sleep in seconds between checking
                            metrics after a scaling operation




The client library
-------------------------------------------------------------------------------

This sample uses the `Google Cloud Client Library for Python`_.
You can read the documentation for more details on API usage and use GitHub
to `browse the source`_ and `report issues`_.

.. _Google Cloud Client Library for Python:
   https://googlecloudplatform.github.io/google-cloud-python/
.. _browse the source:
   https://github.com/GoogleCloudPlatform/google-cloud-python
.. _report issues:
   https://github.com/GoogleCloudPlatform/google-cloud-python/issues


.. _Google Cloud SDK: https://cloud.google.com/sdk/
27 changes: 27 additions & 0 deletions bigtable/autoscaler/README.rst.in
@@ -0,0 +1,27 @@
# This file is used to generate README.rst

product:
  name: Google Cloud Bigtable
  short_name: Cloud Bigtable
  url: https://cloud.google.com/bigtable/docs
  description: >
    `Google Cloud Bigtable`_ is Google's NoSQL Big Data database service. It's
    the same database that powers many core Google services, including Search,
    Analytics, Maps, and Gmail.

description: |
  This sample demonstrates using `Stackdriver Monitoring`_
  to scale Cloud Bigtable based on CPU usage.

  .. _Stackdriver Monitoring: https://cloud.google.com/monitoring/docs

setup:
- auth
- install_deps

samples:
- name: Autoscaling example
  file: autoscaler.py
  show_help: true

cloud_client_library: true
134 changes: 134 additions & 0 deletions bigtable/autoscaler/autoscaler.py
@@ -0,0 +1,134 @@
# Copyright 2017 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Sample that demonstrates how to use Bigtable Stackdriver metrics to
autoscale Google Cloud Bigtable."""

import argparse
import time

from google.cloud import bigtable
from google.cloud import monitoring

import strategies

CPU_METRIC = 'bigtable.googleapis.com/cluster/cpu_load'

[Reviewer] While constants are good practice in production code, in sample
code they add a layer of indirection that can be annoying when including a
snippet on the cloud site. Strongly prefer this to go near where it's used.
[Author] Done.


def get_cpu_load():
    """Returns the most recent Bigtable CPU load measurement.

    Returns:
        float: The most recent Bigtable CPU usage metric
    """
    client = monitoring.Client()
    query = client.query(CPU_METRIC, minutes=5)
    return list(query)[0].points[0].value

[Reviewer] For sample code, prefer pulling out discrete values and giving
them names:

    results = client.query(CPU_METRIC, minutes=5)
    results = list(query)
    first_result = results[0]
    cpu_load = first_result.points[0].value

Also, shouldn't that query have a limit of 1?
[Author] Done. The API doesn't support limits on queries.


def scale_bigtable(bigtable_instance, up):
    """Scales the number of Bigtable nodes up or down.

    Args:
        bigtable_instance (str): Cloud Bigtable instance ID to scale
        up (bool): If true, scale up, otherwise scale down
    """

[Reviewer] I'd recommend scale_up as a more descriptive name. Then you can
probably drop the Args section and just describe the behavior in the
description.
[Author] scale_up seems misleading in the case of downscale.
[Reviewer] And up is less misleading?
[Author] Sorry, misunderstood, thought you meant the function name. Done.

    bigtable_client = bigtable.Client(admin=True)
    instance = bigtable_client.instance(bigtable_instance)
    instance.reload()

    cluster = instance.cluster('{}-cluster'.format(bigtable_instance))
    cluster.reload()

    current_node_count = cluster.serve_nodes

    if current_node_count <= 3 and not up:
        # Can't downscale lower than 3 nodes
        return

    if up:
        strategies_dict = strategies.UPSCALE_STRATEGIES
    else:
        strategies_dict = strategies.DOWNSCALE_STRATEGIES

    strategy = strategies_dict['incremental']
    new_node_count = strategy(cluster.serve_nodes)
    cluster.serve_nodes = new_node_count
    cluster.update()
    print('Scaled from {} to {} nodes.'.format(
        current_node_count, new_node_count))
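
The `strategies` module imported above is not included in this diff. A minimal sketch of what it might contain, assuming an 'incremental' strategy that adds or removes two nodes per scaling event (the step size of 2 is inferred from the test's `original_node_count + 2` assertion; the real module may differ):

```python
# Hypothetical sketch of strategies.py, inferred from how autoscaler.py
# uses it. The names UPSCALE_STRATEGIES, DOWNSCALE_STRATEGIES, and the
# 'incremental' key come from the calling code; the step size is assumed.


def _incremental_upscale(node_count):
    """Add two nodes per scaling event."""
    return node_count + 2


def _incremental_downscale(node_count):
    """Remove two nodes, respecting the 3-node minimum."""
    return max(3, node_count - 2)


UPSCALE_STRATEGIES = {'incremental': _incremental_upscale}
DOWNSCALE_STRATEGIES = {'incremental': _incremental_downscale}
```

Keeping the strategies as dicts of pure functions makes it easy to add, say, a 'proportional' strategy later without touching the scaling code.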


def main(
        bigtable_instance,
        high_cpu_threshold,
        low_cpu_threshold,
        short_sleep,
        long_sleep):
    """Main loop runner that autoscales Bigtable.

    Args:
        bigtable_instance (str): Cloud Bigtable instance ID to autoscale
        high_cpu_threshold (float): If CPU is higher than this, scale up.
        low_cpu_threshold (float): If CPU is lower than this, scale down.
        short_sleep (int): How long to sleep after no operation
        long_sleep (int): How long to sleep after the cluster nodes are
            changed
    """
    cluster_cpu = get_cpu_load()
    print('Detected CPU load of {}'.format(cluster_cpu))
    if cluster_cpu > high_cpu_threshold:
        scale_bigtable(bigtable_instance, True)
        time.sleep(long_sleep)
    elif cluster_cpu < low_cpu_threshold:
        scale_bigtable(bigtable_instance, False)
        time.sleep(short_sleep)
    else:
        print('CPU within threshold, sleeping.')
        time.sleep(short_sleep)
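
The threshold comparison in `main` can be distilled into a pure function, which makes the decision logic easy to unit test without mocks (this helper is illustrative only, not part of the sample):

```python
def scaling_decision(cpu_load, high_cpu_threshold, low_cpu_threshold):
    """Mirrors the branch logic in main(): returns the action to take."""
    if cpu_load > high_cpu_threshold:
        return 'scale_up'
    elif cpu_load < low_cpu_threshold:
        return 'scale_down'
    return 'no_op'


# With the sample's defaults (high=0.6, low=0.2):
print(scaling_decision(0.7, 0.6, 0.2))  # scale_up
print(scaling_decision(0.1, 0.6, 0.2))  # scale_down
print(scaling_decision(0.4, 0.6, 0.2))  # no_op
```

Note that loads exactly equal to a threshold fall through to `no_op`, matching the strict `>` and `<` comparisons in the sample.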

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Scales Bigtable clusters based on CPU usage.')
    parser.add_argument(
        'bigtable_instance', help='ID of the Cloud Bigtable instance to '
        'connect to.')
    parser.add_argument(
        '--high_cpu_threshold',
        help='If Bigtable CPU usage is above this threshold, scale up',
        default=0.6)
    parser.add_argument(
        '--low_cpu_threshold',
        help='If Bigtable CPU usage is below this threshold, scale down',
        default=0.2)
    parser.add_argument(
        '--short_sleep',
        help='How long to sleep in seconds between checking metrics after no '
        'scale operation',
        default=60)
    parser.add_argument(
        '--long_sleep',
        help='How long to sleep in seconds between checking metrics after a '
        'scaling operation',
        default=60 * 10)
    args = parser.parse_args()

    while True:

[Reviewer] Move the while to main.
[Author] I had it in main, but I moved it out here to make it easier to test
main. Otherwise, testing "while True" loops is annoying; I would probably
have to make a bunch of changes just to test the code, so this seemed like
the simpler option.
[Reviewer] The tests shouldn't influence flow control. We can figure out how
to break out of the loop, likely by using mock to insert a keyboard
interrupt.

        main(
            args.bigtable_instance,
            float(args.high_cpu_threshold),
            float(args.low_cpu_threshold),
            int(args.short_sleep),
            int(args.long_sleep))
83 changes: 83 additions & 0 deletions bigtable/autoscaler/autoscaler_test.py
@@ -0,0 +1,83 @@
# Copyright 2017 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""Unit and system tests for autoscaler.py"""

import os
import time

from google.cloud import bigtable
from mock import patch

from autoscaler import get_cpu_load
from autoscaler import main
from autoscaler import scale_bigtable

BIGTABLE_INSTANCE = os.environ['BIGTABLE_CLUSTER']

# System tests to verify API calls succeed


def test_get_cpu_load():
    assert get_cpu_load() > 0.0


def test_scale_bigtable():
    bigtable_client = bigtable.Client(admin=True)
    instance = bigtable_client.instance(BIGTABLE_INSTANCE)
    instance.reload()

    cluster = instance.cluster('{}-cluster'.format(BIGTABLE_INSTANCE))
    cluster.reload()
    original_node_count = cluster.serve_nodes

    scale_bigtable(BIGTABLE_INSTANCE, True)

    time.sleep(3)
    cluster.reload()

    new_node_count = cluster.serve_nodes
    assert new_node_count == (original_node_count + 2)

    scale_bigtable(BIGTABLE_INSTANCE, False)
    time.sleep(3)
    cluster.reload()
    final_node_count = cluster.serve_nodes
    assert final_node_count == original_node_count


# Unit test for logic

@patch('time.sleep')
@patch('autoscaler.get_cpu_load')
@patch('autoscaler.scale_bigtable')
def test_main(scale_bigtable, get_cpu_load, sleep):
    SHORT_SLEEP = 5
    LONG_SLEEP = 10
    get_cpu_load.return_value = 0.5

    main(BIGTABLE_INSTANCE, 0.6, 0.3, SHORT_SLEEP, LONG_SLEEP)
    scale_bigtable.assert_not_called()
    scale_bigtable.reset_mock()

    get_cpu_load.return_value = 0.7
    main(BIGTABLE_INSTANCE, 0.6, 0.3, SHORT_SLEEP, LONG_SLEEP)
    scale_bigtable.assert_called_once_with(BIGTABLE_INSTANCE, True)
    scale_bigtable.reset_mock()

    get_cpu_load.return_value = 0.2
    main(BIGTABLE_INSTANCE, 0.6, 0.3, SHORT_SLEEP, LONG_SLEEP)
    scale_bigtable.assert_called_once_with(BIGTABLE_INSTANCE, False)
    scale_bigtable.reset_mock()
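
The review thread in autoscaler.py discusses how to test the module-level `while True:` loop. One approach the reviewer suggests is having a mocked `time.sleep` raise `KeyboardInterrupt` so the loop ends after a known number of iterations. A self-contained sketch of that pattern (the `run_autoscaler_loop` helper is a stand-in, not the sample's actual code):

```python
import time
from unittest import mock


def run_autoscaler_loop(iteration_log):
    """Stand-in for the sample's module-level `while True: main(...)` loop."""
    while True:
        iteration_log.append('tick')  # where main(...) would be called
        time.sleep(60)


# The third sleep raises KeyboardInterrupt, ending the "infinite" loop
# deterministically after exactly three iterations.
iterations = []
with mock.patch('time.sleep', side_effect=[None, None, KeyboardInterrupt]):
    try:
        run_autoscaler_loop(iterations)
    except KeyboardInterrupt:
        pass

print(len(iterations))  # 3
```

`mock.patch` with a `side_effect` list returns each element in turn and raises any element that is an exception, which is what terminates the loop here.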
2 changes: 2 additions & 0 deletions bigtable/autoscaler/requirements.txt
@@ -0,0 +1,2 @@
google-cloud-bigtable==0.23.1
google-cloud-monitoring==0.24.0