Releases: octue/octue-sdk-python

Release/0.1.15

26 Apr 14:04
258f568

Contents

Fixes

  • Add from_string option to Serialisable.deserialise

Testing

  • Mock Google Pub/Sub Service, Topic, Subscription, Publisher and Subscriber in tests
  • Remove unneeded cleanup code from Service tests

Release/0.1.14

23 Apr 16:56
61fa92f

Contents

Breaking changes

  • Remove TagSet.__str__

Fixes

  • Use TagSet to deserialise tags in Datafile.from_cloud
  • Add custom (de)serialise methods to TagSet
  • Return subtags of a Tag in order using a FilterList
  • Remove separate dependencies copy/cache steps in Google Cloud Run Dockerfile so that it works for older versions of docker

Minor improvements

  • Remove absolute path from Dataset and Manifest serialisation
  • Add Serialisable.deserialise method
  • Add filter method to TagSet to avoid e.g. taggable.tags.tags.filter

Operations

  • Improve description of release workflow

Release/0.1.13

21 Apr 12:35
eb0817b

Contents

New features

  • Support setup.py and requirements-dev.txt in Cloud Run Dockerfile
  • Retrieve credentials from Google Cloud Secret Manager and inject into environment in Runner.run
  • Add ability to retrieve and update cloud files via the Datafile.download or Datafile.open methods
  • Allow cloud file attributes to be updated via Datafile.to_cloud method
  • Allow instantiation of TagSets from JSON-encoded lists

Breaking changes

  • Raise error if the datasets of the input manifest passed to Service.ask aren't all cloud-based

Fixes

  • Fix Dataset construction from serialised form in Manifest
  • Fix Datafile construction from serialised form in Dataset
  • Fix Datafile.deserialise
  • Adjust usages of tempfile.NamedTemporaryFile to also work on Windows
  • Add timeout and retry to Service.answer
  • Add retry to Service.wait_for_answer
  • Add 60 second timeout for answering question in Cloud Run deployment
  • Use correct environment variable for service ID in Cloud Run Dockerfile
  • Set _last_modified, size_bytes, and _hash_value to null values if a Datafile representing a cloud file is instantiated for a hypothetical cloud location (i.e. not synced to a cloud file at that point in time)
  • Allow Dataset.get_file_sequence use with no filter
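The release notes do not say how the tempfile.NamedTemporaryFile usages were adjusted. One common cross-platform pattern, shown here as a sketch only, is to create the file with delete=False, close it, and then reopen it by name, because Windows does not allow a still-open NamedTemporaryFile to be opened a second time. The helper name write_then_read is hypothetical and is not part of the SDK.

```python
import os
import tempfile


def write_then_read(text):
    """Write text to a temporary file, then read it back by name.

    Hypothetical helper for illustration; not part of octue-sdk-python.
    """
    # delete=False keeps the file on disk after it is closed, so it can be
    # reopened by name on Windows as well as on POSIX systems.
    temporary_file = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
    try:
        temporary_file.write(text)
        temporary_file.close()

        with open(temporary_file.name) as f:
            return f.read()
    finally:
        # Closing twice is harmless; the file must be removed manually
        # because automatic deletion was disabled.
        temporary_file.close()
        os.remove(temporary_file.name)
```

The cost of this pattern is that cleanup becomes the caller's responsibility, hence the finally block.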

Dependencies

  • Use new twined version that supports validation of credentials strand
  • Use newest version of gcp-storage-emulator

Minor improvements

  • Make path a positional argument of Datafile
  • Move gunicorn requirement into octue requirements
  • Raise warning instead of error if Google Cloud credentials environment variable is not found and return None as credentials
  • Move cloud code into new cloud subpackage
  • Raise TimeoutError in Service.wait_for_answer if no response is received by end of retries
  • Only look for deployment_configuration.json file in docker container /app directory
  • Ensure deployment_configuration.json file is always loaded correctly in docker container
  • Pass credentials strand into Runner instance in Cloud Run deployment
  • Add name attribute to Identifiable mixin
  • Add Google Cloud metadata to Datafile serialisation
  • Add deserialise method to Datafile
  • Add ability to add metadata to a Datafile instantiated from a regular cloud file
  • Use CRC32C hash value from Google Cloud when instantiating a Datafile from the cloud
  • Add ability to name Datafiles
  • Add ability to check whether a Datafile, all Datafiles in a Dataset, or all Datasets in a Manifest are located in Google Cloud
  • Use Datafile.deserialise when instantiating a Dataset from a dictionary
  • Add representation to GCPPubSubBackend
  • Load credentials strand JSON in Runner initialisation
  • Add location searched to message of error raised when app module can't be found in Runner.run
  • Ignore E203 flake8 warning
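The behaviour described for Service.wait_for_answer (retry, then raise TimeoutError if no response arrives) can be sketched generically as follows. This is not the SDK's implementation: the function name wait_with_retries, the poll callable, and the retries and delay parameters are all assumptions made for illustration.

```python
import time


def wait_with_retries(poll, retries=3, delay=0.1):
    """Call poll() up to `retries` times and return its first non-None result.

    Hypothetical helper; `poll` stands in for whatever checks for an answer.
    """
    for _ in range(retries):
        response = poll()
        if response is not None:
            return response
        # Wait before the next attempt.
        time.sleep(delay)

    # No attempt produced a response, so signal a timeout to the caller.
    raise TimeoutError(f"No response received after {retries} attempts.")
```

Raising the built-in TimeoutError lets callers distinguish "no answer in time" from other failures without importing a custom exception.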

Testing

  • Remove subjective Service test test_serve_with_timeout
  • Use temporary file rather than temporary directory for tests where possible
  • Test Dataset.deserialise

Quality Checklist

  • New features are fully tested (No matter how much Coverage Karma you have)

Coverage Karma

  • If your PR decreases test coverage, do you feel you have built enough Coverage Karma* to justify it?

Release/0.1.12

26 Mar 21:02
70b4ec3

Contents

New Features

  • Add Google Cloud Run deployment for services

Breaking changes

  • Move most parameters from Runner.run to Runner.__init__ (this avoids the need for partial functions)
  • Split Service.answer into two methods
  • Return question UUID from Service.ask

Minor fixes and improvements

  • Use CRC32C hash function instead of Blake3 (due to extra requirements of Blake3 and the fact that Google Cloud uses CRC32C)
  • Use default Google credentials in Pub/Sub service if GCPPubSubBackend.credentials_environment_variable is None
  • Add representations to Topic and Subscription
  • Ensure all topic/subscription names start with their provided namespace (and ensure the namespace appears only once)
  • Give Services a random UUID as an ID if none is provided
  • Give GCPPubSubBackend a default value for the credentials environment variable
  • Ensure GCP Storage paths always have the correct path separator
  • Fix other Windows path issues
  • Remove unused copy_template function
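For readers unfamiliar with CRC32C (the Castagnoli variant of CRC-32), the following is a minimal pure-Python sketch of the checksum. It is for illustration only: the release notes do not say which implementation the SDK uses, and a real project would normally rely on an optimised library. The function names are hypothetical. The base64 helper assumes the encoding Google Cloud Storage uses for CRC32C metadata, i.e. the 4 checksum bytes in big-endian order, base64-encoded.

```python
import base64


def _build_table():
    """Build the 256-entry lookup table for the reflected CRC32C polynomial."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data):
    """Return the CRC32C checksum of `data` (bytes) as an unsigned integer."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def crc32c_base64(data):
    """Return the checksum as base64 of its 4 big-endian bytes."""
    return base64.b64encode(crc32c(data).to_bytes(4, "big")).decode()
```

The standard check value for this algorithm is 0xE3069283 for the input b"123456789", which is a quick way to validate any implementation.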

Testing

  • Add automated testing for Windows and macOS (in addition to Ubuntu)
  • Use tox for cross-platform testing
  • Use sys.executable instead of python in subprocess.Popen calls to ensure the virtual environment's python executable is used
  • Ensure test paths are agnostic of operating system
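The sys.executable change above can be illustrated with a short sketch. Passing the bare string "python" to subprocess.Popen resolves whichever interpreter is first on PATH, which may not be the one in the active virtual environment; sys.executable is the absolute path of the interpreter currently running. The helper name run_with_current_interpreter is hypothetical.

```python
import subprocess
import sys


def run_with_current_interpreter(code):
    """Run a snippet of Python code in a child process that uses the same
    interpreter as the parent. Hypothetical helper for illustration."""
    process = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        text=True,
    )
    output, _ = process.communicate()
    return process.returncode, output.strip()
```

For example, run_with_current_interpreter("import sys; print(sys.executable)") reports the interpreter path that the child process actually used.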

Release/0.1.11

15 Mar 13:01
e4da82e

Contents

Minor fixes and improvements

  • Remove test bucket environment variable
  • Remove environment variable default argument from GoogleCloudStorageEmulator
  • Add installation, usage, and testing instructions to README

Testing

  • Test ability to start more than one Google Cloud Storage emulator at once

Release: 0.1.10

12 Mar 20:18
6b9e4b2

Contents

New Features

  • Move Google Cloud Storage emulator into octue package, making it importable

Minor fixes and improvements

  • Allow storage emulator to find and use a free port
  • Remove need for STORAGE_EMULATOR_HOST environment variable for tests
  • Avoid assuming custom metadata is set in storage client
  • Move unittest.TestResult method replacements into Google Cloud emulators module
  • Remove tox from CI tests, using just GitHub actions instead
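The notes do not describe how the storage emulator finds a free port. A common standard-library approach, given here as an assumption rather than the SDK's actual method, is to bind a socket to port 0 and let the operating system pick an unused port. The function name find_free_port is hypothetical.

```python
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the operating system for a currently unused TCP port on `host`.

    Hypothetical helper; not taken from octue-sdk-python.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 tells the OS to choose any free port.
        s.bind((host, 0))
        return s.getsockname()[1]
```

Note that the port is released when the socket closes, so another process could claim it before the emulator binds to it. This is usually acceptable for tests but is not a guarantee.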

Add Google Cloud Storage support; deprecate python < 3.8

10 Mar 16:19
aa00826

Contents

New Features

  • Add GoogleCloudStorageClient
  • Write manifest, its datasets, and its datafiles to cloud in Analysis.finalise (#96)
  • Closes #84 - add auto tag and release workflow
  • Allow Google Cloud storage blobs to be represented by Pathable
  • Add Datafile, Dataset, and Manifest to_cloud and from_cloud methods
  • Allow regular GCP files to be represented as Datafiles

Minor fixes and improvements

  • Add cloud storage emulator once for all tests
  • Add disk usage and file age utilities
  • Allow Datasets to have custom names
  • Add storage.path module akin to os.path but for Google Cloud Storage paths
  • Allow Hashables' hash values to be set
  • Pass GCP project and bucket names to tests from environment (#93)
  • Add ability to delete topic and subscription when a Service has finished serving
  • Facilitate graceful exit for serving Services on KeyboardInterrupt
  • Use latest versions of flake8, isort, and black in pre-commit and across all files (#87)
  • Fix CI test skipping flag
  • Fix documentation links (#92)

Breaking changes

  • Remove testing and explicit support for python3.6 and python3.7
  • Remove base_from from Pathable and replace with more transparent method
  • Rename Datafile.posix_timestamp to Datafile.timestamp and remove default value
  • Make Datafile.last_modified private
  • Rename persistence subpackage to storage

Testing

  • Test that children can question their own children as part of answering a question
  • Close #94 - delete topics and subscriptions at the end of each test
  • Remove timeouts from tests and replace with thread executor shutdown upon test pass, meaning that tests that connect to Google Pub/Sub won't fail because the connection is slower than expected

Child services, documentation, easier logging, and CI

01 Feb 15:06
7bc5f01

Contents

New Features

  • Enable use of child services - solving #46.
    • To solve #57 we need to be able to define and run local children as well as remote ones.
    • This means we must allow multiple services to run locally and independently...
    • Which probably means we can also solve octue/twined-server#2 at the same time
  • Enable Documentation Build and Serve, Update README #70
    • Ultimately we wish to unify documentation between twined and octue-sdk-python, but this is best done when refactoring large chunks of octue-sdk-python into twined (see #69). For now, we just want to serve what we have so that we can at least link to it.
  • Add option to handle developer logs separately from Scientist logs (#78)
  • Allow skipping of CI tests if #skip_ci_tests is in the commit body. The use case is to avoid unnecessary computation when you know the tests will fail for a commit but still want to commit.
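The #skip_ci_tests check is presumably implemented in the CI workflow configuration, which the notes do not show. The logic itself can be sketched in Python as below, assuming that "commit body" means everything after the first line of the commit message. The function name should_skip_ci_tests is hypothetical.

```python
SKIP_FLAG = "#skip_ci_tests"


def should_skip_ci_tests(commit_message):
    """Return True if the skip flag appears in the body of a commit message.

    Assumes the body is everything after the first (subject) line.
    """
    _subject, _, body = commit_message.partition("\n")
    return SKIP_FLAG in body
```

Under this assumption, a flag placed only in the subject line would not skip the tests.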

Minor fixes and improvements

  • Implement a proper issue template, either derived from .github repo or applied directly (cf. octue/twined#60)
  • Close #32 - stop CLI tests leaving output files in working area.

Patch release: Add filtering capability and data input hashing

05 Jan 12:36
5bcf2b7

New Features

  • Add Hashable mixin to create data hashes of Datafiles and their containers, including metadata
  • Add Filterable mixin to make Datafiles in a Dataset and Tags in a TagSet filterable
  • Add FilterSet class for storing and facilitating filtering of Datafiles in Dataset and Tags in TagSet
  • Add FilterList for the output of ordering FilterSets
  • Add a plethora of type- and interface-based filters for use on FilterSets and FilterLists

Breaking changes

  • Any sha256 properties have been replaced with blake3_hash properties
  • Replace TagSet.has_tag with native __contains__ method
  • Rename TagSet.starts_with to TagSet.any_tag_starts_with
  • Rename TagSet.ends_with to TagSet.any_tag_ends_with

Minor fixes and improvements

  • Replace SHA256 hashing with BLAKE3 hashing - this is reportedly around 10 times as fast!
  • Neaten up #40/#42 by combining the two workflow files while still keeping separate checks on GitHub. This also makes the version check display less verbosely on GitHub
  • Add Tag class, neatening up the retrieval and filtering of subtags
  • Rename TagGroup to TagSet to reflect that it's set-based

Version consistency checks and Output Manifest fixes

18 Dec 12:56
0dd42ec

Contents

New Features

  • Closes #40 - Add GitHub setup.py/branch version consistency check
  • Codecov - README badge points to correct base branch

Minor fixes and improvements

  • Closes #45 - Analysis attributes are None when they shouldn't be