Releases: octue/octue-sdk-python
Speed up event replaying
Contents (#669)
Enhancements
- Skip non-result event validation if only result is required
- Add ability to skip event validation in event handlers
- Make diagnostics log messages more consistent
- Allow instantiation of `Diagnostics`, `Topic`, `Subscription`, and `GoogleCloudPubSubEventHandler` without cloud credentials
Refactoring
- Update from deprecated `datetime.datetime.utcnow` method
- Use `cached_property` in `Service`
- Remove unused attributes on `MockService` and `Runner`
Testing
- Implement `MockSubscription.delete`
Update child emulator and improve manifest dataset download
Contents (#668)
IMPORTANT: There are 2 breaking changes.
Enhancements
- 💥 BREAKING CHANGE: Update `ChildEmulator` to use `EventReplayer`, support schema-compliant events and attributes, and support heartbeats and delivery acknowledgement events. This significantly simplifies the emulator
- 💥 BREAKING CHANGE: Remove `ChildEmulator.from_file`
- Download manifest datasets to same directory by default
Refactoring
- Move `ServicePatcher` into its own module
Upgrade instructions
💥 Update `ChildEmulator` to use `EventReplayer` and full events
Give events (including attributes) that satisfy the service communication schema to child emulators.
💥 Remove `ChildEmulator.from_file`
Load the JSON file separately and pass the events into the `ChildEmulator` constructor.
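As a minimal sketch of that replacement, the events can be read with the standard `json` module and then handed to the constructor. The file name, the event contents, and the `events` keyword are assumptions for illustration, and `ChildEmulatorStandIn` is a hypothetical placeholder so the snippet runs without `octue` installed; check the SDK documentation for the real constructor signature.

```python
import json
import tempfile
from pathlib import Path


class ChildEmulatorStandIn:
    """Hypothetical placeholder for octue's ChildEmulator (not the real class)."""

    def __init__(self, events):
        self.events = events


def load_events(path):
    """Load a list of events from a JSON file."""
    with open(path) as f:
        return json.load(f)


# Write an illustrative events file; the event contents here are made up.
events_path = Path(tempfile.mkdtemp()) / "events.json"
events_path.write_text(json.dumps([{"event": {"kind": "result"}, "attributes": {}}]))

# Previously: ChildEmulator.from_file(path). Now: load the file, then construct.
events = load_events(events_path)
emulator = ChildEmulatorStandIn(events=events)
```

If your existing files wrap the events in an outer object, extract the list of events before passing it on.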
Enable question chaining
Summary
This release makes major improvements to event handling and question auditing. Some of the main changes are:
- Questions are now automatically associated with their parent question and the question that originated them, however deep they are in a question tree
- Events are ordered by datetime by the event backend, not the SDK
- Better feedback is provided when asking questions in parallel
- You can specify the event store to use
- Log message contexts have been slimmed down without losing any information, and events are replayable with no context (good for smaller screens)
- Various public classes and functions are faster and easier to use
- Question retries have the same question UUID
Contents (#660)
IMPORTANT: There are 6 breaking changes.
New features
Events
- 💥 BREAKING CHANGE: Add `parent_question_uuid`, `originator_question_uuid`, `originator` and `retry_count` event attributes
- Avoid redelivery of questions by checking the event store on delivery
Event handlers
- Add ability to not include service metadata in logs in event handlers
- Enable `EventReplayer` to handle question events
- Add `RegisteredTemporaryDirectory` class, use it when downloading datasets, and add ability to delete them at end of analysis
Enhancements
Resources
- 💥 BREAKING CHANGE: Make datasets recursive by default in `Dataset`
- Log a warning if a dataset is empty at instantiation
Services
- 💥 BREAKING CHANGE: Remove `name` argument from `Service` and provide an SRUID to `Child` internal service instead of a name
- Improve logging of errors, retries, and threading in `Child.ask_multiple`
- Order pub/sub messages by datetime using ordering key and remove `order` event attribute
- Set question UUIDs in advance in `Child.ask_multiple`
Subscriptions
- Allow existing subscriptions in `create_push_subscription`
- Give feedback on (un)successful push subscription creation in CLI
Questions and events
- Remove unnecessary `sender` argument from `get_events` and make getting the tail of events the default
- Allow retried questions to have the same UUID
- Allow explicit question retries by using `retry_count` attribute
- Return empty list from `get_events` if no events for question
Service configuration
- Allow setting of event store table ID and `delete_local_files` in service configuration
- Use envvar to specify service configuration location by default
- Add `overrides` option to `Runner.from_configuration`
Other
- Log warning when `PYTHONUNBUFFERED` envvar is unset
- Remove "analysis-" from start of question UUIDs in log context
Fixes
- 💥 BREAKING CHANGE: Return question UUID alongside error from `Child.ask_multiple` for failed questions
- Set analysis ID at start of `Runner.run`
- Emit correct logs when no diagnostics available with `octue get-diagnostics`
- Fix deserialisation of events in `get_events`
- Use (meta-)generation agnostic retry strategy with cloud storage
- Return correct question UUIDs with failed questions from `Child.ask_multiple`
- Avoid logging that app failed when it didn't when uploading diagnostics
- Allow setting of `max_workers` when CPU count is indeterminate
- Disable `delete_local_files` by default
Operations
- Update event handler and its bigquery table
Dependencies
- Loosen `Sphinx` and other docs package ranges
- Remove unneeded `db-dtypes` package
- Make `google-cloud-bigquery` a mandatory dependency
- Upgrade `google-cloud-secret-manager`
Refactoring
Event handlers
- 💥 BREAKING CHANGE: Remove redundant datetime from delivery ack and heartbeat events
- 💥 BREAKING CHANGE: Rename `originator` event attribute to `parent`
- Factor out finalising and cleaning up in `Runner`
- Move service accounts into separate terraform file
- Cache metadata against datafile/dataset instead of path
- Rename python3.9 dockerfile to reflect its python version
Upgrade instructions
- Add `recursive=False` to `Dataset` instantiations
- Update all services in your service network to use `octue>=0.56.0`
- Use version `0.6.1` of the event handler or above and a correspondingly up-to-date BigQuery table.
- Swap the `internal_service_name` argument for the `internal_sruid` argument to `Child.__init__` and provide a valid SRUID
- Instances of `Service` can no longer be given names. Please give them a valid SRUID instead.
- To get the unraised exception from a failed answer returned by `Child.ask_multiple`, access the zeroth element of that answer e.g. if the question at index 3 failed: `answers = Child.ask_multiple(*questions)` followed by `exception, question_uuid = answers[3]`
- `Service.received_events`, `AbstractEventHandler.handled_events`, and `Child.received_events` now include event attributes instead of just the event. These attributes/properties now return a list of dictionaries with the keys {"event", "attributes"}, where what was previously returned is now mapped to the "event" key.
- Stop providing the `recipient` argument to `EventReplayer` and `GoogleCloudPubSubEventHandler` - it's now automatically acquired from each event's attributes
- Stop passing the `skip_missing_events_after` argument to `EventReplayer` and `GoogleCloudPubSubEventHandler`
- Stop using the `awaiting_missing_event` and `time_since_missing_event` properties on the event handlers
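The two data-shape changes in these instructions can be sketched with plain Python and no `octue` import. The sample records, answers and UUID strings below are made up for illustration, and the helper names `extract_events` and `split_failed_answers` are hypothetical (they are not part of the SDK); only the `{"event", "attributes"}` keys and the `(exception, question_uuid)` pairing come from the notes above.

```python
def extract_events(records):
    """Recover the previously returned values from the new record shape."""
    return [record["event"] for record in records]


def split_failed_answers(answers):
    """Separate (exception, question_uuid) pairs from all other answers."""
    succeeded, failed = [], []
    for answer in answers:
        is_pair = isinstance(answer, tuple) and len(answer) == 2
        if is_pair and isinstance(answer[0], Exception):
            failed.append(answer)
        else:
            succeeded.append(answer)
    return succeeded, failed


# Made-up records in the new shape returned by e.g. `Child.received_events`.
received_events = [
    {"event": {"kind": "heartbeat"}, "attributes": {"question_uuid": "q-1"}},
    {"event": {"kind": "result"}, "attributes": {"question_uuid": "q-1"}},
]
bare_events = extract_events(received_events)

# Made-up answers: the second question failed and carries its question UUID.
answers = [{"output_values": 1}, (ValueError("bad input"), "q-2")]
succeeded, failed = split_failed_answers(answers)
exception, question_uuid = failed[0]
```

Code that previously iterated over `received_events` directly would need a step like `extract_events` to keep working unchanged.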
Use updated `twined`
Contents (#650)
IMPORTANT: There is 1 breaking change.
Enhancements
- Include received invalid data in flask app error messages
- Allow any iterable for `Dataset` `files` argument
Fixes
- Ensure `order` argument is given in `Service.send_exception`
- Add workaround for apparent bug in getting local metadata file's absolute path
- Remove (now-) unnecessary json decoding in `get_events`
Dependencies
- 💥 BREAKING CHANGE: Drop support for python3.7
- Use `twined==0.5.5`
- Update to `black==24.4.2`
Testing
- Remove unnecessary test
- Update asynchronous deployment test to accept numpy array as output values
Upgrade instructions
💥 Drop support for python3.7
Upgrade to python>=3.8 to keep using octue.
Use twined version 0.5.5 to unpin jsonschema package
CHO: Add inter-service compatibility metadata skipci
Improve async event retrieval workflow
Contents (#647)
IMPORTANT: There are 2 breaking changes.
Enhancements
- 💥 BREAKING CHANGE: Return question UUID from `Child.ask`
- Deserialise manifests from events in `get_event`
- Raise error if no events found when calling `get_events`
Fixes
- Use correct base image for `python3.11` dockerfile
- Return schema-compliant events and attributes from `get_events`
Operations
- Import missing APIs into terraform config
- Deploy version `0.5.0` of event handler cloud function and update event store schema
- Update `actions/setup-python` to version 5
Dependencies
- 💥 BREAKING CHANGE: Make `db-dtypes` and `google-cloud-bigquery` optional
- Upgrade `gunicorn` to avoid vulnerability
- Loosen `numpy` dependency
Testing
- Test retrieving results from real asynchronous question
- Run tests with `python3.10` (`python3.9` isn't available on `macos-latest` for `arm64`)
Other
- Add DOI badge to readme
Upgrade instructions
💥 Return question UUID from `Child.ask`
Instead of writing `answer = Child.ask(...)`, write `answer, question_uuid = Child.ask(...)` (and the same for `ChildEmulator`)
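The change at the call site is only the tuple unpacking. The sketch below uses `ChildStandIn`, a hypothetical placeholder that mimics the new two-value return so the snippet runs without `octue` or cloud credentials; the `input_values` argument and the shape of the answer dictionary are illustrative assumptions, not the SDK's documented behaviour.

```python
import uuid


class ChildStandIn:
    """Hypothetical placeholder mimicking the new return shape of `Child.ask`."""

    def ask(self, input_values):
        # Illustrative answer and a freshly generated question UUID.
        answer = {"output_values": input_values, "output_manifest": None}
        return answer, str(uuid.uuid4())


child = ChildStandIn()

# Before: answer = child.ask(...)
# After: unpack both the answer and the question UUID.
answer, question_uuid = child.ask(input_values={"height": 3})
```

Keeping the question UUID is what makes it possible to look the question's events up again later.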
💥 Make `db-dtypes` and `google-cloud-bigquery` optional
To keep using the `get_events` function, add the `bigquery` optional extra to your installation command e.g. `poetry install -E bigquery`.
Switch to event-driven infrastructure and improve support for asynchronous questions
Summary
This pull request:
- Makes the SDK fully event-driven by using a single topic to emit/consume events
- Substantially refactors the event handler to facilitate asynchronous event retrieval
- Adds the ability to get and replay events from a BigQuery store
Contents (#632)
IMPORTANT: There are 4 breaking changes.
New features
- 💥 BREAKING CHANGE: Use single topic per workspace (#639)
- Add `get_events` function for retrieving events asynchronously from BigQuery
- Add `EventReplayer` class to replay asynchronously-retrieved events
- Add `Manifest.download` method
Enhancements
- Get subscription project name from topic by default
- Improve asking of asynchronous questions via `Child.ask`
- Return download path from `Dataset.download`
- Include question UUID in delivery acknowledgement log message
- Improve handling of invalid events
- Add `datetime` and `uuid` attributes to all events
Fixes
- Await successful publishing of question messages
- Fix `api_access_endpoint` usage in `mock_generate_signed_url`
Operations
- Add test BigQuery dataset, cloud function, and IAM roles to terraform config
- Switch to reusable workflows where possible
Dependencies
- Add `google-cloud-bigquery`
- Upgrade `coolname`
- Add `db-dtypes` for converting bigquery rows to dataframes
Refactoring
- 💥 BREAKING CHANGE: Rename `x.received_messages` to `x.received_events`
- 💥 BREAKING CHANGE: Rename `record_messages` parameters to `record_events`
- 💥 BREAKING CHANGE: Update `ChildEmulator` to use `event*` instead of `message*`
- Factor out making minimal dictionary
- Factor out creating push subscription
- Factor out emitting question event in `Service.ask`
- Factor out event handlers and related logic from `OrderedMessageHandler` into new `AbstractEventHandler`
- Move `validation` module into `octue.cloud.events` subpackage
- Rename `OrderedMessageHandler` to `GoogleCloudPubSubEventHandler`
- Rename "message" to "event" in event handler classes
- Rename `GooglePubSubHandler` to `GoogleCloudPubSubHandler`
Chores
- Update licence year to 2024
Testing
- Simplify various tests
Upgrade instructions
- Update all services in your services network to this version of `octue` or later (`0.53.0`+)
- Replace any usages of the `received_messages` methods with `received_events`
- Replace any usages of the `record_messages` parameters with `record_events`
- Replace the word `message` with `event` in usages of `ChildEmulator` methods (apart from in the case of `monitor_message`)
Warn about messages with duplicate message numbers
Make event handling faster and resilient to missing events
Contents (#625)
IMPORTANT: There is 1 breaking change.
Enhancements
- Allow setting of maximum number of workers for parallel questions in `Child.ask_multiple`
- Pull up to 50 messages from answer subscriptions at once instead of 1
- Allow skipping of any missing message after a 10s delay in `OrderedMessageHandler`
- Suppress name/namespace override warning if the value is the same in the environment and service configuration file
- Speed up event validation by caching service communication JSON schema
- 💥 BREAKING CHANGE: Extract SRUID for child logs context from subscription in message handler
Fixes
- Exit early from message pulling if heartbeat check fails
- Make `Manifest.update_dataset_paths` method thread-safe
Refactoring
- Factor out multiple checks of package version in message handler
Testing
- Improve message handling tests by not mocking `_pull_and_enqueue_available_messages` method and removing `MockMessagePuller`
Upgrade instructions
💥 Extract SRUID for child logs from subscription in message handler
This removes the `service_name` argument from `Service.wait_for_answer`. If you were using this argument, simply remove it; logs from children shown in a parent will now have the full and correct SRUID automatically.
Publish answers to question topic
Summary
This pull request removes the use of answer topics by publishing answer messages to the service revision (formerly known as question) topic and filtering subscriptions to only receive a) questions or b) response messages to a specific question. This speeds up the question asking process, reduces cloud infrastructure requirements and the permissions surface, and allows us to avoid topic number limits.
Also added is validation of all messages and their attributes against a new publicly available schema. This ensures services are communicating as they should and opens up the possibility of writing services in other languages and creating emulators.
As this, by itself, constitutes an inter-service communication breaking change, we've taken the opportunity to reduce the complexity of the codebase by removing backwards compatibility patches for service communication (i.e. we've grouped multiple breaking changes together into one).
Contents (#603)
IMPORTANT: There are 7 breaking changes.
New features
- Validate messages and their attributes against new service communication schema (see #614 for changelog - it was merged into this branch)
- Allow diagnostics (formerly known as crash diagnostics) to always be switched on for a service
Enhancements
- 💥 BREAKING CHANGE: Publish responses to questions to the service revision (question) topic instead of creating a separate answer topic
- 💥 BREAKING CHANGE: Store message number in message attributes instead of in message data
- 💥 BREAKING CHANGE: Remove question UUID from log record message body
- 💥 BREAKING CHANGE: Remove inter-service communication backwards compatibility code
- 💥 BREAKING CHANGE: Make input and output values and manifest optional
- 💥 BREAKING CHANGE: Replace boolean `allow_save_diagnostics_data_on_crash` argument with string/enum `save_diagnostics` argument in `Service.ask` and related methods
- Add ability to filter subscriptions
- Add question UUID attribute to all messages
- Send more possible errors to parent in `Service.answer`
- Add `kind` field to question messages
- Add `sender_type` attribute to all messages
- Add ability to instantiate `Runner` from service/app configurations
Fixes
- Stop double-JSON-encoding output manifests
Dependencies
- Update `octue` version in template apps' dependencies
Refactoring
- 💥 BREAKING CHANGE: Rename crash diagnostics to diagnostics
- Group message attributes in `Service._send_message` and `MockMessage` under explicit `attributes` argument
- Make `OrderedMessageHandler._waiting_messages` attribute public
- Rename various message attributes
Testing
- Store mock pub/sub messages against subscriptions instead of topics
- Add missing `type` field to emulated Pub/Sub questions
Operations
- Fix `add-issues-to-octue-board` workflow
- Stop automatically building docker images for registry in `release` workflow
- Add ReadTheDocs config file to fix documentation building
Upgrade instructions
💥 Update all Octue services in your network to use this version of octue so they're still able to communicate. Postpone upgrading until you can upgrade all services simultaneously.
💥 Replace `allow_save_diagnostics_data_on_crash` with `save_diagnostics` set to one of these values: "SAVE_DIAGNOSTICS_OFF", "SAVE_DIAGNOSTICS_ON_CRASH", or "SAVE_DIAGNOSTICS_ON"
💥 Crash diagnostics rename:
- Use the `octue get-diagnostics` CLI command instead of the `octue get-crash-diagnostics` command
- Rename `crash_diagnostics_cloud_path` in your service configurations to `diagnostics_cloud_path`
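For the `save_diagnostics` replacement described in these upgrade instructions, a small helper can translate existing boolean settings while you migrate call sites. This is a sketch only: the three string values come from the instructions above, but the mapping of `True` to "SAVE_DIAGNOSTICS_ON_CRASH" and `False` to "SAVE_DIAGNOSTICS_OFF" is an assumption about the closest equivalent behaviour, and both function names are hypothetical rather than part of the SDK.

```python
# The three values listed in the upgrade instructions.
SAVE_DIAGNOSTICS_VALUES = {
    "SAVE_DIAGNOSTICS_OFF",
    "SAVE_DIAGNOSTICS_ON_CRASH",
    "SAVE_DIAGNOSTICS_ON",
}


def migrate_save_diagnostics(allow_save_diagnostics_data_on_crash):
    """Translate the old boolean into a new string value (assumed mapping)."""
    if allow_save_diagnostics_data_on_crash:
        return "SAVE_DIAGNOSTICS_ON_CRASH"
    return "SAVE_DIAGNOSTICS_OFF"


def validate_save_diagnostics(value):
    """Raise early if a value is not one of the three listed strings."""
    if value not in SAVE_DIAGNOSTICS_VALUES:
        raise ValueError(f"Unknown save_diagnostics value: {value!r}")
    return value


save_diagnostics = validate_save_diagnostics(migrate_save_diagnostics(True))
```

Note that "SAVE_DIAGNOSTICS_ON" has no boolean equivalent, so it has to be chosen explicitly where diagnostics should always be saved.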