The MaskFill utility works with gridded data, applying a fill value in all pixels
outside of a provided shape. This utility is now available via a Harmony service.
The utility accepts GeoTIFFs, as well as HDF-5 and NetCDF-4 files that follow CF conventions.
MaskFill was developed using the Anaconda distribution of Python (https://www.anaconda.com/download) and a conda virtual environment, which simplifies dependency management. Run these commands to create a MaskFill conda virtual environment and install all the needed packages:
conda create --name maskfill --file conda_requirements.txt \
python=3.12 --channel conda-forge --override-channels
conda activate maskfill
pip install -r pip_requirements.txt
- Commit messages should use the ticket number as a prefix, e.g.:
DAS-123: Awesome feature description.
- Commit history should be squashed locally, to avoid minor commits (e.g.: fix typo, update README). This can be done via an interactive rebase, where N is the number of commits added during the feature development:
git rebase -i HEAD~N
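The commit message convention above can be checked mechanically. The following is a minimal sketch, not part of this repository: the function name has_ticket_prefix and the exact pattern (an upper-case project key, a hyphen, a ticket number, then a colon and a space) are assumptions based on the DAS-123 example.

```python
import re

# Assumed pattern, inferred from the "DAS-123: ..." example:
# upper-case project key, hyphen, ticket number, colon, space, then text.
TICKET_PREFIX = re.compile(r'^[A-Z]+-\d+: \S')


def has_ticket_prefix(message: str) -> bool:
    """Return True if the commit message starts with a ticket prefix.

    Hypothetical helper for illustration only.
    """
    return TICKET_PREFIX.match(message) is not None
```

A check like this could be run by hand against a commit subject line before pushing a branch.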
The Harmony version is a semantic version number (major.minor.patch), which
should be iterated every release. It is contained in the
docker/service_version.txt file. When making any update to the service code,
the version number in this file should be updated before making a pull request.
The general rules for iterating a semantic version number are:
- Major: When API changes are made to the service that are not backwards compatible.
- Minor: When functionality is added in a backwards compatible way.
- Patch: Used for backwards compatible bug fixes or performance improvements.
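As an illustration of these rules, the sketch below increments one part of a version string and resets the lower parts to zero, as semantic versioning requires. The helper name bump_version is hypothetical and is not part of the MaskFill code base.

```python
def bump_version(version: str, part: str) -> str:
    """Return the version string with the named part incremented.

    Hypothetical helper: `version` is a string such as '1.2.3' and
    `part` is one of 'major', 'minor' or 'patch'.
    """
    major, minor, patch = (int(piece) for piece in version.strip().split('.'))

    if part == 'major':
        # Backwards incompatible API change: reset minor and patch.
        return f'{major + 1}.0.0'
    if part == 'minor':
        # Backwards compatible new functionality: reset patch.
        return f'{major}.{minor + 1}.0'
    if part == 'patch':
        # Backwards compatible bug fix or performance improvement.
        return f'{major}.{minor}.{patch + 1}'

    raise ValueError(f'Unknown version part: {part}')
```

For example, a backwards compatible bug fix to version 1.2.3 would give bump_version('1.2.3', 'patch'), which returns '1.2.4'.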
When the Docker image is built, it will be tagged with the semantic version
number as stored in docker/service_version.txt.
The CICD for MaskFill is contained in GitHub workflows in the .github/workflows
directory:
- run_tests.yml - A reusable workflow that builds the service and test Docker images, then runs the Python unit test suite in an instance of the test Docker container.
- run_tests_on_pull_requests.yml - Triggered for all PRs against the main branch. It runs the workflow in run_tests.yml to ensure all tests pass for the new code.
- publish_docker_image.yml - Triggered either manually or for commits to the main branch that contain changes to the docker/service_version.txt file.
The publish_docker_image.yml workflow will:
- Run the full unit test suite, to prevent publication of broken code.
- Extract the semantic version number from docker/service_version.txt.
- Extract the release notes for the most recent version from CHANGELOG.md.
- Build the service Docker image and push it to the GitHub Container Registry.
- Create a GitHub release that will also tag the related git commit with the semantic version number.
Before triggering a release, ensure both the docker/service_version.txt and
CHANGELOG.md files are updated. The CHANGELOG.md file requires a specific
format for a new release, as the publication workflow looks for the following
string to define the newest release of the code (starting at the top of the file).
## vX.Y.Z
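To show why the heading format matters, here is a minimal sketch of how release notes for the newest version could be pulled from the changelog text. It is an illustration under stated assumptions, not the script used by the workflow: the function name latest_release_notes and the regular expression are hypothetical.

```python
import re

# Assumed heading format, taken from the "## vX.Y.Z" string above.
VERSION_HEADING = re.compile(r'^## v\d+\.\d+\.\d+', re.MULTILINE)


def latest_release_notes(changelog_text: str) -> str:
    """Return the text of the first "## vX.Y.Z" section in the changelog.

    Hypothetical helper: the section runs from the first version heading
    (searching from the top of the file) up to the next version heading,
    or to the end of the text if there is only one release.
    """
    headings = list(VERSION_HEADING.finditer(changelog_text))

    if not headings:
        raise ValueError('No "## vX.Y.Z" heading found in changelog.')

    start = headings[0].start()
    end = headings[1].start() if len(headings) > 1 else len(changelog_text)
    return changelog_text[start:end].strip()
```

With this approach, a release heading that does not match the expected pattern would be skipped, and the notes for an older release would be returned instead.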
The best method to run MaskFill locally is to have a local instance of Harmony running that is configured to use the MaskFill service. Requests can then be made as they would for any other environment (production, UAT, SIT) via:
- harmony-py
- cURL
- A URL placed in a browser window, pointing at
localhost:3000.
This project has unit tests that utilize the standard unittest Python
package. These can be run from the root directory of this repository using the
following commands:
export ENV=test
python -m unittest discover tests
The environment variable ENV must be set to ensure that all unit tests that
invoke the MaskFillAdapter class do not try to stage their output files.
The unit tests also contain basic tests for code style, ensuring that all Python files conform to PEP8, excluding checks on line-length.
Tests within tests/test_maskfill.py are designed to test the full use
of the functionality, taking an input file, creating an output file and comparing
that output file to a template. Those within tests/unit are designed
as more granular unit tests of the logic and behaviour of individual functions.
To see how much of the code is covered by the unit and end-to-end tests, run the following three commands.
export ENV=test
coverage run -m unittest discover tests
coverage report --omit=tests/*
For a more detailed view of the test coverage, generate the coverage report
as HTML pages. This output is generated automatically by the
bin/run-test script in the harmony-maskfill/coverage directory.
Alternatively, one can create a coverage directory and run the following commands:
export ENV=test
mkdir -p coverage
coverage run -m unittest discover tests
coverage html --omit=tests/* -d coverage
Then navigate in a web browser to:
file:///full/path/to/harmony-maskfill/coverage/index.html
This should display a page with a table of coverage percentages. Clicking on each file should open a further page that renders the contents of the file, indicating exactly the lines that have coverage, and those that don't.
The unit tests can also be run within a Docker container:
# Build the service image, which is a base image for the test image
./bin/build-image
# Build the test image
./bin/build-test
# Run the tests in a container instance of the test image
./bin/run-test
The terminal should display output from the test results, including any failures
from unittest. Additionally, the XML test reports should be saved to the new
test-reports directory. A test coverage report should also be displayed in the
terminal, and will also be saved to the coverage directory in HTML format.
Coverage reports are generated for each execution of the GitHub workflow,
and are saved as artefacts.
MaskFill will try to determine the projection information for a variable by using the following metadata (in the order specified):
- DIMENSION_LIST attribute. If present, and with units of 'degrees', the data are assumed to be geographic.
- grid_mapping attribute. If present, this will point to a grid_mapping variable in the granule. The metadata of that variable is used to define the projection of the variable being filled.
- Configuration file. If neither DIMENSION_LIST nor grid_mapping are included in the metadata attributes, the configuration file is checked for default values.
- If all of the above options do not return information from which a projection can be derived, MaskFill will raise an exception, and the service will fail.
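The lookup order above can be sketched as a simple fallback chain. This is an illustration only, using plain Python dictionaries and lists in place of real granule metadata. The function name get_projection_source, its arguments and its return values are all hypothetical, and the test for 'degrees' units is an assumption about how that check might be written.

```python
def get_projection_source(variable_name, attributes, dimension_units,
                          config_defaults):
    """Return a (source, detail) tuple naming where projection info was found.

    Hypothetical stand-ins for granule metadata:
    - attributes: dict of metadata attributes for the variable being filled.
    - dimension_units: list of units strings for the variable's dimensions.
    - config_defaults: dict of default grid mappings keyed by variable name.
    """
    # 1) DIMENSION_LIST with units of degrees: assume geographic data.
    if (
        'DIMENSION_LIST' in attributes
        and dimension_units
        and all('degrees' in units for units in dimension_units)
    ):
        return ('DIMENSION_LIST', 'geographic')

    # 2) grid_mapping attribute: names a grid mapping variable in the granule.
    if 'grid_mapping' in attributes:
        return ('grid_mapping', attributes['grid_mapping'])

    # 3) Configuration file defaults.
    if variable_name in config_defaults:
        return ('configuration', config_defaults[variable_name])

    # 4) Nothing usable: fail, mirroring the exception described above.
    raise ValueError(f'No projection information found for {variable_name}')
```

The ordering means that a configuration file default is only consulted when the granule itself provides no usable projection metadata for the variable.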
When adding several SMAP collections, new entries were needed for the default grid mapping, for cases in which input data to MaskFill have not been reprojected. When adding the MaskFill service to a new collection, check whether the granule format can provide the necessary grid mapping information.
This repository uses pre-commit to check the repository against some coding standard best practices before each commit. These include:
- Removing trailing whitespaces.
- Removing blank lines at the end of a file.
- Checking that JSON files have valid formats.
- Formatting checks.
To enable these checks:
# Install pre-commit Python package as part of test requirements:
pip install -r tests/pip_test_requirements.txt
# Install the git hook scripts:
pre-commit install
# (Optional) Run against all files:
pre-commit run --all-files
When you try to make a new commit locally, pre-commit will automatically run.
If any of the hooks detect non-compliance (e.g., trailing whitespace), that
hook will state it failed, and also try to fix the issue. You will need to
review and git add the changes before you can make a commit.
It is planned to implement additional hooks, possibly including tools such as
mypy.
pre-commit.ci is configured such that these same hooks will be automatically run for every pull request.
You can reach out to the maintainers of this repository via email: