POC: Wrap GMT_Read_Data and read datasets/grids/images into GMT data container #3318
def read_data(
    self,
    family: str,
    geometry: str,
    mode: str,
    wesn: Sequence[float] | None,
    infile: str,
    data=None,
):
The method signature follows that of the API function GMT_Read_Data, so it has to be called like this:
lib.read_data("GMT_IS_DATASET", "GMT_IS_PLP", "GMT_READ_NORMAL", None, infile, None)
lib.read_data("GMT_IS_GRID", "GMT_IS_SURFACE", "GMT_READ_NORMAL", None, infile, None)
lib.read_data("GMT_IS_IMAGE", "GMT_IS_SURFACE", "GMT_READ_NORMAL", None, infile, None)
but these calls look verbose and awkward.
I'd prefer to refactor the method so that it can be called like this:
lib.read_data(infile, kind="dataset")
lib.read_data(infile, kind="grid")
lib.read_data(infile, kind="image")
similar to the gmt read infile outfile -Tc|d|g|i|p|u syntax of the special read/write module. For reference, the read module mainly calls the gmt_copy function, which in turn calls GMT_Read_Data with different arguments depending on the data family.
Anyway, this should be discussed in more detail later.
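As a rough sketch of the refactoring suggested above, a small lookup table could translate the short kind name into the family and geometry constants used in the verbose calls. The helper name read_data_args and the table _KIND_TO_FAMILY_GEOMETRY are hypothetical and not part of PyGMT; the constants and the argument order are taken from the calls and the signature quoted in this comment.

```python
# Hypothetical lookup table (not part of PyGMT): maps a short "kind" name to
# the (family, geometry) constants used in the verbose calls quoted above.
_KIND_TO_FAMILY_GEOMETRY = {
    "dataset": ("GMT_IS_DATASET", "GMT_IS_PLP"),
    "grid": ("GMT_IS_GRID", "GMT_IS_SURFACE"),
    "image": ("GMT_IS_IMAGE", "GMT_IS_SURFACE"),
}


def read_data_args(infile, kind="dataset", mode="GMT_READ_NORMAL", wesn=None):
    """Build the positional arguments for the low-level read_data call."""
    try:
        family, geometry = _KIND_TO_FAMILY_GEOMETRY[kind]
    except KeyError:
        valid = sorted(_KIND_TO_FAMILY_GEOMETRY)
        raise ValueError(f"Unknown kind {kind!r}; expected one of {valid}") from None
    # Same order as the signature under review:
    # (family, geometry, mode, wesn, infile, data)
    return (family, geometry, mode, wesn, infile, None)
```

With such a helper, lib.read_data(*read_data_args(infile, kind="grid")) would be equivalent to the second verbose call above, assuming the signature stays as it is in this PR.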
@contextlib.contextmanager
def virtualfile_out(
    self, kind: Literal["dataset", "grid"] = "dataset", fname: str | None = None
    self,
Changes below are copied from #3128, so they can be ignored when reviewing this PR.
from pygmt.datatypes.dataset import _GMT_DATASET
from pygmt.datatypes.grid import _GMT_GRID
from pygmt.datatypes.image import _GMT_IMAGE
Copied from #3128
        """
        attrs: dict[str, Any] = {}
        attrs["Conventions"] = "CF-1.7"
        if self.type == 18:  # Grid file format: ns = GMT netCDF format
Copied from #3128
@@ -0,0 +1,182 @@
"""
Copied from #3128 and can be ignored when reviewing this PR.
        assert data[3][4] == 250.0


def test_clib_read_data_grid_two_steps():
GMT_CONTAINER_ONLY & GMT_DATA_ONLY: Read the grid header first and then read the grid data.
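The two-step sequence can be illustrated without GMT installed by using a stand-in session object. This is only a sketch: FakeSession and read_grid_two_steps are hypothetical names, the stand-in records calls instead of reading a file, and the argument order follows the read_data signature under review in this PR. The mode names are the ones given in the comment above.

```python
class FakeSession:
    """Hypothetical stand-in for the clib session; records calls only."""

    def __init__(self):
        self.calls = []

    def read_data(self, family, geometry, mode, wesn, infile, data=None):
        self.calls.append((mode, data))
        # Pretend the first call allocates a container and that a later
        # call reuses the container passed in via `data`.
        return data if data is not None else object()


def read_grid_two_steps(lib, infile):
    """Read the grid header first, then read the data into the same container."""
    family, geometry = "GMT_IS_GRID", "GMT_IS_SURFACE"
    # Step 1: header only; no data array is read yet.
    container = lib.read_data(
        family, geometry, "GMT_CONTAINER_ONLY", None, infile, None
    )
    # The header could be inspected here before paying for the data read.
    # Step 2: pass the existing container back in so the data is attached to it.
    lib.read_data(family, geometry, "GMT_DATA_ONLY", None, infile, container)
    return container
```

The point of the sketch is the call order and the reuse of the container returned by the first call; the real test in this PR exercises the same sequence against an actual grid file.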
        assert data[3][4] == 250.0


def test_clib_read_data_image_as_grid():
Read an image @earth_day_01d_p into a GMT_GRID container. Here header.n_bands=1, so only the first band is read.
        assert image.data


def test_clib_read_data_image():
Read the @earth_day_01d_p file into the GMT_IMAGE container. Note that header.n_bands=3!
        assert image.data


def test_clib_read_data_image_two_steps():
Two-step reading for images.
        assert image.data is not None  # The data is read


def test_clib_read_data_grid_as_image():
Read a grid @earth_relief_01d_p into a GMT_IMAGE container. header.n_bands=1.
The tests test_clib_read_data_image and test_clib_read_data_grid_as_image show that we can call lib.read_data with the same arguments to read either a grid or an image, and then determine the data kind from n_bands.
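The n_bands rule described here can be written down as a tiny helper. This is a minimal sketch, assuming the band count has already been extracted from the header of a GMT_IMAGE container; the function name kind_from_n_bands is hypothetical. Note that, under this rule, any single-band input is treated as a grid.

```python
def kind_from_n_bands(n_bands: int) -> str:
    """Classify an input read into a GMT_IMAGE container by its band count.

    Follows the rule stated in the comment: one band means a grid,
    any other (positive) band count means an image.
    """
    if n_bands < 1:
        raise ValueError(f"n_bands must be a positive integer, got {n_bands}")
    return "grid" if n_bands == 1 else "image"
```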
Description of proposed changes
This PR wraps the GMT API function GMT_Read_Data.
Currently, this PR contains 4 commits:
GMT_Read_Data function _GMT_DATASET/_GMT_GRID _GMT_IMAGE (with minor fixes)

All new tests pass, and the most important thing I learned is that an input grid or image can always be read into a GMT_IMAGE container. For a grid, header.n_bands=1, and for an image, header.n_bands=3 (or any other value). The GMT_Read_Data API function can read a grid/image in either one step or two steps. Here "two steps" means reading the header first and then reading the data. Reading only the header is very efficient (a few milliseconds) even for huge grid/image files. Thus, we can read the grid/image header and check header.n_bands to determine the input data kind, which addresses the concerns in #3115 (comment).

So, the next steps are:
GMT_Read_Data wrapper and the tests for datasets/grids and finalize the wrapper: Wrap the GMT API function GMT_Read_Data to read data into GMT data containers #3324
GMT_IMAGE wrapper (lines 1-71 of pygmt/datatypes/image.py in PR GMT_IMAGE: Implement the GMT_IMAGE.to_dataarray method for 3-band images #3128, even without any doctests), and another PR with the remaining WIP code. Then review/merge the first PR and work on the second PR in the future.

I plan to work on the above steps in separate PRs and keep this PR unchanged so that we can trace back to this PR in the future.