Merged
Changes from 87 commits
114 commits
b894d25
Disallow untyped defs in namedarray
Illviljan Sep 27, 2023
895c2cb
Just use strict instead
Illviljan Sep 27, 2023
15a2a8b
Update pyproject.toml
Illviljan Sep 27, 2023
89c8fea
Test explicit list instead.
Illviljan Sep 27, 2023
4e10650
Update pyproject.toml
Illviljan Sep 27, 2023
fc7f69a
Update pyproject.toml
Illviljan Sep 27, 2023
43f4e20
Update pyproject.toml
Illviljan Sep 27, 2023
e6147d3
Merge branch 'main' into disallow_untyped_defs
Illviljan Sep 28, 2023
b84b1bc
Merge branch 'main' into disallow_untyped_defs
Illviljan Sep 28, 2023
7cb3a09
Update utils.py
Illviljan Sep 28, 2023
4a83aa4
Update core.py
Illviljan Sep 28, 2023
7ad7634
getmaskarray isn't typed yet
Illviljan Sep 28, 2023
cebd1eb
Update core.py
Illviljan Sep 28, 2023
71a942b
add _Array protocol
Illviljan Sep 28, 2023
4cdbed5
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 28, 2023
c80ff30
Update utils.py
Illviljan Sep 28, 2023
2e84c31
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Sep 28, 2023
f6c6f44
Update utils.py
Illviljan Sep 28, 2023
4b897eb
Update utils.py
Illviljan Sep 28, 2023
7a5cb43
Merge branch 'main' into disallow_untyped_defs
Illviljan Sep 28, 2023
7081ee1
Update utils.py
Illviljan Sep 28, 2023
87958d9
Update utils.py
Illviljan Sep 28, 2023
f913206
Update utils.py
Illviljan Sep 28, 2023
3b0c122
Update utils.py
Illviljan Sep 28, 2023
027f300
Update utils.py
Illviljan Sep 28, 2023
23ec9fe
Update utils.py
Illviljan Sep 28, 2023
d94b766
Update utils.py
Illviljan Sep 28, 2023
c353336
Update utils.py
Illviljan Sep 28, 2023
5e26eba
Update test_namedarray.py
Illviljan Sep 28, 2023
c5a9594
Update utils.py
Illviljan Sep 28, 2023
ecb50c0
Update test_namedarray.py
Illviljan Sep 28, 2023
9ab9dae
Update test_namedarray.py
Illviljan Sep 28, 2023
19b3304
Merge branch 'main' into disallow_untyped_defs
Illviljan Sep 29, 2023
41bd67c
Update utils.py
Illviljan Sep 29, 2023
5f58cee
Update utils.py
Illviljan Sep 30, 2023
7f1a94e
Update utils.py
Illviljan Sep 30, 2023
685ca7c
Update core.py
Illviljan Sep 30, 2023
7aa2f57
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 30, 2023
99b0aca
Update utils.py
Illviljan Sep 30, 2023
84b6894
Update core.py
Illviljan Sep 30, 2023
08d11ef
Update test_namedarray.py
Illviljan Sep 30, 2023
196a5c6
Update test_namedarray.py
Illviljan Sep 30, 2023
07e3085
Update test_namedarray.py
Illviljan Sep 30, 2023
707f244
Update utils.py
Illviljan Sep 30, 2023
a3901bc
Update core.py
Illviljan Sep 30, 2023
df13d47
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 30, 2023
f76aeb1
Update core.py
Illviljan Sep 30, 2023
cce278c
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Sep 30, 2023
3865264
Update core.py
Illviljan Sep 30, 2023
b61d9a8
Update utils.py
Illviljan Sep 30, 2023
4dec3ca
Update core.py
Illviljan Oct 1, 2023
26ac902
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 1, 2023
9d23245
Update core.py
Illviljan Oct 1, 2023
1f1a25d
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Oct 1, 2023
d8007e8
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 1, 2023
762e808
Merge branch 'main' into disallow_untyped_defs
Illviljan Oct 1, 2023
459b38a
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Oct 1, 2023
1f93f5f
Update core.py
Illviljan Oct 1, 2023
b2570dd
Update test_namedarray.py
Illviljan Oct 1, 2023
6835c09
Update utils.py
Illviljan Oct 1, 2023
1bac4af
Update pyproject.toml
Illviljan Oct 1, 2023
5d72861
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 1, 2023
ebf4752
Update core.py
Illviljan Oct 1, 2023
6c8fac9
Update utils.py
Illviljan Oct 1, 2023
ee49c5e
Update xarray/namedarray/utils.py
Illviljan Oct 1, 2023
5de4142
Update utils.py
Illviljan Oct 1, 2023
99bf8aa
Update core.py
Illviljan Oct 1, 2023
fa41cbe
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Oct 1, 2023
b27145e
Update utils.py
Illviljan Oct 1, 2023
7f262d5
Update core.py
Illviljan Oct 1, 2023
401a93a
Update utils.py
Illviljan Oct 1, 2023
bcda5a4
Update core.py
Illviljan Oct 1, 2023
2535a5f
Update core.py
Illviljan Oct 1, 2023
2c5b49d
Update core.py
Illviljan Oct 1, 2023
9d29827
Update test_namedarray.py
Illviljan Oct 1, 2023
2fba5a9
Update utils.py
Illviljan Oct 1, 2023
80842ea
Update core.py
Illviljan Oct 1, 2023
cf8d5cc
Update utils.py
Illviljan Oct 1, 2023
32439fe
Update test_namedarray.py
Illviljan Oct 1, 2023
946bd3d
Update test_namedarray.py
Illviljan Oct 1, 2023
130c894
Update core.py
Illviljan Oct 1, 2023
e0064b9
Update parallel.py
Illviljan Oct 1, 2023
fec9f1b
Update utils.py
Illviljan Oct 1, 2023
877f0f1
fixes
Illviljan Oct 1, 2023
a5eddb1
Update utils.py
Illviljan Oct 1, 2023
2194715
Update utils.py
Illviljan Oct 1, 2023
77e05f2
ignores
Illviljan Oct 1, 2023
1348df6
Update xarray/namedarray/utils.py
Illviljan Oct 2, 2023
025e9cc
Update xarray/namedarray/utils.py
Illviljan Oct 2, 2023
2305216
Update core.py
Illviljan Oct 2, 2023
13c8953
Update test_namedarray.py
Illviljan Oct 2, 2023
5559548
Update core.py
Illviljan Oct 2, 2023
a177ce7
Update core.py
Illviljan Oct 2, 2023
ca7ee37
Update core.py
Illviljan Oct 2, 2023
ce77930
Update core.py
Illviljan Oct 2, 2023
99f6c9b
Update test_namedarray.py
Illviljan Oct 2, 2023
0fa4fd3
Update core.py
Illviljan Oct 2, 2023
5b98dd5
Update test_namedarray.py
Illviljan Oct 2, 2023
56a7755
import specific type functions
Illviljan Oct 2, 2023
b321a84
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 2, 2023
6a33331
Update core.py
Illviljan Oct 2, 2023
863ed1d
Update core.py
Illviljan Oct 2, 2023
11b36fa
Update core.py
Illviljan Oct 2, 2023
2bd6f8c
Try chunkedarray instead
Illviljan Oct 2, 2023
476dda2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 2, 2023
27c18b8
fixes
Illviljan Oct 2, 2023
931659f
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Oct 2, 2023
c2a1fb7
Update core.py
Illviljan Oct 2, 2023
892e83d
Update core.py
Illviljan Oct 2, 2023
58266c4
Update core.py
Illviljan Oct 2, 2023
00cba3d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 2, 2023
114c45c
Update core.py
Illviljan Oct 2, 2023
c414c0a
Merge branch 'disallow_untyped_defs' of https://github.com/Illviljan/…
Illviljan Oct 2, 2023
ec9c173
Merge branch 'main' into disallow_untyped_defs
andersy005 Oct 3, 2023
36 changes: 36 additions & 0 deletions pyproject.toml
@@ -93,6 +93,7 @@ module = [
   "cftime.*",
   "cubed.*",
   "cupy.*",
+  "dask.types.*",
   "fsspec.*",
   "h5netcdf.*",
   "h5py.*",
@@ -162,6 +163,41 @@ module = [
   "xarray.tests.test_weighted",
 ]

+# Use strict = true whenever namedarray has become standalone. In the meantime
+# don't forget to add all new files related to namedarray here:
+# ref: https://mypy.readthedocs.io/en/stable/existing_code.html#introduce-stricter-options
+[[tool.mypy.overrides]]
+# Start off with these
+warn_unused_configs = true
+warn_redundant_casts = true
+warn_unused_ignores = true
+
+# Getting these passing should be easy
+strict_equality = true
+strict_concatenate = true
+
+# Strongly recommend enabling this one as soon as you can
+check_untyped_defs = true
+
+# These shouldn't be too much additional work, but may be tricky to
+# get passing if you use a lot of untyped libraries
+disallow_subclassing_any = true
+disallow_untyped_decorators = true
+disallow_any_generics = true
+
+# These next few are various gradations of forcing use of type annotations
+disallow_untyped_calls = true
+disallow_incomplete_defs = true
+disallow_untyped_defs = true
+
+# This one isn't too hard to get passing, but return on investment is lower
+no_implicit_reexport = true
+
+# This one can be tricky to get passing if you use a lot of untyped libraries
+warn_return_any = true
+
+module = ["xarray.namedarray.*", "xarray.tests.test_namedarray"]
+
 [tool.ruff]
 builtins = ["ellipsis"]
 exclude = [
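
As a rough illustration of what the strictest flags in this override enforce, the sketch below contrasts an unannotated function with an annotated one. The function names are made up for this example and are not part of xarray; the error codes in the comments are the ones mypy reports for `disallow_untyped_defs` and `disallow_untyped_calls`.

```python
from __future__ import annotations


# Hypothetical helper, not from xarray. With disallow_untyped_defs = true,
# mypy reports: Function is missing a type annotation  [no-untyped-def]
def scale_untyped(values, factor):
    return [v * factor for v in values]


# Fully annotated version: accepted under disallow_untyped_defs.
def scale_typed(values: list[float], factor: float) -> list[float]:
    # Calling scale_untyped(...) from inside this typed function would trip
    # disallow_untyped_calls ([no-untyped-call]); that appears to be why the
    # diff adds targeted "# type: ignore[no-untyped-call]" comments on calls
    # into libraries that are not typed yet.
    return [v * factor for v in values]


print(scale_typed([1.0, 2.0], 2.0))
```

Both functions behave identically at runtime; the flags only change what the type checker accepts.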
2 changes: 1 addition & 1 deletion xarray/core/parallel.py
@@ -443,7 +443,7 @@ def subset_dataset_to_block(
                 for dim in variable.dims:
                     chunk = chunk[chunk_index[dim]]

-                chunk_variable_task = (f"{name}-{gname}-{chunk[0]}",) + chunk_tuple
+                chunk_variable_task = (f"{name}-{gname}-{chunk[0]!r}",) + chunk_tuple
                 graph[chunk_variable_task] = (
                     tuple,
                     [variable.dims, chunk, variable.attrs],
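
The one-character change in this hunk swaps the default `str()` formatting of `chunk[0]` for `repr()` via the `!r` conversion. The diff does not state the motivation; one plausible reason (an assumption) is that mypy's strict checks flag f-string formatting of values that may be `bytes`. The sketch below uses made-up values and a hypothetical `task_name` helper to show how the two conversions differ.

```python
def task_name(name: str, gname: str, key: object, use_repr: bool) -> str:
    """Build a task name with str() formatting or with the !r conversion."""
    if use_repr:
        return f"{name}-{gname}-{key!r}"
    return f"{name}-{gname}-{key}"


# For a string key the two forms differ: repr() adds quotes.
print(task_name("x", "g", "abc", use_repr=False))  # x-g-abc
print(task_name("x", "g", "abc", use_repr=True))   # x-g-'abc'

# For an int key, str() and repr() produce the same text.
print(task_name("x", "g", 3, use_repr=False) == task_name("x", "g", 3, use_repr=True))
```

So when the formatted value is a string, the generated name gains quotes; for integers the name is unchanged.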
160 changes: 109 additions & 51 deletions xarray/namedarray/core.py
@@ -2,7 +2,6 @@

 import copy
 import math
-import sys
 import typing
 from collections.abc import Hashable, Iterable, Mapping

@@ -11,30 +10,38 @@
 # TODO: get rid of this after migrating this class to array API
 from xarray.core import dtypes
 from xarray.core.indexing import ExplicitlyIndexed
-from xarray.core.utils import Default, _default
 from xarray.namedarray.utils import (
+    Default,
     T_DuckArray,
+    _default,
+    is_chunked_duck_array,
     is_duck_array,
     is_duck_dask_array,
     to_0d_object_array,
 )

 if typing.TYPE_CHECKING:
-    T_NamedArray = typing.TypeVar("T_NamedArray", bound="NamedArray")
+    from xarray.namedarray.utils import Self  # type: ignore[attr-defined]
+
+    try:
+        from dask.typing import (
+            Graph,
+            NestedKeys,
+            PostComputeCallable,
+            PostPersistCallable,
+            SchedulerGetCallable,
+        )
+    except ImportError:
+        Graph: typing.Any  # type: ignore[no-redef]
+        NestedKeys: typing.Any  # type: ignore[no-redef]
+        SchedulerGetCallable: typing.Any  # type: ignore[no-redef]
+        PostComputeCallable: typing.Any  # type: ignore[no-redef]
+        PostPersistCallable: typing.Any  # type: ignore[no-redef]
+
+    # T_NamedArray = typing.TypeVar("T_NamedArray", bound="NamedArray")
     DimsInput = typing.Union[str, Iterable[Hashable]]
     Dims = tuple[Hashable, ...]
-
-
-try:
-    if sys.version_info >= (3, 11):
-        from typing import Self
-    else:
-        from typing_extensions import Self
-except ImportError:
-    if typing.TYPE_CHECKING:
-        raise
-    else:
-        Self: typing.Any = None
+    AttrsInput = typing.Union[Mapping[typing.Any, typing.Any], None]


 # TODO: Add tests!
@@ -46,7 +53,7 @@ def as_compatible_data(
         return typing.cast(T_DuckArray, data)

     if isinstance(data, np.ma.MaskedArray):
-        mask = np.ma.getmaskarray(data)
+        mask = np.ma.getmaskarray(data)  # type: ignore[no-untyped-call]
         if mask.any():
             # TODO: requires refactoring/vendoring xarray.core.dtypes and xarray.core.duck_array_ops
             raise NotImplementedError("MaskedArray is not supported yet")
@@ -74,13 +81,17 @@ class NamedArray(typing.Generic[T_DuckArray]):
     Numeric operations on this object implement array broadcasting and dimension alignment based on dimension names,
     rather than axis order."""

-    __slots__ = ("_dims", "_data", "_attrs")
+    __slots__ = ("_data", "_dims", "_attrs")
+
+    _data: T_DuckArray
+    _dims: Dims
+    _attrs: dict[typing.Any, typing.Any] | None

     def __init__(
         self,
         dims: DimsInput,
         data: T_DuckArray | np.typing.ArrayLike,
-        attrs: dict | None = None,
+        attrs: AttrsInput = None,
         fastpath: bool = False,
     ):
         """
@@ -105,9 +116,9 @@ def __init__(


         """
-        self._data: T_DuckArray = as_compatible_data(data, fastpath=fastpath)
-        self._dims: Dims = self._parse_dimensions(dims)
-        self._attrs: dict | None = dict(attrs) if attrs else None
+        self._data = as_compatible_data(data, fastpath=fastpath)
+        self._dims = self._parse_dimensions(dims)
+        self._attrs = dict(attrs) if attrs else None

     @property
     def ndim(self) -> int:
@@ -140,7 +151,7 @@ def __len__(self) -> int:
             raise TypeError("len() of unsized object") from exc

     @property
-    def dtype(self) -> np.dtype:
+    def dtype(self) -> np.dtype[typing.Any]:
         """
         Data-type of the array’s elements.

@@ -178,7 +189,7 @@ def nbytes(self) -> int:
         the bytes consumed based on the ``size`` and ``dtype``.
         """
         if hasattr(self._data, "nbytes"):
-            return self._data.nbytes
+            return self._data.nbytes  # type: ignore[no-any-return]
         else:
             return self.size * self.dtype.itemsize

@@ -208,7 +219,7 @@ def attrs(self) -> dict[typing.Any, typing.Any]:
         return self._attrs

     @attrs.setter
-    def attrs(self, value: Mapping) -> None:
+    def attrs(self, value: Mapping[typing.Any, typing.Any]) -> None:
         self._attrs = dict(value)

     def _check_shape(self, new_data: T_DuckArray) -> None:
@@ -256,43 +267,78 @@ def imag(self) -> Self:
         """
         return self._replace(data=self.data.imag)

-    def __dask_tokenize__(self):
-        # Use v.data, instead of v._data, in order to cope with the wrappers
-        # around NetCDF and the like
-        from dask.base import normalize_token
+    def __dask_tokenize__(self) -> Hashable | None:
+        if is_duck_dask_array(self._data):
+            # Use v.data, instead of v._data, in order to cope with the wrappers
+            # around NetCDF and the like
+            from dask.base import normalize_token

-        return normalize_token((type(self), self._dims, self.data, self.attrs))
+            s, d, a, attrs = type(self), self._dims, self.data, self.attrs
+            return normalize_token((s, d, a, attrs))  # type: ignore[no-any-return]
+        else:
+            return None

-    def __dask_graph__(self):
-        return self._data.__dask_graph__() if is_duck_dask_array(self._data) else None
+    def __dask_graph__(self) -> Graph | None:
+        if is_duck_dask_array(self._data):
+            return self._data.__dask_graph__()
+        else:
+            # TODO: Should this method just raise instead?
+            # raise NotImplementedError("Method requires self.data to be a dask array")
+            return None

-    def __dask_keys__(self):
-        return self._data.__dask_keys__()
+    def __dask_keys__(self) -> NestedKeys:
+        if is_duck_dask_array(self._data):
+            return self._data.__dask_keys__()
+        else:
+            raise AttributeError("Method requires self.data to be a dask array.")

-    def __dask_layers__(self):
-        return self._data.__dask_layers__()
+    def __dask_layers__(self) -> typing.Sequence[str]:
+        if is_duck_dask_array(self._data):
+            return self._data.__dask_layers__()
+        else:
+            raise AttributeError("Method requires self.data to be a dask array.")

     @property
-    def __dask_optimize__(self) -> typing.Callable:
-        return self._data.__dask_optimize__
+    def __dask_optimize__(
+        self,
+    ) -> typing.Callable[..., dict[typing.Any, typing.Any]]:
+        if is_duck_dask_array(self._data):
+            return self._data.__dask_optimize__()  # type: ignore[no-any-return]
+        else:
+            raise AttributeError("Method requires self.data to be a dask array.")

     @property
-    def __dask_scheduler__(self) -> typing.Callable:
-        return self._data.__dask_scheduler__
+    def __dask_scheduler__(self) -> staticmethod[SchedulerGetCallable]:
+        if is_duck_dask_array(self._data):
+            return self._data.__dask_scheduler__()  # type: ignore[no-any-return]
+        else:
+            raise AttributeError("Method requires self.data to be a dask array.")

     def __dask_postcompute__(
         self,
-    ) -> tuple[typing.Callable, tuple[typing.Any, ...]]:
-        array_func, array_args = self._data.__dask_postcompute__()
-        return self._dask_finalize, (array_func,) + array_args
+    ) -> tuple[PostComputeCallable, tuple[typing.Any, ...]]:
+        if is_duck_dask_array(self._data):
+            array_func, array_args = self._data.__dask_postcompute__()  # type: ignore[no-untyped-call]
+            return self._dask_finalize, (array_func,) + array_args
+        else:
+            raise AttributeError("Method requires self.data to be a dask array.")

     def __dask_postpersist__(
         self,
-    ) -> tuple[typing.Callable, tuple[typing.Any, ...]]:
-        array_func, array_args = self._data.__dask_postpersist__()
-        return self._dask_finalize, (array_func,) + array_args
+    ) -> tuple[PostPersistCallable, tuple[typing.Any, ...]]:
+        if is_duck_dask_array(self._data):
+            array_func, array_args = self._data.__dask_postpersist__()  # type: ignore[no-untyped-call]
+            return self._dask_finalize, (array_func,) + array_args
+        else:
+            raise AttributeError("Method requires self.data to be a dask array.")

-    def _dask_finalize(self, results, array_func, *args, **kwargs) -> Self:
+    def _dask_finalize(
+        self,
+        results: T_DuckArray,
+        array_func: typing.Callable[..., T_DuckArray],
+        *args: typing.Any,
+        **kwargs: typing.Any,
+    ) -> Self:
         data = array_func(results, *args, **kwargs)
         return type(self)(self._dims, data, attrs=self._attrs)

@@ -308,7 +354,13 @@ def chunks(self) -> tuple[tuple[int, ...], ...] | None:
         NamedArray.chunksizes
         xarray.unify_chunks
         """
-        return getattr(self._data, "chunks", None)
+        data = self._data
+        # reveal_type(data)
+        if is_chunked_duck_array(data):
+            # reveal_type(data)
+            return data.chunks
+        else:
+            return None
Contributor Author

        data = self._data
        reveal_type(data)  #  note: Revealed type is "T_DuckArray`1"
        if is_chunked_duck_array(data):
            reveal_type(data)  #  note: Revealed type is "<nothing>"
            return data.chunks # error: <nothing> has no attribute "chunks"  [attr-defined]
        else:
            return None

@headtr1ck Do you (or anyone else) understand why data isn't narrowed down to T_ChunkedArray?

Collaborator

Not sure actually.

But the "`1" in the first reveal_type is usually already an indication that something is not correct, even though I have never figured out why mypy does this or what it means (perhaps the type is not exactly known at that point, or it is part of a Union, or something like that).

Collaborator

Somehow I like the getattr implementation better than the explicit check... it's just much easier to read and understand.

Still your issue indicates that something is wrong somewhere...


     @property
     def chunksizes(
@@ -328,8 +380,9 @@ def chunksizes(
         NamedArray.chunks
         xarray.unify_chunks
         """
-        if hasattr(self._data, "chunks"):
-            return dict(zip(self.dims, self.data.chunks))
+        data = self._data
+        if is_chunked_duck_array(data):
+            return dict(zip(self.dims, data.chunks))
         else:
             return {}

@@ -338,7 +391,12 @@ def sizes(self) -> dict[Hashable, int]:
         """Ordered mapping from dimension names to lengths."""
         return dict(zip(self.dims, self.shape))

-    def _replace(self, dims=_default, data=_default, attrs=_default) -> Self:
+    def _replace(
+        self,
+        dims: DimsInput | Default = _default,
+        data: T_DuckArray | np.typing.ArrayLike | Default = _default,
+        attrs: AttrsInput | Default = _default,
+    ) -> Self:
Collaborator

If the duck array type differs from that of Self, the generic part of the return type should change as well.

Probably need a second TypeVar for this.
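
One way to read this suggestion is sketched below: a method that swaps the wrapped data takes a second TypeVar, so the generic parameter of the return type follows the new data rather than the old. `Named`, `T_Data`, `T_NewData`, and `replace_data` are hypothetical names used for illustration only; this is not xarray's API.

```python
from __future__ import annotations

from typing import Generic, TypeVar

T_Data = TypeVar("T_Data")        # type of the data currently wrapped
T_NewData = TypeVar("T_NewData")  # type of the replacement data


class Named(Generic[T_Data]):
    """Hypothetical stand-in for a named-array container."""

    def __init__(self, dims: tuple[str, ...], data: T_Data) -> None:
        self.dims = dims
        self.data = data

    def replace_data(self, data: T_NewData) -> Named[T_NewData]:
        # The result is parameterised by the new data type, so a checker
        # sees Named[tuple[float, ...]] below rather than Named[list[int]].
        return Named(self.dims, data)


original = Named(("x",), [1, 2])              # Named[list[int]]
swapped = original.replace_data((1.0, 2.0))   # Named[tuple[float, float]]
print(type(swapped.data).__name__)
```

This sketch hard-codes `Named` in the return annotation, so it does not preserve subclasses the way a `Self` return type does; reconciling the two is left open here.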

         if dims is _default:
             dims = copy.copy(self._dims)
         if data is _default:
@@ -415,7 +473,7 @@ def _nonzero(self) -> tuple[Self, ...]:
     def _as_sparse(
         self,
         sparse_format: str | Default = _default,
-        fill_value=dtypes.NA,
+        fill_value: typing.Any = dtypes.NA,
Collaborator

The return type of this is wrong since the underlying duck array is now sparse.

The same applies to _to_dense.

     ) -> Self:
         """
         use sparse-array as backend.