- xradar version: 0.1.0
Description
We should take a look at how we can speed up the xarray backends, and if there are more levels of parallelization possible.
I wonder whether upstream enhancements to xarray, such as pydata/xarray#7437, might help with this by enabling us to plug in the I/O directly and benefit from more parallelization here.
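For reference, `xr.open_mfdataset` can already parallelize the per-file open/preprocess step when `parallel=True` is passed, by wrapping each open in `dask.delayed`. A minimal stand-alone sketch of that pattern (with a hypothetical `load_sweep` standing in for the real backend open, since the actual engine call needs radar files):

```python
import dask


def load_sweep(path):
    # Hypothetical stand-in for the per-file open_dataset + preprocess step
    return {"path": path, "nrays": 720}


# open_mfdataset(..., parallel=True) wraps each per-file open like this,
# so the opens run concurrently on the dask scheduler instead of serially
delayed = [dask.delayed(load_sweep)(p) for p in ["a.nc", "b.nc", "c.nc"]]
datasets = dask.compute(*delayed)
print([d["path"] for d in datasets])  # → ['a.nc', 'b.nc', 'c.nc']
```

This only parallelizes the file-open stage; deeper parallelism inside the backend itself is what the upstream issue is about.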
What I Did
I read the data with the following code:
```python
import xarray as xr
import xradar as xd
import numpy as np


def fix_angle(ds):
    """Align the radar volumes."""
    ds["time"] = ds.time.load()  # Convert time from dask to numpy
    start_ang = 0  # Set consistent start/end values
    stop_ang = 360
    # Find the median angle resolution
    angle_res = 0.5
    # Determine whether the radar is spinning clockwise or counterclockwise
    median_diff = ds.azimuth.diff("time").median()
    ascending = median_diff > 0
    direction = 1 if ascending else -1
    # First, find exact duplicates and remove them
    ds = xd.util.remove_duplicate_rays(ds)
    # Second, reindex according to the retrieved parameters
    ds = xd.util.reindex_angle(
        ds, start_ang, stop_ang, angle_res, direction, method="nearest"
    )
    ds = ds.expand_dims("volume_time")  # Expand volumes for concatenation
    ds["volume_time"] = [np.nanmin(ds.time.values)]
    return ds


ds = xr.open_mfdataset(
    files,
    preprocess=fix_angle,
    engine="cfradial1",
    group="sweep_0",
    concat_dim="volume_time",
    combine="nested",
    chunks={"range": 250},
)
```

This resulted in the task graph below, where green marks the `open_dataset` function.
The graph has quite a bit of whitespace and could use some optimization.
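One way to quantify that, as a sketch on a synthetic dask array standing in for one chunked radar variable: `dask.optimize` applies graph optimizations such as blockwise fusion, which collapses chains of per-chunk elementwise tasks and shrinks the graph before execution.

```python
import dask
import dask.array as da

# Synthetic stand-in for one radar variable chunked along "range"
x = da.ones((360, 1000), chunks=(360, 250))
y = (x * 2 + 1).mean(axis=1)

# dask.optimize fuses adjacent per-chunk tasks; the optimized
# collection never has more tasks than the original
(y_opt,) = dask.optimize(y)
print(len(y.__dask_graph__()) >= len(y_opt.__dask_graph__()))  # → True
```

Comparing the graph sizes before and after optimization (or rendering them with `dask.visualize`) makes it easier to see where the backend is emitting redundant tasks.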