feat: enhance weather attribute normalization to support wind gust data across all API endpoints#546
Conversation
…x retreive of gust attributes
Reviewer's GuideRefactors Netatmo weather attribute normalization into a shared recursive helper used at raw-response extraction time, extends the mapping to cover wind gust attributes, and ensures all API response bodies are normalized consistently before ID fixing and home/module construction. Sequence diagram for normalized Netatmo weather data extraction and device updatesequenceDiagram
actor HomeAssistant
participant NetatmoAPI
participant Helpers_extract_raw_data as Helpers_extract_raw_data
participant Helpers_normalize as Helpers_normalize_weather_attributes
participant Helpers_fix_id as Helpers_fix_id
participant Account_update_devices as Account_update_devices
participant Home
HomeAssistant->>NetatmoAPI: request homestatus or getstationsdata
NetatmoAPI-->>HomeAssistant: resp
HomeAssistant->>Helpers_extract_raw_data: extract_raw_data(resp, tag)
Helpers_extract_raw_data->>Helpers_normalize: normalize_weather_attributes(resp.body)
Helpers_normalize-->>Helpers_extract_raw_data: normalized_body
alt tag is homes
Helpers_extract_raw_data->>Helpers_fix_id: fix_id(normalized_body.homes)
Helpers_fix_id-->>Helpers_extract_raw_data: homes_with_fixed_ids
Helpers_extract_raw_data-->>HomeAssistant: {homes: homes_with_fixed_ids, errors: normalized_body.errors}
else tag is devices or modules
Helpers_extract_raw_data->>Helpers_fix_id: fix_id(normalized_body.tag)
Helpers_fix_id-->>Helpers_extract_raw_data: raw_data_with_fixed_ids
Helpers_extract_raw_data-->>HomeAssistant: {tag: raw_data_with_fixed_ids, errors: normalized_body.errors}
end
HomeAssistant->>Account_update_devices: update_devices(normalized_and_fixed_data)
loop for each device and module
Account_update_devices->>Home: update({home: {modules: [device_data]}})
Home-->>Account_update_devices: updated_state
end
Account_update_devices-->>HomeAssistant: devices_and_homes_updated
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- It looks like modules under the HOME tag are normalized twice (once via normalize_weather_attributes on the full body and again in the HOME-specific modules loop); you can simplify and avoid redundant work by normalizing them only once.
- The ATTRIBUTES_TO_FIX entry for "_id" is effectively unused because "_id" is handled explicitly in normalize_weather_attributes; consider removing it from the mapping to reduce confusion about how IDs are normalized.
- Normalizing the entire "body" for the "body" tag path changes the shape of the public data returned by getpublicdata (e.g., key renames there too); please confirm this broader normalization is intentional for that endpoint and not just for homestatus.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- It looks like modules under the HOME tag are normalized twice (once via normalize_weather_attributes on the full body and again in the HOME-specific modules loop); you can simplify and avoid redundant work by normalizing them only once.
- The ATTRIBUTES_TO_FIX entry for "_id" is effectively unused because "_id" is handled explicitly in normalize_weather_attributes; consider removing it from the mapping to reduce confusion about how IDs are normalized.
- Normalizing the entire "body" for the "body" tag path changes the shape of the public data returned by getpublicdata (e.g., key renames there too); please confirm this broader normalization is intentional for that endpoint and not just for homestatus.
## Individual Comments
### Comment 1
<location> `src/pyatmo/helpers.py:39-59` </location>
<code_context>
+
+ if isinstance(raw_data, dict):
+ normalized: dict[str, Any] = {}
+ has_internal_id = "_id" in raw_data
+ for key, value in raw_data.items():
+ if key == "_id":
+ normalized["_id"] = value
+ if "id" not in raw_data:
+ normalized.setdefault("id", value)
+ continue
+ if key == "dashboard_data" and isinstance(value, dict):
+ normalized.update(normalize_weather_attributes(value))
+ continue
+ normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes(
+ value
+ )
+ if has_internal_id and "id" not in normalized:
+ normalized["id"] = raw_data["_id"]
+ return normalized
</code_context>
<issue_to_address>
**suggestion:** The `_id`/`id` handling is slightly redundant and may be simplified to make the intended precedence clearer.
Inside the dict branch, `_id` is handled in two places: once in the loop (copying `_id` and, if `"id" not in raw_data`, setting `id`), and again after the loop (if `has_internal_id` and `"id" not in normalized`, setting `id` from `raw_data["_id"]`). This duplication makes the `id` precedence harder to follow. Consider consolidating this into a single responsibility (e.g., only in the post-loop block, relying on the loop to set `id` when it already exists) so the priority between user-provided `id` and the derived `_id` is explicit.
```suggestion
def normalize_weather_attributes(raw_data: RawData) -> RawData:
"""Normalize weather-related attributes recursively."""
if isinstance(raw_data, dict):
normalized: dict[str, Any] = {}
for key, value in raw_data.items():
if key == "_id":
# Preserve the internal identifier as-is; `id` will be derived later
normalized["_id"] = value
continue
if key == "dashboard_data" and isinstance(value, dict):
normalized.update(normalize_weather_attributes(value))
continue
normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes(
value
)
# Derive a public `id` from `_id` only when no explicit `id` was provided
if "_id" in normalized and "id" not in normalized:
normalized["id"] = normalized["_id"]
return normalized
```
</issue_to_address>
### Comment 2
<location> `src/pyatmo/helpers.py:97-101` </location>
<code_context>
msg = "No device found, errors in response"
raise NoDeviceError(msg)
+ body = normalize_weather_attributes(resp["body"])
+ if tag == HOME and "modules" in body.get(HOME, {}):
+ body[HOME]["modules"] = [
+ normalize_weather_attributes(module) for module in body[HOME]["modules"]
+ ]
</code_context>
<issue_to_address>
**suggestion:** Modules under `HOME` are normalized twice, which is redundant work and may make future changes harder to reason about.
Because `normalize_weather_attributes` already recurses through nested dicts and lists, normalizing `resp["body"]` will also cover `body[HOME]["modules"]`. The subsequent `if tag == HOME ...` block re-normalizes each module:
```python
body[HOME]["modules"] = [
normalize_weather_attributes(module) for module in body[HOME]["modules"]
]
```
This is redundant and adds an extra code path to maintain. Consider normalizing either at the top level or within the `modules` block, but only once in a single, clearly defined place.
```suggestion
body = normalize_weather_attributes(resp["body"])
```
</issue_to_address>
### Comment 3
<location> `src/pyatmo/helpers.py:42-64` </location>
<code_context>
def normalize_weather_attributes(raw_data: RawData) -> RawData:
"""Normalize weather-related attributes recursively."""
if isinstance(raw_data, dict):
normalized: dict[str, Any] = {}
has_internal_id = "_id" in raw_data
for key, value in raw_data.items():
if key == "_id":
normalized["_id"] = value
if "id" not in raw_data:
normalized.setdefault("id", value)
continue
if key == "dashboard_data" and isinstance(value, dict):
normalized.update(normalize_weather_attributes(value))
continue
normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes(
value
)
if has_internal_id and "id" not in normalized:
normalized["id"] = raw_data["_id"]
return normalized
if isinstance(raw_data, list):
return [normalize_weather_attributes(item) for item in raw_data]
return raw_data
</code_context>
<issue_to_address>
**suggestion (code-quality):** We've found these issues:
- Add guard clause ([`last-if-guard`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/last-if-guard/))
- Lift code into else after jump in control flow ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Replace if statement with if expression ([`assign-if-exp`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/assign-if-exp/))
- Merge dictionary updates via the union operator ([`dict-assign-update-to-union`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/dict-assign-update-to-union/))
```suggestion
if not isinstance(raw_data, dict):
return (
[normalize_weather_attributes(item) for item in raw_data]
if isinstance(raw_data, list)
else raw_data
)
normalized: dict[str, Any] = {}
has_internal_id = "_id" in raw_data
for key, value in raw_data.items():
if key == "_id":
normalized["_id"] = value
if "id" not in raw_data:
normalized.setdefault("id", value)
continue
if key == "dashboard_data" and isinstance(value, dict):
normalized |= normalize_weather_attributes(value)
continue
normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes(
value
)
if has_internal_id and "id" not in normalized:
normalized["id"] = raw_data["_id"]
return normalized
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey there - I've reviewed your changes - here's some feedback:
- The new recursive
normalize_weather_attributesnow processes the entireresp['body'], including non-weather sections (e.g.,errorsor other nested metadata); consider constraining recursion to known payload parts (likedashboard_data/modules) to avoid unexpected key renames or structure changes in unrelated data. - The special handling of
_idinnormalize_weather_attributesnow preserves both_idandid, whereas the previous mapping effectively exposed onlyid; please verify whether downstream code expects_idto be absent or that having both keys does not create ambiguity, and update the mapping/logic accordingly if needed. - The function signature of
normalize_weather_attributesstill usesRawDatabut can now return lists as well as dicts; it may be worth tightening the type alias or adding a short comment about the possible return shapes to make its usage clearer at call sites likeextract_raw_dataandhome.update.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new recursive `normalize_weather_attributes` now processes the entire `resp['body']`, including non-weather sections (e.g., `errors` or other nested metadata); consider constraining recursion to known payload parts (like `dashboard_data`/modules) to avoid unexpected key renames or structure changes in unrelated data.
- The special handling of `_id` in `normalize_weather_attributes` now preserves both `_id` and `id`, whereas the previous mapping effectively exposed only `id`; please verify whether downstream code expects `_id` to be absent or that having both keys does not create ambiguity, and update the mapping/logic accordingly if needed.
- The function signature of `normalize_weather_attributes` still uses `RawData` but can now return lists as well as dicts; it may be worth tightening the type alias or adding a short comment about the possible return shapes to make its usage clearer at call sites like `extract_raw_data` and `home.update`.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
src/pyatmo/helpers.py
Outdated
| msg = "No device found, errors in response" | ||
| raise NoDeviceError(msg) | ||
|
|
||
| body = normalize_weather_attributes(resp["body"]) |
There was a problem hiding this comment.
Why do you normalise attributes here already and then again in the home module?
There was a problem hiding this comment.
Hi! Thanks for reviewing!
As far as I remember, it seems you're right, I made a mistake there ^^
I will remove the second normalization in the home module. I probably made this while testing to make sure it would be normalized in every case, and forgot to remove it. Thanks for noticing! 😄
Do you have others comments on this PR? So I can apply everything in one commit.
There was a problem hiding this comment.
I wonder if the home module would be more appropriate to do the normalisation. But I need to check out the branch and look into it locally.
There was a problem hiding this comment.
Okay so I wait for your feedback on this?
There was a problem hiding this comment.
By all means feel free to investigate yourself as I can't promise fast feedback.
There was a problem hiding this comment.
Pull request overview
This PR refactors weather attribute normalization in the Netatmo integration to support wind gust data from the /homestatus API endpoint. The main changes move the normalization function to a shared helper module and enhance it to work recursively with nested data structures.
- Refactored
normalize_weather_attributes()fromaccount.pytohelpers.pywith recursive processing capabilities - Added mappings for
wind_gustandwind_gust_angleattributes to normalize them togust_strengthandgust_angle - Applied normalization universally across all API endpoints through
extract_raw_data()and to home module updates
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/pyatmo/helpers.py | Adds refactored recursive normalize_weather_attributes() function with expanded attribute mappings and applies it to all responses in extract_raw_data() |
| src/pyatmo/account.py | Removes old normalize_weather_attributes() implementation and imports the new version from helpers |
| src/pyatmo/home.py | Applies normalization to modules during home updates to support /homestatus endpoint data |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/pyatmo/helpers.py
Outdated
| def normalize_weather_attributes(raw_data: RawData) -> RawData: | ||
| """Normalize weather-related attributes recursively.""" | ||
|
|
||
| if not isinstance(raw_data, dict): | ||
| return ( | ||
| [normalize_weather_attributes(item) for item in raw_data] | ||
| if isinstance(raw_data, list) | ||
| else raw_data | ||
| ) | ||
| normalized: dict[str, Any] = {} | ||
| has_internal_id = "_id" in raw_data | ||
| for key, value in raw_data.items(): | ||
| if key == "_id": | ||
| normalized["_id"] = value | ||
| if "id" not in raw_data: | ||
| normalized.setdefault("id", value) | ||
| continue | ||
| if key == "dashboard_data" and isinstance(value, dict): | ||
| normalized |= normalize_weather_attributes(value) | ||
| continue | ||
| normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes( | ||
| value | ||
| ) | ||
| if has_internal_id and "id" not in normalized: | ||
| normalized["id"] = raw_data["_id"] | ||
| return normalized |
There was a problem hiding this comment.
The refactored normalize_weather_attributes function introduces significant new logic including recursive processing, special handling of _id/id relationships, and list processing. However, there are no direct unit tests for this function. Given that this function is now used across all API endpoints to normalize weather data, adding comprehensive unit tests would help ensure correctness and prevent regressions. Consider adding tests that cover: dictionary normalization, list normalization, dashboard_data flattening, _id to id conversion, and the new wind_gust mappings.
src/pyatmo/helpers.py
Outdated
| msg = "No device found, errors in response" | ||
| raise NoDeviceError(msg) | ||
|
|
||
| body = normalize_weather_attributes(resp["body"]) |
There was a problem hiding this comment.
Applying normalization to the entire response body at the extraction level is a significant change in scope. Previously, normalize_weather_attributes was only called selectively in specific parts of the codebase. Now it's applied universally to all data flowing through extract_raw_data. While this ensures consistency across all API endpoints, it also means that code throughout the codebase that wasn't previously receiving normalized data will now get it. This could potentially break assumptions in existing code. Consider verifying that all consumers of extract_raw_data can handle the normalized data format, especially the recursive normalization and the _id/id handling changes.
src/pyatmo/helpers.py
Outdated
| def normalize_weather_attributes(raw_data: RawData) -> RawData: | ||
| """Normalize weather-related attributes recursively.""" | ||
|
|
||
| if not isinstance(raw_data, dict): | ||
| return ( | ||
| [normalize_weather_attributes(item) for item in raw_data] | ||
| if isinstance(raw_data, list) | ||
| else raw_data | ||
| ) | ||
| normalized: dict[str, Any] = {} | ||
| has_internal_id = "_id" in raw_data | ||
| for key, value in raw_data.items(): | ||
| if key == "_id": | ||
| normalized["_id"] = value | ||
| if "id" not in raw_data: | ||
| normalized.setdefault("id", value) | ||
| continue | ||
| if key == "dashboard_data" and isinstance(value, dict): | ||
| normalized |= normalize_weather_attributes(value) | ||
| continue | ||
| normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes( | ||
| value | ||
| ) | ||
| if has_internal_id and "id" not in normalized: | ||
| normalized["id"] = raw_data["_id"] | ||
| return normalized |
There was a problem hiding this comment.
The refactored function introduces two significant behavioral changes from the original implementation: (1) It now recursively processes nested dictionaries and lists, whereas the old version only recursively handled dashboard_data. (2) The handling of "_id" has changed - the old code had "_id": "id" in the mapping which would replace _id with id, but the new code preserves both _id and id keys. These changes could impact existing code that depends on the previous behavior. For example, code at line 280 in account.py accesses module["_id"] which would work with the new behavior but may have issues with fully understanding the data flow. Consider documenting these behavioral changes and ensuring all call sites are compatible with the new behavior.
src/pyatmo/helpers.py
Outdated
| has_internal_id = "_id" in raw_data | ||
| for key, value in raw_data.items(): | ||
| if key == "_id": | ||
| normalized["_id"] = value | ||
| if "id" not in raw_data: | ||
| normalized.setdefault("id", value) | ||
| continue | ||
| if key == "dashboard_data" and isinstance(value, dict): | ||
| normalized |= normalize_weather_attributes(value) | ||
| continue | ||
| normalized[ATTRIBUTES_TO_FIX.get(key, key)] = normalize_weather_attributes( | ||
| value | ||
| ) | ||
| if has_internal_id and "id" not in normalized: | ||
| normalized["id"] = raw_data["_id"] |
There was a problem hiding this comment.
The logic for handling "_id" and "id" is duplicated and confusing. Lines 49-53 handle the case when processing "_id" as a key, setting both "_id" and potentially "id". Then lines 60-61 check again if "id" needs to be set from "_id". This creates redundant checks and makes the code harder to understand. Consider simplifying this logic by handling the "_id" to "id" transformation in a single, clear location.
|
why did copilot reviewed it I didn't asked for it to do so, I was just syncing my fork with the latest commits 😂 |
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Consider narrowing where
normalize_weather_attributesis applied inextract_raw_data, as normalizing the entirebody(including non-weather structures likeerrors) may unintentionally rewrite fields that happen to match keys inATTRIBUTES_TO_FIXacross all response types. - When flattening
dashboard_datavianormalized |= normalize_weather_attributes(value), any key collisions with existing top-level fields in the same object will silently overwrite values; if that’s not desired, it may be safer to keep the dashboard data namespaced or resolve collisions explicitly.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider narrowing where `normalize_weather_attributes` is applied in `extract_raw_data`, as normalizing the entire `body` (including non-weather structures like `errors`) may unintentionally rewrite fields that happen to match keys in `ATTRIBUTES_TO_FIX` across all response types.
- When flattening `dashboard_data` via `normalized |= normalize_weather_attributes(value)`, any key collisions with existing top-level fields in the same object will silently overwrite values; if that’s not desired, it may be safer to keep the dashboard data namespaced or resolve collisions explicitly.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@cgtobi I applied the sourcery-ai suggestions and your feedback on the c6cb1c1 commit. I choose to remove it from the home-module, since it guarantees a certain "consistency" between all the modules by normalizing directly on Also, what do you think about the last sourcery-ai analysis? I haven't seen any "risks" normalizing the whole body based on what the API returns.
|
|
It was basically my concern as well. But for now I'd say as long as the tests pass it should be okay. |
|
Great! I don't know if you had time to look at the fix I made on this issue. But while I have your attention on it, I just need to have more infos, especially about the cooldown applied when a call fails. Also, if this is merged, will it directly be available through a release? That's because I need this fix to be pushed to a latest release, so I can make a PR on ha-core repository. |
Only once we release a new version. Currently this is not fully automated. |
|
@tiimjcb please ensure the typechecker does not complain |
|
@cgtobi all tests including mypy typecheck are passing locally now 👍🏼 |
| } | ||
|
|
||
|
|
||
| def normalize_weather_attributes(raw_data: NormalizableData) -> NormalizableData: |
There was a problem hiding this comment.
Are you sure about the return type?
There was a problem hiding this comment.
I thought making a return type would satisfy mypy. Otherwise, it would block since the return type wasn't explicitly specified, and "any" would also block it. At least in local.
There was a problem hiding this comment.
What type does normalize_weather_attributes return?
There was a problem hiding this comment.
From this logic, it returns the same type that it received in input. If we call normalize_weather_attributes with a dict in input, it returns a dict. If it's a list in input, it returns a list.
That's why I declare the type NormalizableData, so mypy is satisfied with the different possibilities.
There was a problem hiding this comment.
Take a look at the original function and compare how you changed the types and the internal behaviour.
There was a problem hiding this comment.
| def normalize_weather_attributes(raw_data: NormalizableData) -> NormalizableData: | |
| T = TypeVar('T') | |
| def normalize_weather_attributes(raw_data: T) -> T: |
Also the normalize function needs to be split so that it uses your extended version when calling from within.
| normalized["_id"] = value | ||
| continue | ||
| if key == "dashboard_data" and isinstance(value, dict): | ||
| # useless cast here to satisfy mypy |
There was a problem hiding this comment.
This hints at some type mismatch. A cast should not be necessary.
There was a problem hiding this comment.
I agree with you that the cast is completly unnecessary. But as said, it's only to satisfy mypy. Without this cast, it blocks. I don't know what else to do in that case 😬
I'll check if there seems to be a sort of mismatch in the types.
There was a problem hiding this comment.
Checked it again, and I really don't see how I can do it in another way. From what I understand, it seems that mypy expects an explicit type because it needs to guarantee that the passed object (the result of normalize_weather_attributes) is of the expected type (dict). But since that function can return different types, I placed a cast there to explicitly tell mypy that, in this specific context, the type is indeed a dict, even though it's completely unnecessary 🤷🏼♂️
| msg = "No device found, errors in response" | ||
| raise NoDeviceError(msg) | ||
|
|
||
| # useless cast here again to satisfy mypy |
There was a problem hiding this comment.
This hints at some type mismatch. A cast should not be necessary.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Context
For context, I'm working on a PR on the
ha-corerepository (Home Assistant), on the Netatmo component. The initial issue was that retrieving weather infos with/getstationdatais not responsive enough, since data is aggregated every 10mn. However, the/homestatusendpoint gives almost live data (according to Netatmo).So I made modifications in
ha-core, everything was working except for the gust attributes. The reason? The names of the attributes are different with the/homestatusroute :Main changes
Refactored
normalize_weather_attributes():I moved from
account.pytohelpers.pyfor a better code organization.Enhanced it to work recursively with the nested dictionaries and lists
Added support for additional attributes in the mapping
Added support for the wind_gust attributes, so the mapping works like it should be with the
/homestatusendpointApplied normalization across all data retrieval paths
Applied the normalization process to all responses types. For the HOME endpoint, I made sure modules within home data are normalized.
The tests are passing. And of course, it fixes my issue : wind_gust, and wind_gust_angle are retrieved and updated every time I use the
/homestatusendpoint.@cgtobi : I tried to reach you on Discord to have your opinion about the way I'd like to implement this on Home Assistant. Basically, it's just using the homestatus endpoint, but I'd like to directly talk about it with you.
Thanks guys! Have a great day :)
Summary by Sourcery
Normalize Netatmo weather data consistently across responses and extend support for wind gust attributes.
New Features:
Enhancements: