@janbuchar
Collaborator

@tlinhart any chance you could test this and confirm it fixes the issue?

@janbuchar janbuchar added the t-tooling label (Issues with this label are in the ownership of the tooling team.) Oct 8, 2024
@github-actions github-actions bot added this to the 100th sprint - Tooling team milestone Oct 8, 2024
@janbuchar janbuchar changed the title from "fix: workaround json value type error" to "fix: Workaround for JSON value typing problems" Oct 8, 2024
@tlinhart

tlinhart commented Oct 8, 2024

@janbuchar it solves the first issue, i.e. this:

request.user_data["item"] = item

However the second issue remains:

item = context.request.user_data["item"]
item["results"] = context.selector.xpath("normalize-space(//form//strong[1])").get()  # <-- type error remains

This is what Pylance reports:

"__setitem__" method not defined on type "str" Pylance [reportIndexIssue](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportIndexIssue)
"__setitem__" method not defined on type "bool" Pylance [reportIndexIssue](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportIndexIssue)
"__setitem__" method not defined on type "int" Pylance [reportIndexIssue](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportIndexIssue)
"__setitem__" method not defined on type "float" Pylance [reportIndexIssue](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportIndexIssue)
Object of type "None" is not subscriptable Pylance [reportOptionalSubscript](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportOptionalSubscript)
No overloads for "__setitem__" match the provided arguments Pylance [reportCallIssue](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportCallIssue)
builtins.pyi(1032, 9): Overload 1 is the closest match.
Argument of type "Literal['results']" cannot be assigned to parameter "key" of type "SupportsIndex" in function "__setitem__"
  "Literal['results']" is incompatible with protocol "SupportsIndex"
    "__index__" is not present Pylance [reportArgumentType](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportArgumentType)
(variable) item: JsonSerializable[Unknown]

Mypy outputs this:

crawlee-user_data/main.py:30: error: Unsupported target for indexed assignment ("list[Any] | dict[str, Any] | str | bool | int | float | None")  [index]
crawlee-user_data/main.py:30: error: No overload variant of "__setitem__" of "list" matches argument types "str", "str | None"  [call-overload]
crawlee-user_data/main.py:30: note: Possible overload variants:
crawlee-user_data/main.py:30: note:     def __setitem__(self, SupportsIndex, Any, /) -> None
crawlee-user_data/main.py:30: note:     def __setitem__(self, slice, Iterable[Any], /) -> None
Found 2 errors in 1 file (checked 1 source file)
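
For readers without the project at hand, the error can probably be reproduced outside Crawlee. This is only a sketch: `JsonValue` is a hypothetical alias that mirrors the union shown in the Mypy output above, not the library's actual definition, and `read_item` is an illustrative helper.

```python
from typing import Any, Dict, List, Union

# Hypothetical stand-in for the library's JSON value type, copied from the
# union in the Mypy message; the real definition may differ.
JsonValue = Union[List[Any], Dict[str, Any], str, bool, int, float, None]


def read_item(user_data: Dict[str, JsonValue]) -> JsonValue:
    # The declared return type is the whole union, so the checker has to
    # assume the value could be a list, a str, None, and so on.
    return user_data["item"]


user_data: Dict[str, JsonValue] = {"item": {"url": "https://example.com"}}
item = read_item(user_data)

# Without the ignore comment a type checker flags this line, because most
# members of the union do not accept a str key in __setitem__.
# At runtime the value really is a dict, so the assignment succeeds.
item["results"] = "done"  # type: ignore
```

The code runs fine; only the static check complains, which matches what Pylance and Mypy report above.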

When I try to specify a type for the item in

item: dict[str, str | None] = context.request.user_data["item"]

I get this type error from Pylance on the same line:

Type "JsonSerializable[Unknown]" is not assignable to declared type "dict[str, str | None]".
  Type "JsonSerializable[Unknown]" is not assignable to type "dict[str, str | None]".
    "float" is not assignable to "dict[str, str | None]".

@janbuchar
Collaborator Author

However the second issue remains:

item = context.request.user_data["item"]
item["results"] = context.selector.xpath("normalize-space(//form//strong[1])").get()  # <-- type error remains

This seems correct though - user_data["item"] can be pretty much anything, and you're accessing it in a different function than the one you set it in, so there's really no way for the type checker to deduce the type of that item.
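
One way to give the type checker that missing information is to narrow the value with a runtime check. The following is a minimal sketch, assuming a simplified `JsonValue` alias modeled on the union from the Mypy output; `add_result` is a hypothetical helper, not part of Crawlee.

```python
from typing import Any, Dict, List, Optional, Union

# Hypothetical alias modeled on the union from the Mypy output above.
JsonValue = Union[List[Any], Dict[str, Any], str, bool, int, float, None]


def add_result(item: JsonValue, result: Optional[str]) -> Dict[str, Any]:
    # isinstance() narrows the union: past this check, type checkers
    # treat `item` as a dict, so indexed assignment is accepted.
    if not isinstance(item, dict):
        raise TypeError(f"expected a dict, got {type(item).__name__}")
    item["results"] = result
    return item
```

Unlike a cast, this fails loudly at the point of access if the stored value is not what the handler expects.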

When I try to specify a type for the item in

item: dict[str, str | None] = context.request.user_data["item"]

This way, you say that you expect item to have the type dict[str, str | None], but the type checker still makes sure that what you're assigning conforms to that. To convince it, you need to do this:

from typing import cast
# ...
item = cast(dict[str, str | None], context.request.user_data["item"])

This way, you tell the type checker that you know what the type is and take full responsibility if you're wrong 🙂
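
To make the "full responsibility" part concrete: `typing.cast` does nothing at runtime, it just returns its argument unchanged. A small sketch with an illustrative `stored` value (not taken from the PR) that deliberately has the wrong type:

```python
from typing import Any, Dict, Optional, cast

# Illustrative value that is NOT a dict, to show what happens when the
# cast is wrong.
stored: Any = ["not", "a", "dict"]

# cast() performs no check and no conversion; it returns the same object.
item = cast(Dict[str, Optional[str]], stored)
assert item is stored

error = None
try:
    # The type checker is satisfied, but the list rejects a str key.
    item["results"] = "value"
except TypeError as exc:
    error = exc
```

So the cast silences the checker, and a mismatch only surfaces later as a runtime `TypeError`.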

I tried this locally with the relevant part of your code snippet and it seems to work. Did I miss anything?

@tlinhart

tlinhart commented Oct 8, 2024

Did I miss anything?

No, I think you are perfectly correct ;-) Thanks!

Collaborator

@vdusek vdusek left a comment

Good job, thanks

@janbuchar janbuchar merged commit 403496a into master Oct 8, 2024
20 checks passed
@janbuchar janbuchar deleted the workaround-json-value-type-error branch October 8, 2024 11:08
deshansh pushed a commit to deshansh/crawlee-python that referenced this pull request Oct 19, 2024
- closes apify#563


Development

Successfully merging this pull request may close these issues.

Correct / recommended way of using user_data

4 participants