VSCode pylance auto-completion for HfArgumentParser (limited support)
#27275
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
import yaml

DataClass = NewType("DataClass", Any)
Any idea why these NewType()s were used originally instead of a simple alias? Like DataClass = Any, DataclassType = Any.
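For context on the question above, here is a minimal sketch of how the two spellings differ. The names `DataClassNewType` and `DataClassAlias` are made up for illustration so both can sit side by side. At runtime both are nearly free, but a type checker treats a `NewType` as a distinct type, whereas the alias is simply `Any`. Note that type checkers generally expect the second argument of `NewType` to be a concrete class, so how they handle `Any` there may vary.

```python
from typing import Any, NewType

# Hypothetical names, used only to compare the two spellings.
DataClassNewType = NewType("DataClassNewType", Any)  # distinct type for checkers
DataClassAlias = Any                                 # plain alias: just Any

# Calling a NewType is an identity function at runtime.
value = DataClassNewType(3)

# The NewType remembers what it wraps; the alias is the same object as Any.
supertype = DataClassNewType.__supertype__
```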
      args_filename=None,
      args_file_flag=None,
-    ) -> Tuple[DataClass, ...]:
+    ) -> Tuple[T, ...]:
To me this looks the same as annotating the return type to Tuple[Any, ...], since T isn't used anywhere else.
The usual use of TypeVar is to infer an output type from either an input type (if T is used as annotation for an argument, like identity(x: T) -> T) or from a concretized version of a generic class (like __getitem__() return T for a list[T]).
I have some ideas for suggestions that I can leave in a comment!
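The two usual uses of `TypeVar` described in this comment can be sketched as follows. `Box` is a hypothetical class invented for this illustration; the inferred types in the comments are what a checker such as pylance would typically report, not something the code checks at runtime.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


def identity(x: T) -> T:
    # T is bound by the argument, so the return type follows the input type.
    return x


class Box(Generic[T]):
    # Hypothetical generic container: T is fixed when the class is concretized.
    def __init__(self, items: List[T]) -> None:
        self.items = items

    def first(self) -> T:
        return self.items[0]


n = identity(3)              # a checker would infer int
s = Box(["a", "b"]).first()  # a checker would infer str
```

A `TypeVar` that appears only in a return annotation has nothing to bind to, which is why it behaves much like `Any` there.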
Thanks so much for the comment. I think the most ideal workflow is something like below working, but not sure if it's possible 👀
class HfArgumentParser:
    def __init__(self, test: List[Type[T]]):
        self.test = test

    def parse_args_into_dataclasses(self) -> List[T]:
        return self.test

parser = HfArgumentParser([RewardConfig, Config2])
args, args2 = parser.parse_args_into_dataclasses()
The first option I listed below should do this exactly! The trick is that HfArgumentParser needs to inherit from Generic[T].
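A rough sketch of the `Generic[T]` idea, under stated assumptions: `SketchParser` is a hypothetical stand-in, not the real `HfArgumentParser`, and it instantiates each dataclass with its defaults instead of parsing the command line. `Config2` and its field are likewise invented for the example.

```python
from dataclasses import dataclass
from typing import Generic, Sequence, Tuple, Type, TypeVar

T = TypeVar("T")


@dataclass
class RewardConfig:
    weight: float = 0.01


@dataclass
class Config2:
    lr: float = 3e-4  # hypothetical field, for illustration only


class SketchParser(Generic[T]):
    """Hypothetical stand-in for a parser that is generic over its dataclass types."""

    def __init__(self, dataclass_types: Sequence[Type[T]]) -> None:
        self.dataclass_types = list(dataclass_types)

    def parse_args_into_dataclasses(self) -> Tuple[T, ...]:
        # Stand-in for real parsing: build each dataclass from its defaults.
        return tuple(cls() for cls in self.dataclass_types)


parser = SketchParser([RewardConfig, Config2])
args, args2 = parser.parse_args_into_dataclasses()

# With several dataclass types, T is likely inferred as a union, so an
# isinstance assertion is still needed to narrow each element.
assert isinstance(args, RewardConfig)
```

Because `T` is now bound by the constructor argument, the return type carries information, and the `isinstance` assertion can narrow it for auto-completion.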
|
Hello! As some alternative suggestions, (1) This implementation gets us: (2) As an FYI,
import dataclasses
import tyro

@dataclasses.dataclass
class TrainArgs:
    lr: float = 3e-4

@dataclasses.dataclass
class RewardConfig:
    weight: float = 0.01

# This currently prefixes arguments with `--0.` for TrainArgs, and `--1.` for RewardConfig. Could be made configurable.
train, reward = tyro.cli(tuple[TrainArgs, RewardConfig])
A similar API in HfArgumentParser could make the types much cleaner, but comes with all of the obvious downsides of a breaking change.
|
|
@brentyi thanks so much for the detailed suggestions! I personally like "(2) typing.overload can also help" and set a maximum to something like 10, but the code will look quite hacky... Would like some input from |
|
@vwxyzjn @brentyi Thanks for opening this PR and the work on improving the codebase! As a general note, the type annotations in the library are not intended to be complete or fully compatible with type checkers e.g. running mypy will throw a bunch of errors. They are there as general documentation to guide the user. So, in this case, |
|
Hi @amyeroberts thanks for the comment! I agree that type annotations do not necessarily need to work with mypy. The main thing I was thinking about is the developer experience with auto-completion. With @brentyi's option 1, we would still get the descriptive API like before, but by default it's unable to recognize which dataclass type it is, so for usage like
With @brentyi's option 2, our internal code would become uglier like but the user will have a more seamless experience (pylance would correctly recognize
Would you be in favor of either option? The 1st option would require the least amount of change of course. Both options are non-breaking and just empower pylance's auto-completion in different ways.
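To make the trade-off in option 2 concrete, here is a minimal sketch of the `typing.overload` pattern. It is not the code proposed in this thread: `parse_into` is a hypothetical free function, only arities 1 and 2 are spelled out, and the body instantiates defaults instead of parsing arguments. Supporting up to N dataclasses would mean writing N such overloads, which is where the maintenance cost comes from.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Type, TypeVar, overload

T1 = TypeVar("T1")
T2 = TypeVar("T2")


@dataclass
class TrainArgs:
    lr: float = 3e-4


@dataclass
class RewardConfig:
    weight: float = 0.01


# One overload per supported arity; each maps input types to output types.
@overload
def parse_into(types: Tuple[Type[T1]]) -> Tuple[T1]: ...
@overload
def parse_into(types: Tuple[Type[T1], Type[T2]]) -> Tuple[T1, T2]: ...
def parse_into(types: Tuple[Type[Any], ...]) -> Tuple[Any, ...]:
    # Hypothetical stand-in for parsing: build each dataclass from defaults.
    return tuple(cls() for cls in types)


# A checker can type each element separately, with no isinstance assertion.
train, reward = parse_into((TrainArgs, RewardConfig))
```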
|
As a user of the library I'd personally appreciate the option 2 approach, it's nice when completions work! But I also agree it adds maintenance burden. If folks would prefer to avoid
DataClass = NewType("DataClass", Any)
DataClassType = NewType("DataClassType", Any)
for
DataClass = Any
DataClassType = Any  # or type / typing.Type
This will make the assertions in the PR description useful for correct autocompletion;
|
I completely understand the motivation behind this. However, adding additional code for type checking (in particular overloads) is something which has been proposed and rejected before. The primary reason for this is that we don't formally support type checking and we don't want to add additional code we need to maintain in order to support it. For example: this comment and related PR, or this comment.
|
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |




Problem description
This PR empowers limited support of VSCode auto-completion for HfArgumentParser. Currently, the return type of parse_args_into_dataclasses is DataClass = NewType("DataClass", Any), which limits pylance's ability to infer types even if we specifically assert a dataclass type for the args.
What does this PR do?
This PR modifies the return type to be List[TypeVar("T")]. So when we assert the args to be of a certain type (e.g., assert isinstance(args, RewardConfig)), auto-completion works as expected.
Alternatives considered
Some argparse libraries such as tyro can automatically infer the type of the args, but it doesn't seem to work with the current HfArgumentParser paradigm for two reasons:
args2: RewardConfig | Config2, so we can't parse multiple dataclasses in parse_args_into_dataclasses and infer type correctly.
parse_args_into_dataclasses([RewardConfig, Config2]) instead of parse_args_into_dataclasses()
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
CC @muellerzr, @pacman100, @lewtun, @brentyi