add a backend for running codespell as a linter #22989
Open
cburroughs wants to merge 4 commits into pantsbuild:main from
cburroughs commented Jan 8, 2026
class CodespellRequest(LintFilesRequest):
    tool_subsystem = Codespell  # type: ignore[assignment]
We do this in a bunch of places:
src/python/pants/backend/adhoc/code_quality_tool.py: class CodeQualityProcessingRequest(LintFilesRequest):
src/python/pants/backend/adhoc/code_quality_tool.py- tool_subsystem = CodeQualityToolInstance # type: ignore[assignment]
--
src/python/pants/backend/project_info/regex_lint.py:class RegexLintRequest(LintFilesRequest):
src/python/pants/backend/project_info/regex_lint.py- tool_subsystem = RegexLintSubsystem # type: ignore[assignment]
--
src/python/pants/backend/tools/codespell/rules.py:class CodespellRequest(LintFilesRequest):
src/python/pants/backend/tools/codespell/rules.py- tool_subsystem = Codespell # type: ignore[assignment]
--
src/python/pants/backend/tools/trufflehog/rules.py:class TrufflehogRequest(LintFilesRequest):
src/python/pants/backend/tools/trufflehog/rules.py- tool_subsystem = Trufflehog # type: ignore[assignment]
--
src/python/pants/backend/tools/yamllint/rules.py:class YamllintRequest(LintFilesRequest):
src/python/pants/backend/tools/yamllint/rules.py- tool_subsystem = Yamllint # type: ignore[assignment]
What this currently looks like in this repo: https://gist.github.com/cburroughs/30e3752899a39b769abe44b5435092be
#21937 has an example of sophisticated partitioning.
Thanks for the contribution. We've just branched for 2.31.x, so merging this pull request now will come out in 2.32.x. Please move the release notes updates to docs/notes/2.32.x.md if that's appropriate.
To quote from <https://github.com/codespell-project/codespell>: "Fix common misspellings in text files. It's designed primarily for checking misspelled words in source code". The intent is to have few enough false positives that it could be used as a linter. When I run it at a $DAYJOB repo it picks up all sorts of embarrassments like:

```
lastest ==> latest, last
worfklow ==> workflow
imapct ==> impact
Nmber ==> Number
```

LLM Disclosure: I tried to code this as an experiment in leaning hard on Claude to understand Pants backends. The vast majority of the code was generated by Claude, which I then had iterate on it before minor cleanup edits. My first prompt was:

```
> We are going to create a new backend for https://pypi.org/project/codespell in the Pants build system under src/python/pants/backend/tools/backend. Look at these SHAs for examples of adding new backends f6e51c2 cb63bba 9465d3d
```

I went back and forth a bunch on partitioning strategies. It seemed to me that what users expect is to have multiple config files and things Just Work -- albeit with uncertainty about whether the expected behavior is "use the nearest config file" or "magically merge them". So I went with per-config partitioning, but learned in the process that most backends use a single partition.

* partition_inputs is long and hard to follow...
* But! It is almost identical to yamllint. I think we are lacking a good abstraction for config-based partitioning.
* codespell uses different flags based on the format of the config file, which adds some more incidental conditionals.
* But on the third hand, I don't think a human would bother with supporting this complicated a partitioning strategy just to check some words.

So this is real code and I'd like to land and use it, but I'm neutral on which partitioning strategy is best to keep.
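
For readers weighing the strategies, here is a rough sketch of what "use the nearest config file" partitioning could look like in plain Python. It is not the code in this PR and deliberately avoids Pants APIs; the function names (`nearest_config`, `partition_by_config`, `config_args`) are hypothetical, it assumes at most one config file per directory, and the `--toml` / `--config` split is my reading of codespell's CLI rather than something stated in this thread.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import PurePosixPath


def nearest_config(file_path: str, config_paths: set[str]) -> str | None:
    """Return the config file in the closest ancestor directory, or None."""
    # Map each config's directory to the config path (sorted for determinism).
    config_dirs = {str(PurePosixPath(c).parent): c for c in sorted(config_paths)}
    # .parents walks from the file's own directory up to the repo root (".").
    for parent in PurePosixPath(file_path).parents:
        config = config_dirs.get(str(parent))
        if config is not None:
            return config
    return None


def partition_by_config(
    files: list[str], config_paths: set[str]
) -> dict[str | None, list[str]]:
    """Group files so each partition shares exactly one governing config."""
    partitions: dict[str | None, list[str]] = defaultdict(list)
    for f in sorted(files):
        partitions[nearest_config(f, config_paths)].append(f)
    return dict(partitions)


def config_args(config: str | None) -> list[str]:
    """Pick the CLI flag by config format (assumed flag names, see lead-in)."""
    if config is None:
        return []
    if PurePosixPath(config).name == "pyproject.toml":
        return ["--toml", config]
    return ["--config", config]


if __name__ == "__main__":
    parts = partition_by_config(
        ["src/a/x.py", "src/b/y.py", "README.md"],
        {"src/a/.codespellrc", "pyproject.toml"},
    )
    for config, members in parts.items():
        print(config_args(config), members)
```

Each partition would then become one codespell invocation. The "magically merge them" alternative would instead need a merge step across every config on the ancestor path, which is where most of the extra complexity would come from.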
b8f8cf8 to f7c9c5e