add a backend for running codespell as a linter by cburroughs · Pull Request #22989 · pantsbuild/pants

cburroughs · 2026-01-08T20:07:03Z

To quote from https://github.com/codespell-project/codespell: "Fix common misspellings in text files. It's designed primarily for checking misspelled words in source code". The intent is to have few enough false positives that it can be used as a linter. When I run it on a $DAYJOB repo it picks up all sorts of embarrassments like:

lastest ==> latest, last
worfklow ==> workflow
imapct ==> impact
Nmber ==> Number

LLM Disclosure: I tried to code this as an experiment in leaning hard on Claude to understand Pants backends. The vast majority of the code was generated by Claude, which I then had iterate on it before I made minor cleanup edits. My first prompt was:

> We are going to create a new backend for https://pypi.org/project/codespell  in the Pants build system under
src/python/pants/backend/tools/backend.  Look at these SHAs for examples of adding new backends
f6e51c2873d51df2b63853a0b8db13b4e94292f3
cb63bba66817677a1dcb862c150e6fc7ca9f96dd
9465d3d75091d7ca44bbfc492c09e3c6d418a4e8

I went back and forth a bunch on partitioning strategies. It seemed to me that what users expect is to have multiple config files and have things Just Work, albeit with some uncertainty about whether the expected behavior is "use the nearest config file" or "magically merge them". So I went with per-config partitioning, but learned in the process that most backends use a single partition.
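For readers unfamiliar with the "use the nearest config file" option, here is a minimal sketch of that grouping in plain Python. It is not the code in this PR and does not use any Pants APIs; the function names and the example paths are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def nearest_config(file_path, config_paths):
    """Return the config living in the closest ancestor directory, or None."""
    # Index configs by the directory that contains them.
    # If two configs share a directory, the first one listed wins (an assumption).
    config_by_dir = {}
    for cfg in config_paths:
        config_by_dir.setdefault(str(PurePosixPath(cfg).parent), cfg)

    # Walk from the file's own directory up to the repo root (".").
    for ancestor in PurePosixPath(file_path).parents:
        found = config_by_dir.get(str(ancestor))
        if found is not None:
            return found
    return None


def partition_by_nearest_config(file_paths, config_paths):
    """Group files so that each group shares one governing config file."""
    partitions = defaultdict(list)
    for path in file_paths:
        partitions[nearest_config(path, config_paths)].append(path)
    return dict(partitions)


if __name__ == "__main__":
    files = ["README.md", "src/app/main.py", "src/lib/util.py"]
    configs = [".codespellrc", "src/lib/.codespellrc"]
    for config, members in partition_by_nearest_config(files, configs).items():
        print(config, members)
```

Each resulting group could then be linted in its own process with its own config. Files with no config in any ancestor directory land in the `None` group, which a real backend would presumably run with the tool's defaults.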

* partition_inputs is long and hard to follow...
* But! It is almost identical to the one in yamllint. I think we are lacking a good abstraction for config-based partitioning, and if we had one...
* codespell uses different flags depending on the format of the config file, which adds some more incidental conditionals.
* But on the third hand, I don't think a human would bother supporting this complicated a partitioning strategy just to check some words.
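To make the "different flags based on the format of the config file" point concrete: codespell accepts an INI-style file via `--config` and a `pyproject.toml` via `--toml`. The sketch below shows the kind of conditional this implies. The helper name `config_args` is hypothetical, and this is an assumption about the shape of the logic, not the code in this PR.

```python
from pathlib import PurePosixPath


def config_args(config_path: str) -> list[str]:
    """Build the codespell argv fragment for one config file.

    Assumes only two cases: pyproject.toml (TOML) and everything else
    (INI-style files such as .codespellrc or setup.cfg).
    """
    if PurePosixPath(config_path).name == "pyproject.toml":
        return ["--toml", config_path]
    return ["--config", config_path]


if __name__ == "__main__":
    print(config_args("src/lib/.codespellrc"))
    print(config_args("pyproject.toml"))
```

Combined with per-config partitioning, each partition would carry its own argv fragment, which is where the extra conditionals come from.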

This is a real backend that I'd like to land and use, but I'm neutral on which partitioning strategy is best to keep.

cburroughs · 2026-01-08T20:08:08Z

src/python/pants/backend/tools/codespell/rules.py

+
+
+class CodespellRequest(LintFilesRequest):
+    tool_subsystem = Codespell  # type: ignore[assignment]


We do this in a bunch of places:

src/python/pants/backend/adhoc/code_quality_tool.py:class CodeQualityProcessingRequest(LintFilesRequest):
src/python/pants/backend/adhoc/code_quality_tool.py-    tool_subsystem = CodeQualityToolInstance  # type: ignore[assignment]
--
src/python/pants/backend/project_info/regex_lint.py:class RegexLintRequest(LintFilesRequest):
src/python/pants/backend/project_info/regex_lint.py-    tool_subsystem = RegexLintSubsystem  # type: ignore[assignment]
--
src/python/pants/backend/tools/codespell/rules.py:class CodespellRequest(LintFilesRequest):
src/python/pants/backend/tools/codespell/rules.py-    tool_subsystem = Codespell  # type: ignore[assignment]
--
src/python/pants/backend/tools/trufflehog/rules.py:class TrufflehogRequest(LintFilesRequest):
src/python/pants/backend/tools/trufflehog/rules.py-    tool_subsystem = Trufflehog  # type: ignore[assignment]
--
src/python/pants/backend/tools/yamllint/rules.py:class YamllintRequest(LintFilesRequest):
src/python/pants/backend/tools/yamllint/rules.py-    tool_subsystem = Yamllint  # type: ignore[assignment]

cburroughs · 2026-01-08T20:14:42Z

What this currently looks like in this repo: https://gist.github.com/cburroughs/30e3752899a39b769abe44b5435092be

cburroughs · 2026-01-10T02:08:59Z

#21937 has an example of sophisticated partitioning

cburroughs · 2026-01-29T16:24:37Z

Thanks for the contribution. We've just branched for 2.31.x, so merging this pull request now will come out in 2.32.x. Please move the release notes updates to docs/notes/2.32.x.md if that's appropriate.


cburroughs self-assigned this Jan 8, 2026


cburroughs marked this pull request as ready for review January 8, 2026 20:58

cburroughs changed the title ~~add a backend for running codespell a linter~~ add a backend for running codespell as a linter Jan 21, 2026

cburroughs added 4 commits January 30, 2026 18:18

empty

7414c5f

really empty

719faf7

ver move

f7c9c5e

cburroughs force-pushed the csb/codespell branch from b8f8cf8 to f7c9c5e Compare January 31, 2026 00:20

cburroughs wants to merge 4 commits into pantsbuild:main from cburroughs:csb/codespell
