perf(lint): Rewrite check-line-endings in Python for ~37x speedup #8399
Replace the sequential bash script (11.2s) with a parallel Python implementation (0.3s). Each file is read once in Python with no subprocesses spawned per file; all checks (binary detection, CRLF, trailing whitespace, missing newline) run in a single pass. Files are processed concurrently via ThreadPoolExecutor. Empty files are intentionally skipped (same as the bash script).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
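The single-pass, thread-pooled design described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script from this PR: the names `check_bytes`, `check_file`, and `check_all` are hypothetical, and treating a NUL byte as the marker of a binary file is an assumed heuristic.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def check_bytes(content: bytes) -> list[str]:
    """Run every check against one in-memory copy of the file."""
    if not content:
        return []  # empty files are skipped
    if b"\0" in content:
        return []  # assumed heuristic: a NUL byte means the file is binary
    problems = []
    if b"\r\n" in content:
        problems.append("CRLF line endings")
    # Normalize CRLF first so a CRLF file is not also reported for its "\r".
    lines = content.replace(b"\r\n", b"\n").split(b"\n")
    if any(line != line.rstrip(b" \t") for line in lines):
        problems.append("trailing whitespace")
    if not content.endswith(b"\n"):
        problems.append("missing final newline")
    return problems


def check_file(path: Path) -> list[str]:
    """Read the file exactly once and prefix each problem with its path."""
    return [f"{path}: {problem}" for problem in check_bytes(path.read_bytes())]


def check_all(paths: list[Path]) -> list[str]:
    """Check files concurrently; map() yields results in input order."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(check_file, paths)
        return [msg for file_msgs in results for msg in file_msgs]
```

Threads are a reasonable fit here because the work is dominated by file reads, which release the GIL, and no per-file subprocess is needed.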
Pull request overview
This PR replaces the existing shell-based line-ending linter/fixer with a parallel Python implementation, and wires the new script into the existing make fmt / make lint workflow to significantly reduce runtime.
Changes:
- Remove scripts/lint/check-line-endings.sh and add scripts/lint/check-line-endings.py with a threaded, single-pass checker/fixer.
- Update Makefile targets to call the new Python implementation for both lint and fix modes.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| scripts/lint/check-line-endings.sh | Removes the previous bash implementation of the line-endings linter/fixer. |
| scripts/lint/check-line-endings.py | Adds a parallel Python implementation to detect/fix CRLF, trailing whitespace, and missing final newline. |
| Makefile | Switches fmt and lint-line-endings targets to use the Python script. |
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #8399 +/- ##
==========================================
+ Coverage 96.27% 96.30% +0.02%
==========================================
Files 320 320
Lines 16924 16924
==========================================
+ Hits 16294 16298 +4
+ Misses 472 469 -3
+ Partials 158 157 -1
Add error handling to get_tracked_files(): raise SystemExit with a clear message if 'git' is not on PATH or 'git ls-files' fails, instead of silently succeeding with an empty file list.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Without check=True a failing git invocation would silently return empty stdout, causing the linter to succeed without checking any files. Uncaught exceptions from subprocess are sufficient for all other error paths (git not on PATH, etc.).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
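The effect of check=True can be illustrated with a small sketch. It is not the PR's get_tracked_files(): the function name `list_tracked_files`, the `cmd` parameter, and the use of the `-z` flag (NUL-separated output) are assumptions made for the example.

```python
import subprocess


def list_tracked_files(cmd=("git", "ls-files", "-z")) -> list[str]:
    """Return the paths printed by cmd, failing loudly if cmd fails."""
    # check=True turns a non-zero exit status into CalledProcessError.
    # Without it, a failed git call would yield empty stdout and the
    # linter would "pass" without having looked at a single file.
    result = subprocess.run(list(cmd), capture_output=True, check=True)
    # With -z, paths are separated by NUL bytes; drop the empty tail entry.
    return [p.decode() for p in result.stdout.split(b"\0") if p]
```

If the executable is missing altogether, subprocess.run raises FileNotFoundError on its own, so that path needs no extra handling.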
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
    try:
        content = path.read_bytes()
    except OSError:
        return []
process_file() silently skips files it cannot read (except OSError: return []). This can cause the lint to incorrectly succeed while ignoring a tracked file (e.g., permission issue, transient IO error), whereas the previous shell implementation would fail fast due to set -e. Consider treating read failures as lint errors (or at least emitting a warning and failing) so issues don’t get masked.
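One way to act on this suggestion is to report the read failure as a lint message, so the run ends non-zero like any other violation. The sketch below is an assumption about how that could look, not the change that was merged: the message format is invented, and the newline check is a stand-in for the script's real checks.

```python
from pathlib import Path


def process_file(path: Path) -> list[str]:
    """Return lint messages for one file; an unreadable file is itself an error."""
    try:
        content = path.read_bytes()
    except OSError as exc:
        # Assumption: the caller exits non-zero whenever any message is
        # returned, so the unreadable file is no longer skipped silently.
        return [f"{path}: cannot read file: {exc}"]
    if not content:
        return []  # empty files are skipped
    # Stand-in for the real checks, which this sketch does not reproduce.
    if not content.endswith(b"\n"):
        return [f"{path}: missing final newline"]
    return []
```

Returning a message, rather than raising, keeps the other worker threads running and lets all failures be reported together.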