Skip to content

perf(lint): Rewrite check-line-endings in Python for ~37x speedup#8399

Merged
yurishkuro merged 3 commits into
jaegertracing:mainfrom
yurishkuro:line-ending-linter-to-python
Apr 16, 2026
Merged

perf(lint): Rewrite check-line-endings in Python for ~37x speedup#8399
yurishkuro merged 3 commits into
jaegertracing:mainfrom
yurishkuro:line-ending-linter-to-python

Conversation

@yurishkuro
Copy link
Copy Markdown
Member

Replace the sequential bash script (11.2s) with a parallel Python implementation (0.3s). Each file is read once in Python with no subprocesses spawned per file — all checks (binary detection, CRLF, trailing whitespace, missing newline) run in a single pass. Files are processed concurrently via ThreadPoolExecutor. Empty files are intentionally skipped (same as the bash script).

AI Usage in this PR (choose one)

See AI Usage Policy.

  • None: No AI tools were used in creating this PR
  • Light: AI provided minor assistance (formatting, simple suggestions)
  • Moderate: AI helped with code generation or debugging specific parts
  • Heavy: AI generated most or all of the code changes

Replace the sequential bash script (11.2s) with a parallel Python
implementation (0.3s). Each file is read once in Python with no
subprocesses spawned per file — all checks (binary detection, CRLF,
trailing whitespace, missing newline) run in a single pass.
Files are processed concurrently via ThreadPoolExecutor.
Empty files are intentionally skipped (same as the bash script).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
@yurishkuro yurishkuro added the changelog:ci Change related to continuous integration / testing label Apr 16, 2026
@yurishkuro yurishkuro requested a review from a team as a code owner April 16, 2026 19:44
Copilot AI review requested due to automatic review settings April 16, 2026 19:44
@dosubot dosubot Bot added the performance label Apr 16, 2026
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces the existing shell-based line-ending linter/fixer with a parallel Python implementation, and wires the new script into the existing make fmt / make lint workflow to significantly reduce runtime.

Changes:

  • Remove scripts/lint/check-line-endings.sh and add scripts/lint/check-line-endings.py with a threaded, single-pass checker/fixer.
  • Update Makefile targets to call the new Python implementation for both lint and fix modes.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
scripts/lint/check-line-endings.sh Removes the previous bash implementation of the line-endings linter/fixer.
scripts/lint/check-line-endings.py Adds a parallel Python implementation to detect/fix CRLF, trailing whitespace, and missing final newline.
Makefile Switches fmt and lint-line-endings targets to use the Python script.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread scripts/lint/check-line-endings.py Outdated
Comment thread scripts/lint/check-line-endings.py
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.30%. Comparing base (639134a) to head (a63fa7d).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8399      +/-   ##
==========================================
+ Coverage   96.27%   96.30%   +0.02%     
==========================================
  Files         320      320              
  Lines       16924    16924              
==========================================
+ Hits        16294    16298       +4     
+ Misses        472      469       -3     
+ Partials      158      157       -1     
Flag Coverage Δ
badger_direct 9.16% <ø> (ø)
badger_e2e 1.06% <ø> (ø)
cassandra-4.x-direct-manual 13.36% <ø> (ø)
cassandra-4.x-e2e-auto 1.05% <ø> (ø)
cassandra-4.x-e2e-manual 1.05% <ø> (ø)
cassandra-5.x-direct-manual 13.36% <ø> (ø)
cassandra-5.x-e2e-auto 1.05% <ø> (ø)
cassandra-5.x-e2e-manual 1.05% <ø> (ø)
clickhouse-direct 9.12% <ø> (ø)
clickhouse-e2e 1.18% <ø> (ø)
elasticsearch-6.x-direct 17.21% <ø> (ø)
elasticsearch-7.x-direct 17.24% <ø> (ø)
elasticsearch-8.x-direct 17.39% <ø> (ø)
elasticsearch-8.x-e2e 1.06% <ø> (ø)
elasticsearch-9.x-e2e 1.06% <ø> (ø)
grpc_direct 8.04% <ø> (ø)
grpc_e2e 1.06% <ø> (ø)
kafka-3.x-v2 1.06% <ø> (ø)
memory_v2 1.06% <ø> (ø)
opensearch-1.x-direct 17.29% <ø> (ø)
opensearch-2.x-direct 17.29% <ø> (ø)
opensearch-2.x-e2e 1.06% <ø> (ø)
opensearch-3.x-e2e 1.06% <ø> (ø)
query 1.06% <ø> (ø)
tailsampling-processor 0.53% <ø> (ø)
unittests 94.54% <ø> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Add error handling to get_tracked_files(): raise SystemExit with a
clear message if 'git' is not on PATH or 'git ls-files' fails, instead
of silently succeeding with an empty file list.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Without check=True a failing git invocation would silently return
empty stdout, causing the linter to succeed without checking any files.
Uncaught exceptions from subprocess are sufficient for all other error
paths (git not on PATH, etc.).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Yuri Shkuro <github@ysh.us>
Copilot AI review requested due to automatic review settings April 16, 2026 19:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread scripts/lint/check-line-endings.py
Comment on lines +38 to +41
try:
content = path.read_bytes()
except OSError:
return []
Copy link

Copilot AI Apr 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

process_file() silently skips files it cannot read (except OSError: return []). This can cause the lint to incorrectly succeed while ignoring a tracked file (e.g., permission issue, transient IO error), whereas the previous shell implementation would fail fast due to set -e. Consider treating read failures as lint errors (or at least emitting a warning and failing) so issues don’t get masked.

Copilot uses AI. Check for mistakes.
Comment thread scripts/lint/check-line-endings.py
@yurishkuro yurishkuro merged commit d639945 into jaegertracing:main Apr 16, 2026
78 checks passed
@yurishkuro yurishkuro deleted the line-ending-linter-to-python branch April 16, 2026 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog:ci Change related to continuous integration / testing performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants