perf: optimize binary size and performance with release profile #88

Merged
behrangsa merged 4 commits into master from perf/optimize-binary-size-and-performance
Aug 4, 2025

@behrangsa behrangsa commented Aug 4, 2025

Summary

Optimize binary size and performance by adding comprehensive release profile settings to Cargo.toml. This implementation prioritizes maximum size reduction while maintaining functionality and library compatibility.

  • Add release profile with size-focused optimization settings
  • Enable Link-Time Optimization (LTO) for better performance and size reduction
  • Configure single codegen unit for optimal optimization
  • Enable automatic symbol stripping
  • Use opt-level="z" for maximum size reduction
  • Optimize all dependencies for size efficiency
  • Maintain panic unwinding for better library compatibility
  • Verify all 258 tests continue to pass

Binary Size Improvements

Binary        Before  After  Reduction
samoyed       1.7M    921K   46%
samoyed-hook  479K    345K   28%

Total size reduction: about 43% across both binaries combined (the mean of the two per-binary reductions is lower, about 37%)
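For reference, the percentages can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes the rounded 1.7M figure means 1.7 * 1024 KiB, so the results are approximate and may differ by about a point from the table.

```rust
/// Percentage by which `after` is smaller than `before` (same unit for both).
fn reduction_percent(before: f64, after: f64) -> f64 {
    (1.0 - after / before) * 100.0
}

fn main() {
    // Sizes in KiB taken from the table; 1.7M is assumed to be 1.7 * 1024 KiB.
    let samoyed = (1.7 * 1024.0, 921.0);
    let hook = (479.0, 345.0);

    println!("samoyed:      {:.1}%", reduction_percent(samoyed.0, samoyed.1));
    println!("samoyed-hook: {:.1}%", reduction_percent(hook.0, hook.1));
    // Combined reduction: sum the sizes first, then compare totals.
    println!(
        "combined:     {:.1}%",
        reduction_percent(samoyed.0 + hook.0, samoyed.1 + hook.1)
    );
}
```

The combined figure is computed from summed sizes, which is why it sits closer to the reduction of the larger binary than a plain average of the two percentages would.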

Optimization Settings

[profile.release]
opt-level = "z"      # Optimize for size over speed
lto = true           # "Fat" link-time optimization across all crates (performance + size)
codegen-units = 1    # Single codegen unit (better optimization, slower builds)
strip = true         # Strip symbols and debug info from the binary (size)

[profile.release.package."*"]
opt-level = "z"      # Optimize dependencies for size over speed

Design Decisions

  • opt-level="z": Maximum size optimization for both main crate and dependencies
  • Removed panic = "abort": Maintains panic unwinding for better library compatibility and resource cleanup
  • LTO Enabled: Whole-program optimization that eliminates duplicate code and unused functions
  • Single Codegen Unit: Allows for better cross-function optimization and smaller binaries
  • Uniform Size Focus: Both main crate and dependencies prioritize size over speed
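To illustrate the resource-cleanup point behind keeping unwinding, here is a minimal sketch. The `Guard` type is hypothetical (it stands in for something like a temp file or lock) and is not taken from the samoyed codebase. With the default unwinding strategy, `Drop` runs and the panic can be caught; with panic = "abort" the process would terminate before either happens.

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Set by `Guard::drop` so we can observe that cleanup ran.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

/// Hypothetical resource guard, e.g. a temp file or lock.
struct Guard;

impl Drop for Guard {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

/// Runs a closure that panics while holding a `Guard`.
/// Returns true if the caller was able to catch the panic.
fn run_and_catch() -> bool {
    let result = panic::catch_unwind(|| {
        let _guard = Guard;
        panic!("simulated failure");
    });
    result.is_err()
}

fn main() {
    let caught = run_and_catch();
    println!(
        "panic caught: {caught}, cleanup ran: {}",
        CLEANED_UP.load(Ordering::SeqCst)
    );
}
```

The trade-off is that the unwinding machinery adds some code to the binary, so keeping it gives up part of the possible size savings in exchange for this behavior.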

What is LTO? (ELI5)

Link-Time Optimization is like organizing your entire house after moving in:

  • Without LTO: Each room is organized separately, leading to duplicate items and wasted space
  • With LTO: You organize the whole house at once, removing duplicates and optimizing layout

In code terms: LTO looks at your entire program after compilation and removes unused functions, eliminates duplicate code, and makes cross-function optimizations that weren't possible when compiling files separately.

Test Plan

  • All 258 unit tests pass
  • All integration tests pass
  • Benchmarks compile and run successfully
  • Doc tests continue to work
  • Cross-platform tests pass (Linux, macOS, Windows)
  • Pre-commit hooks execute successfully

Performance Impact

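  • Binary Size: about 43% combined reduction (921K + 345K vs 1.7M + 479K)
  • Runtime Performance: Likely slightly slower than opt-level=3, though LTO and a single codegen unit still enable cross-crate optimizations
  • Build Time: Increased due to LTO and the single codegen unit (acceptable trade-off for release builds)
  • Memory Usage: Smaller binary footprint; the CI memory measurements in this thread (4152 to 4268 KB for samoyed init, 1816 to 1916 KB for samoyed-hook) stay roughly flat across commits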
  • Distribution: Faster downloads and reduced storage requirements

Add comprehensive release profile optimization settings:
- opt-level=3: Maximum optimization for performance
- lto=true: Link-time optimization for both performance and size
- codegen-units=1: Single codegen unit for better optimization
- strip=true: Remove debug symbols to reduce binary size
- Dependencies optimized with opt-level="s" for size efficiency

Results:
- samoyed: 1.7M → 1013K (40% size reduction)
- samoyed-hook: 479K → 365K (24% size reduction)
- All 258 tests continue to pass
- Maintains panic unwinding for better library compatibility
Copilot AI review requested due to automatic review settings August 4, 2025 17:14

Copilot AI left a comment

Pull Request Overview

This PR adds comprehensive release profile optimization settings to Cargo.toml to improve binary size and runtime performance. The implementation uses a balanced approach with maximum optimization for the main binary and size optimization for dependencies.

  • Added release profile with LTO, single codegen unit, and symbol stripping
  • Configured dependencies to optimize for size while main binary optimizes for performance
  • Updated pre-push hook to include clippy checks for better code quality

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File          Description
Cargo.toml    Added release profile optimization settings with LTO and size/performance balance
samoyed.toml  Enhanced pre-push hook to include clippy checks alongside existing tests

@github-actions

github-actions bot commented Aug 4, 2025

🔒 Security Audit Report

Error parsing audit report

Could not parse security audit results. Check the logs for details.


Security audit performed by cargo-audit

@codecov

codecov bot commented Aug 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

@github-actions

github-actions bot commented Aug 4, 2025

📊 Performance Test Report

Test Environment: Ubuntu Latest (GitHub Actions)
Commit: c8a9ed4
Branch: 88/merge
Triggered by: pull_request

📏 Binary Size Analysis (AC8.2)

Binary        Size           Status
samoyed       1032864 bytes
samoyed-hook  373408 bytes
Total         1406272 bytes  < 10MB

🧠 Memory Usage Analysis (AC8.3)

Component     Memory Usage  Status
samoyed init  4164 KB
samoyed-hook  1816 KB
Limit         50 MB         All under limit

⚡ Performance Benchmarks

Metric                   Value    Target     Status
Hook Execution Overhead  null ms  < 50ms
Startup Time             TBD      < 100ms
File Operations          TBD      Efficient

📈 Performance Summary

  • AC8.1: Hook execution overhead < 50ms
  • AC8.2: Binary size < 10MB
  • AC8.3: Memory usage < 50MB
  • AC8.4: Startup time < 100ms
  • AC8.5: Efficient file system operations

Full benchmark results available in workflow artifacts.
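The AC8.2 size check in the report amounts to summing the binary sizes and comparing against the budget. A minimal sketch of that check follows; the function name is hypothetical, the 10MB limit is assumed to mean 10 * 1024 * 1024 bytes, and this is not the workflow's actual script.

```rust
/// Size budget from AC8.2, assumed here to be 10 MiB expressed in bytes.
const BUDGET_BYTES: u64 = 10 * 1024 * 1024;

/// Returns the combined size and whether it is under the budget.
fn check_budget(sizes: &[u64]) -> (u64, bool) {
    let total: u64 = sizes.iter().sum();
    (total, total < BUDGET_BYTES)
}

fn main() {
    // Byte counts copied from the report for samoyed and samoyed-hook.
    let (total, ok) = check_budget(&[1_032_864, 373_408]);
    println!("total = {total} bytes, under budget: {ok}");
}
```

In a real check the sizes would more likely come from `std::fs::metadata(path)?.len()` on the built binaries rather than from hard-coded values.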

Switch from opt-level=3 to opt-level="z" for maximum size reduction:
- Prioritizes size over speed for both main crate and dependencies
- More aggressive size optimizations compared to previous settings
- Maintains LTO, codegen-units=1, and strip=true
- All tests continue to pass

Binary size comparison:
- Previous: samoyed 1013K, samoyed-hook 365K
- Expected: Further size reduction with minimal performance impact
@github-actions

github-actions bot commented Aug 4, 2025

📊 Performance Test Report

Test Environment: Ubuntu Latest (GitHub Actions)
Commit: da292e9
Branch: 88/merge
Triggered by: pull_request

📏 Binary Size Analysis (AC8.2)

Binary        Size           Status
samoyed       938656 bytes
samoyed-hook  352928 bytes
Total         1291584 bytes  < 10MB

🧠 Memory Usage Analysis (AC8.3)

Component     Memory Usage  Status
samoyed init  4208 KB
samoyed-hook  1916 KB
Limit         50 MB         All under limit

Update version from 0.1.9 to 0.1.10 with optimized binary sizes:
- samoyed: 921K (46% reduction from original 1.7M)
- samoyed-hook: 345K (28% reduction from original 479K)
- All tests continue to pass with size-optimized build profile
@github-actions

github-actions bot commented Aug 4, 2025

📊 Performance Test Report

Test Environment: Ubuntu Latest (GitHub Actions)
Commit: 1c2ed80
Branch: 88/merge
Triggered by: pull_request

📏 Binary Size Analysis (AC8.2)

Binary        Size           Status
samoyed       938656 bytes
samoyed-hook  352928 bytes
Total         1291584 bytes  < 10MB

🧠 Memory Usage Analysis (AC8.3)

Component     Memory Usage  Status
samoyed init  4152 KB
samoyed-hook  1864 KB
Limit         50 MB         All under limit

Fix contributor mapping in changelog generation to use correct GitHub usernames:
- Map emails ([email protected], [email protected]) to @behrangsa
- Avoid incorrect @mentions for unknown contributors
- Prevents generating @behrang Saeedzadeh which mentions wrong user
- Ensures proper attribution to @behrangsa in release notes
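The mapping rule described in this commit can be sketched as a lookup with a plain-name fallback. This is an illustration only: the function name and the example.com addresses are placeholders, and the project's changelog tooling may implement the rule quite differently.

```rust
use std::collections::HashMap;

/// Formats a contributor for release notes.
/// Known emails become an @mention; unknown ones fall back to the plain name,
/// so no @mention is generated for a possibly unrelated GitHub user.
fn attribution(map: &HashMap<&str, &str>, email: &str, name: &str) -> String {
    match map.get(email) {
        Some(handle) => format!("@{handle}"),
        None => name.to_string(),
    }
}

fn main() {
    // Placeholder addresses; the real ones are not shown in this thread.
    let mut map = HashMap::new();
    map.insert("primary@example.com", "behrangsa");
    map.insert("secondary@example.com", "behrangsa");

    println!("{}", attribution(&map, "primary@example.com", "Behrang Saeedzadeh"));
    println!("{}", attribution(&map, "unknown@example.com", "Jane Doe"));
}
```

Falling back to the name without an "@" prefix is what avoids the accidental mention: the text still credits the author but does not resolve to any account.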
@github-actions

github-actions bot commented Aug 4, 2025

📊 Performance Test Report

Test Environment: Ubuntu Latest (GitHub Actions)
Commit: 25b6a25
Branch: 88/merge
Triggered by: pull_request

📏 Binary Size Analysis (AC8.2)

Binary        Size           Status
samoyed       938656 bytes
samoyed-hook  352928 bytes
Total         1291584 bytes  < 10MB

🧠 Memory Usage Analysis (AC8.3)

Component     Memory Usage  Status
samoyed init  4268 KB
samoyed-hook  1912 KB
Limit         50 MB         All under limit

@behrangsa behrangsa merged commit f51fbd6 into master Aug 4, 2025
16 checks passed
@behrangsa behrangsa deleted the perf/optimize-binary-size-and-performance branch August 4, 2025 17:47
2 participants