|
| 1 | +# Design Document |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This design outlines the implementation of a new Cargo lint called `invalid_license_expression` that validates `package.license` fields against SPDX license expression standards. The lint will be integrated into Cargo's existing linting infrastructure and will help package authors ensure their license expressions are properly formatted and parseable by tools that consume SPDX data. |
| 6 | + |
| 7 | +## Architecture |
| 8 | + |
| 9 | +The lint will be implemented as part of Cargo's existing lint system in `src/cargo/util/lints.rs`. It will follow the same patterns as other Cargo lints like `unknown_lints` and `im_a_teapot`, integrating seamlessly with the current lint infrastructure. |
| 10 | + |
| 11 | +### Key Components |
| 12 | + |
| 13 | +1. **Lint Definition**: A new `Lint` constant that defines the lint's metadata, default level, and edition-specific behavior |
| 14 | +2. **Validation Function**: A function that checks license expressions using the SPDX crate |
| 15 | +3. **Error Reporting**: Integration with Cargo's diagnostic system to provide helpful error messages |
| 16 | +4. **Edition-based Defaults**: Different default levels for different Cargo editions |
| 17 | + |
| 18 | +## Components and Interfaces |
| 19 | + |
| 20 | +### Lint Definition |
| 21 | + |
| 22 | +```rust |
| 23 | +const INVALID_LICENSE_EXPRESSION: Lint = Lint { |
| 24 | + name: "invalid_license_expression", |
| 25 | + desc: "invalid SPDX license expression", |
| 26 | + groups: &[], |
| 27 | + default_level: LintLevel::Warn, |
| 28 | + edition_lint_opts: Some((Edition::Edition2027, LintLevel::Deny)), |
| 29 | + feature_gate: None, |
| 30 | + docs: Some(/* documentation */), |
| 31 | +}; |
| 32 | +``` |
| 33 | + |
| 34 | +### Validation Function |
| 35 | + |
| 36 | +The main validation function will: |
| 37 | +1. Extract the license field from the package manifest |
| 38 | +2. Use the `spdx` crate to parse and validate the expression |
| 39 | +3. Generate appropriate diagnostics for invalid expressions |
| 40 | +4. Provide helpful suggestions for common mistakes |
| 41 | + |
| 42 | +```rust |
| 43 | +pub fn check_invalid_license_expression( |
| 44 | + pkg: &Package, |
| 45 | + path: &Path, |
| 46 | + pkg_lints: &TomlToolLints, |
| 47 | + error_count: &mut usize, |
| 48 | + gctx: &GlobalContext, |
| 49 | +) -> CargoResult<()> |
| 50 | +``` |
| 51 | + |
| 52 | +### SPDX Integration |
| 53 | + |
| 54 | +The lint will use the `spdx` crate for validation. Based on research, we'll use a recent version that aligns with current SPDX standards: |
| 55 | +- SPDX specification version 2.3 (as referenced in Cargo documentation) |
| 56 | +- Compatible with crates.io's validation requirements |
| 57 | + |
| 58 | +## Data Models |
| 59 | + |
| 60 | +### License Expression Validation |
| 61 | + |
| 62 | +The validation will handle: |
| 63 | +- **Valid expressions**: `MIT`, `Apache-2.0`, `MIT OR Apache-2.0`, `GPL-3.0-or-later WITH Classpath-exception-2.0` |
| 64 | +- **Invalid expressions**: `MIT / Apache-2.0`, `MIT and Apache-2.0`, malformed expressions |
| 65 | +- **Edge cases**: Empty strings, whitespace-only strings, case sensitivity |
| 66 | + |
| 67 | +### Error Context |
| 68 | + |
| 69 | +Error messages will include: |
| 70 | +- The invalid expression |
| 71 | +- Specific location in the manifest file |
| 72 | +- Suggestions for common fixes (e.g., replacing `/` with `OR`) |
| 73 | +- Reference to SPDX documentation |
| 74 | + |
| 75 | +## Error Handling |
| 76 | + |
| 77 | +### Diagnostic Generation |
| 78 | + |
| 79 | +The lint will generate structured diagnostics using Cargo's `annotate-snippets` system: |
| 80 | + |
| 81 | +```rust |
| 82 | +let message = level |
| 83 | + .title("invalid SPDX license expression") |
| 84 | + .snippet( |
| 85 | + Snippet::source(manifest.contents()) |
| 86 | + .origin(&manifest_path) |
| 87 | + .annotation(level.span(license_span)) |
| 88 | + .fold(true), |
| 89 | + ) |
| 90 | + .footer(Level::Help.title("use valid SPDX operators like 'OR' instead of '/'")); |
| 91 | +``` |
| 92 | + |
| 93 | +### Common Error Patterns |
| 94 | + |
| 95 | +1. **Slash operator**: `MIT / Apache-2.0` → suggest `MIT OR Apache-2.0` |
| 96 | +2. **Lowercase operators**: `mit and apache-2.0` → suggest `MIT AND Apache-2.0` |
| 97 | +3. **Invalid license identifiers**: `Apache2` → suggest `Apache-2.0` |
| 98 | +4. **Malformed expressions**: Missing parentheses, invalid syntax |
| 99 | + |
| 100 | +## Testing Strategy |
| 101 | + |
| 102 | +### Unit Tests |
| 103 | + |
| 104 | +Unit tests will be located in `src/cargo/util/lints.rs` and will focus on testing the core validation logic: |
| 105 | + |
| 106 | +1. **SPDX parsing**: Test the `spdx` crate integration directly |
| 107 | +2. **Lint level calculation**: Test edition-based and configuration-based lint levels |
| 108 | +3. **Error message generation**: Test diagnostic message formatting |
| 109 | +4. **Edge cases**: Empty strings, whitespace, malformed input |
| 110 | + |
| 111 | +### Integration Tests |
| 112 | + |
| 113 | +Integration tests will follow Cargo's established patterns and be located in `tests/testsuite/lints.rs` alongside existing lint tests. These will use functional tests with the `project()` builder pattern. |
| 114 | + |
| 115 | +#### Functional Tests |
| 116 | + |
| 117 | +All integration tests will be functional tests using programmatically defined fixtures and in-source snapshots: |
| 118 | + |
| 119 | +```rust |
| 120 | +use crate::prelude::*; |
| 121 | +use cargo_test_support::str; |
| 122 | +use cargo_test_support::project; |
| 123 | + |
| 124 | +#[cargo_test] |
| 125 | +fn invalid_license_expression_warns_by_default() { |
| 126 | + let p = project() |
| 127 | + .file("Cargo.toml", r#" |
| 128 | + [package] |
| 129 | + name = "foo" |
| 130 | + version = "0.1.0" |
| 131 | + license = "MIT / Apache-2.0" |
| 132 | + "#) |
| 133 | + .file("src/lib.rs", "") |
| 134 | + .build(); |
| 135 | + |
| 136 | + p.cargo("check") |
| 137 | + .with_stderr_data(str![[r#" |
| 138 | +[WARNING] invalid SPDX license expression: `MIT / Apache-2.0` |
| 139 | + --> Cargo.toml:4:23 |
| 140 | + | |
| 141 | + 4 | license = "MIT / Apache-2.0" |
| 142 | + | ^^^^^^^^^^^^^^^^^^ |
| 143 | + | |
| 144 | + = help: use valid SPDX operators like 'OR' instead of '/' |
| 145 | + = note: `cargo::invalid_license_expression` is set to `warn` by default |
| 146 | +[CHECKING] foo v0.1.0 ([ROOT]/foo) |
| 147 | +[FINISHED] [..] |
| 148 | +"#]]) |
| 149 | + .run(); |
| 150 | +} |
| 151 | + |
| 152 | +#[cargo_test] |
| 153 | +fn valid_license_expression_no_warning() { |
| 154 | + let p = project() |
| 155 | + .file("Cargo.toml", r#" |
| 156 | + [package] |
| 157 | + name = "foo" |
| 158 | + version = "0.1.0" |
| 159 | + license = "MIT OR Apache-2.0" |
| 160 | + "#) |
| 161 | + .file("src/lib.rs", "") |
| 162 | + .build(); |
| 163 | + |
| 164 | + p.cargo("check") |
| 165 | + .with_stderr_data(str![[r#" |
| 166 | +[CHECKING] foo v0.1.0 ([ROOT]/foo) |
| 167 | +[FINISHED] [..] |
| 168 | +"#]]) |
| 169 | + .run(); |
| 170 | +} |
| 171 | + |
| 172 | +#[cargo_test] |
| 173 | +fn lint_level_deny_fails_build() { |
| 174 | + let p = project() |
| 175 | + .file("Cargo.toml", r#" |
| 176 | + [package] |
| 177 | + name = "foo" |
| 178 | + version = "0.1.0" |
| 179 | + license = "MIT / Apache-2.0" |
| 180 | +
|
| 181 | + [lints.cargo] |
| 182 | + invalid_license_expression = "deny" |
| 183 | + "#) |
| 184 | + .file("src/lib.rs", "") |
| 185 | + .build(); |
| 186 | + |
| 187 | + p.cargo("check") |
| 188 | + .with_status(101) |
| 189 | + .with_stderr_data(str![[r#" |
| 190 | +[ERROR] invalid SPDX license expression: `MIT / Apache-2.0` |
| 191 | + --> Cargo.toml:4:23 |
| 192 | + | |
| 193 | + 4 | license = "MIT / Apache-2.0" |
| 194 | + | ^^^^^^^^^^^^^^^^^^ |
| 195 | + | |
| 196 | + = help: use valid SPDX operators like 'OR' instead of '/' |
| 197 | + = note: `cargo::invalid_license_expression` is set to `deny` in `[lints]` |
| 198 | +"#]]) |
| 199 | + .run(); |
| 200 | +} |
| 201 | +``` |
| 202 | + |
| 203 | +#### Test Organization |
| 204 | + |
| 205 | +Tests will be added to the existing `tests/testsuite/lints.rs` file alongside other lint tests like `unknown_lints` and `im_a_teapot`. This follows the established pattern of grouping related lint tests together. |
| 206 | + |
| 207 | +#### Test Categories |
| 208 | + |
| 209 | +1. **Basic Validation Tests** |
| 210 | + - Valid SPDX expressions (MIT, Apache-2.0, MIT OR Apache-2.0) |
| 211 | + - Invalid expressions (MIT / Apache-2.0, mit and apache) |
| 212 | + - Malformed expressions (unclosed parentheses, invalid operators) |
| 213 | + - Edge cases (empty strings, whitespace-only) |
| 214 | + |
| 215 | +2. **Edition Behavior Tests** |
| 216 | + - Edition 2024: Default warn level |
| 217 | + - Edition 2027: Default deny level |
| 218 | + - Edition transitions and compatibility |
| 219 | + |
| 220 | +3. **Configuration Tests** |
| 221 | + - Lint level overrides in `[lints.cargo]` |
| 222 | + - Workspace-level lint configuration |
| 223 | + - Package-level overrides of workspace settings |
| 224 | + |
| 225 | +4. **Workspace Tests** |
| 226 | + - Workspace root license inheritance |
| 227 | + - Member package license validation |
| 228 | + - Mixed valid/invalid licenses across workspace |
| 229 | + |
| 230 | +5. **Error Reporting Tests** |
| 231 | + - Diagnostic message format and content |
| 232 | + - Span highlighting accuracy |
| 233 | + - Helpful suggestions for common mistakes |
| 234 | + |
| 235 | +6. **Command Integration Tests** |
| 236 | + - Behavior during different Cargo commands (check, build, publish) |
| 237 | + - Interaction with other lints |
| 238 | + |
| 239 | +### Test Implementation Details |
| 240 | + |
| 241 | +#### Functional Test Pattern |
| 242 | + |
| 243 | +All tests will follow the established functional test pattern: |
| 244 | +- Use `project()` to create test fixtures programmatically |
| 245 | +- Use `str![[]]` macros for in-source snapshots |
| 246 | +- Use pattern matching with `[..]` for variable content |
| 247 | +- Self-contained test cases that can be easily shared in issues |
| 248 | + |
| 249 | +#### Cross-Platform Considerations |
| 250 | + |
| 251 | +- Use `/` for paths in expected output (automatically converted on Windows) |
| 252 | +- Use `[ROOT]` placeholder for project root paths |
| 253 | +- Pattern matching handles platform-specific variations |
| 254 | + |
| 255 | +### Performance Testing |
| 256 | + |
| 257 | +- Benchmark lint execution time with large workspaces |
| 258 | +- Test memory usage with many license expressions |
| 259 | +- Verify no significant impact on build times |
| 260 | + |
| 261 | +## Implementation Plan |
| 262 | + |
| 263 | +### Phase 1: Core Implementation |
| 264 | +1. Add SPDX crate dependency to Cargo.toml |
| 265 | +2. Define the lint constant in `lints.rs` |
| 266 | +3. Implement basic validation function |
| 267 | +4. Add lint to the `LINTS` array |
| 268 | + |
| 269 | +### Phase 2: Error Handling |
| 270 | +1. Implement comprehensive error messages |
| 271 | +2. Add suggestions for common mistakes |
| 272 | +3. Handle edge cases and malformed input |
| 273 | +4. Test error reporting |
| 274 | + |
| 275 | +### Phase 3: Integration |
| 276 | +1. Integrate with existing lint infrastructure |
| 277 | +2. Add edition-specific behavior |
| 278 | +3. Ensure proper workspace inheritance |
| 279 | +4. Add comprehensive tests |
| 280 | + |
| 281 | +### Phase 4: Documentation |
| 282 | +1. Write lint documentation with examples |
| 283 | +2. Update Cargo documentation |
| 284 | +3. Add migration guide for common issues |
| 285 | +4. Create examples of valid/invalid expressions |
| 286 | + |
| 287 | +## Dependencies |
| 288 | + |
| 289 | +### New Dependencies |
| 290 | + |
| 291 | +The implementation will require adding the `spdx` crate as a dependency. Based on research of SPDX standards and crates.io compatibility: |
| 292 | + |
| 293 | +```toml |
| 294 | +[dependencies] |
| 295 | +spdx = "0.10" # Latest stable version supporting SPDX 2.3 |
| 296 | +``` |
| 297 | + |
| 298 | +### Existing Dependencies |
| 299 | + |
| 300 | +The lint will leverage existing Cargo infrastructure: |
| 301 | +- `annotate-snippets` for error reporting |
| 302 | +- `toml` for manifest parsing |
| 303 | +- `cargo-util-schemas` for manifest structure |
| 304 | +- Existing lint framework in `src/cargo/util/lints.rs` |
| 305 | + |
| 306 | +## Compatibility |
| 307 | + |
| 308 | +### Backward Compatibility |
| 309 | + |
| 310 | +The lint is designed to be non-breaking: |
| 311 | +- Default level is `Warn` in Edition 2024, so existing builds continue to work |
| 312 | +- Only becomes `Deny` by default in future editions (2027+) |
| 313 | +- Can be disabled by setting lint level to `Allow` |
| 314 | + |
| 315 | +### Forward Compatibility |
| 316 | + |
| 317 | +- Uses stable SPDX specification (2.3) |
| 318 | +- Aligns with crates.io validation requirements |
| 319 | +- Extensible for future SPDX specification updates |
| 320 | + |
| 321 | +## Performance Considerations |
| 322 | + |
| 323 | +### Validation Performance |
| 324 | + |
| 325 | +- SPDX parsing is performed only when the lint is enabled and not set to `Allow` |
| 326 | +- Validation occurs only during manifest processing, not during builds |
| 327 | +- Caching of parsed expressions within the same Cargo invocation |
| 328 | +- Minimal impact on build times |
| 329 | + |
| 330 | +### Memory Usage |
| 331 | + |
| 332 | +- SPDX crate has minimal memory footprint |
| 333 | +- License expressions are typically short strings |
| 334 | +- No persistent state or large data structures required |
0 commit comments