Problem
Currently, the Scala 3 compiler generates SemanticDB using roughly 8000 lines of protobuf bindings copy-pasted from tanishiking/semanticdb-for-scala3. That repo uses ScalaPB to generate the bindings, then strips the unnecessary ScalaPB runtime code via a scalafix rule and bundles only the parts of the runtime that are still required.
This way, we can avoid pulling in external dependencies like scalapb-runtime (and transitively protobuf-java) into the compiler.
However, the scalafix rules used to strip the runtime are quite heuristic and difficult to maintain. In practice, keeping up with SemanticDB schema changes or ScalaPB updates is hard: such changes end up being manually patched into the copied code and then fed back upstream to semanticdb-for-scala3, which is an awkward workflow.
Proposal
Use sbt-protoc (or a similar tool) to generate Java or Scala protobuf bindings directly in the compiler build.
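For illustration, a minimal sketch of what the ScalaPB variant could look like with sbt-protoc. The version numbers are placeholders, and how this would be wired into the actual compiler build (which project, where the .proto file lives) is an assumption, not something decided here:

```scala
// project/plugins.sbt
// Versions below are placeholders; pick whatever is current.
addSbtPlugin("com.thesamet" % "sbt-protoc" % "1.0.7")
libraryDependencies += "com.thesamet.scalapb" %% "compilerplugin" % "0.11.17"

// build.sbt (in the project that would own the SemanticDB bindings)
// sbt-protoc reads .proto files from src/main/protobuf by default.
Compile / PB.targets := Seq(
  // Generate Scala bindings into the managed sources directory.
  scalapb.gen() -> (Compile / sourceManaged).value / "scalapb"
)

// The generated code needs scalapb-runtime on the classpath;
// this is the extra runtime dependency discussed under Cons.
libraryDependencies +=
  "com.thesamet.scalapb" %% "scalapb-runtime" % scalapb.compiler.Version.scalapbVersion
```

With something like this in place, updating the schema would amount to replacing the .proto file and recompiling, with no generated code checked into the repository.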
Pros
- Removes a lot of fragile, copy-pasted, (almost) hand-maintained generated code
- Makes schema updates routine instead of a custom multi-step workflow on semanticdb-for-scala3, and reduces the risk of falling behind upstream SemanticDB schema / ScalaPB updates (including vulnerability fixes).
Cons
- The compiler pulls in extra runtime dependencies on the classpath. If using ScalaPB: scalapb-runtime plus, transitively, protobuf-java and lenses, approximately 4.12 MB in total.
If it's better to minimize runtime dependencies, switching to protobuf-java or protobuf-javalite bindings directly could reduce the footprint to a jar of around 1 MB (for protobuf-javalite-runtime).
I'm wondering whether there's a hard rule against adding any external runtime libraries to the compiler, or if it's acceptable under certain conditions. 🤔