-
-
Notifications
You must be signed in to change notification settings - Fork 684
Enhance Java first party dependency inference to support unqualified, same-package references #22817
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Enhance Java first party dependency inference to support unqualified, same-package references #22817
Conversation
|
I'm moving this into a draft status for now. There's something off with my real-world test project. I'll move this out of draft once that verification test is passing |
|
Thanks for the contribution! Let us know here when this is ready for review (feel free to cc me by name to get my attention). |
|
@benjyw This is ready for review. Can you take a look at it? |
benjyw
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! I'm familiarizing myself with the code and trying to understand why this is necessary in the first place...
|
|
||
| # For each direct dependency, resolve its exported types as transitive dependencies | ||
| # This handles the case where A imports B, and B has a field of type C | ||
| # Even though A never imports C, the Java compiler needs C's class file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know the JVM backend too well, but I guess this is necessary because we only put the direct deps of A on the compiler classpath? So we have to treat C as a direct dep.
But shouldn't we really be putting all transitive deps on the compiler classpath? After all, couldn't we contrive a more complex situation where C has a reference to some D etc?
And if we did so then presumably this problem wouldn't occur, because we do detect that B depends on C and A depends on B, and therefore C is in the transitive closure?
Or are there reasons not to put all transitive dep classes classpath?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. In my experience, all transitive deps should be on the classpath when compiling Java sources, so I'm not sure why we're not doing it here. I agree that's a better approach. Let me revisit this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming the reason for not just putting all transitive dep classes on the classpath was potentially performance related? (although I didn't run a performance test with and without this change) Putting all transitive dep classes on the classpath is good from a correctness standpoint and obviates the need for us to put special logic in place for exporting certain classes and references ahead to other consumers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly, I'll see if I can find someone who has context for this decision. As far as I can tell (and AI seems to agree) compiling A conceptually requires all its transitive deps, and trying to tease out which ones are "really" needed is a mug's game.
I had thought we were already doing this though, I'm a bit surprised if we're not. But if we were your use case would work, no? I assume we do correctly infer the deps A->B and B->C already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, if we were already doing this, then I wouldn't have had compilation errors in my project for the A -> B -> C use case. But the other issue this PR is solving around inner classes still would have been a problem, but now it seems the two are unrelated aside from the fact they require fixes to the same files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it probably makes sense to split this into two PRs. The inner classes one is fairly straightforward and we can get it in ASAP. The other is a potentially more substantial change that would be best reviewed in isolation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a good idea. I've split the inner class changes out into #22889 I'll rebase this PR to just include the transitive dep stuff after that other PR lands.
| export.accept(methodDecl.getType()); | ||
| consumed.accept(methodDecl.getType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The export consumer was adding to both lists. Since exports aren't needed anymore lines like this just call consumed.accept instead of being removed entirely
In Java, it's perfectly valid for a class in one package to reference another class in the same package without importing it. The current Java dependency inference code works fine for top level references, but doesn't work when inner classes are involved. Suppose we have class A in A.java and class B in B.java with a static inner class InnerB. All of these classes are in the same package. A uses B.InnerB, but doesn't explicitly import B. This is valid practice in Java, but with our Pants Java dependency inference logic, the backend won't figure out on its own that A.java depends on B.java without an explicit dependency reference. This PR contains the changes for fixing this bug as seen in #22817 I've split it into its own PR to make things easier to review and land
|
Hi, just checking in on the status of this, given the other PRs on this subject? |
I still plan on dusting this off now that the other changes have been merged through separate PRs. I'm hoping to get to it next week |
This PR fixes two issues with Pants Java first-party dependency inference logic.
Inner Classes
In Java, it's perfectly valid for a class in one package to reference another class in the same package without importing it. The current Java dependency inference code works fine for top level references, but doesn't work when inner classes are involved. Suppose we have class
AinA.javaand classBinB.javawith a static inner classInnerB. All of these classes are in the same package.AusesB.InnerB, but doesn't explicitly importB. This is valid practice in Java, but with our Pants Java dependency inference logic, the backend won't figure out on its own thatA.javadepends onB.javawithout an explicit dependency reference.Non-obvious transitive references.
Suppose we have classes A, B, and C.
Today, our Pants Java dependency inference code will fail to pick up the dependency between A and C. Specifically, if you try to compile A, it will fail, saying that the class file for C is missing.
Initially, I tried to solve this be adding more exports in our java parser. But it seems the better approach involves just putting all transitive dependency classes on the classpath at compile time
Other Notes
This is my first contribution, so let me know if I'm missing anything obvious.
I ran
/pants test src/python/pants/backend/java/::and all of those tests are passing.I also have a separate Java project which I'm testing out with Pants. I set PANTS_SOURCE to my local branch with these changes and it resolved both of my aforementioned compilation errors.