Resolution and Stitching
The Maven ecosystem-wide dependency graph is used to resolve all (transitive) dependencies and dependents of any Maven artifact.
NB! The dependency graph is large and requires a lot of RAM. Run everything listed below on a machine with more than 16GB of RAM (32GB should be enough).
The global dependency graph is built by the DependencyGraphBuilder class in core/maven.
To build a dependency graph, create an instance of this class and call its buildDependencyGraph(...) method:
var dbContext = PostgresConnector.getDSLContext("jdbc:postgresql://localhost:5432/fasten_java", "fastenro");
var graphBuilder = new DependencyGraphBuilder();
var dependencyGraph = graphBuilder.buildDependencyGraph(dbContext);
The dependencyGraph can then be used for any kind of analysis or resolution.
NB! When running the code, set the environment variable PGPASSWORD=fasten to provide the password for the database connection.
To resolve all (transitive) dependencies, use the GraphMavenResolver class, which is also in core/maven, and call its resolveFullDependencySet(...) method:
var graphResolver = new GraphMavenResolver();
var dependencySet = graphResolver.resolveFullDependencySet(group, artifact, version, timestamp, scopes, dbContext, filterOptional, filterScopes, filterExclusions);
- group: groupId of the artifact to resolve.
- artifact: artifactId of the artifact to resolve.
- version: version of the artifact to resolve.
- timestamp: a timestamp used to filter the dependency graph, removing all artifacts released later than the given timestamp. Use -1 to disable timestamp filtering.
- scopes: list of scopes for filtering. Artifacts whose scopes are not on the list are removed from the dependency graph; only artifacts with scopes on the list remain.
- dbContext: database connection context. Can be acquired the same way as in the first code snippet (var dbContext = PostgresConnector.getDSLContext("jdbc:postgresql://localhost:5432/fasten_java", "fastenro");).
- filterOptional: a boolean for whether to filter out optional dependencies.
- filterScopes: a boolean for whether to filter by scopes (the list provided in scopes).
- filterExclusions: a boolean for whether to filter out excluded dependencies.
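To make the effect of the filtering parameters concrete, here is a minimal, self-contained sketch of a filtered transitive traversal. It is not the GraphMavenResolver implementation: the Dep class, the map-based graph, and the sample data are hypothetical, and exclusion filtering is left out for brevity. It only illustrates how timestamp, scopes, filterOptional, and filterScopes could prune edges during a breadth-first walk.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class FilteredResolutionSketch {
    /** Hypothetical dependency edge; not a FASTEN class. */
    static final class Dep {
        final String target;
        final String scope;
        final boolean optional;

        Dep(String target, String scope, boolean optional) {
            this.target = target;
            this.scope = scope;
            this.optional = optional;
        }
    }

    /** Breadth-first walk over the graph that skips edges rejected by the filters. */
    static Set<String> resolve(String root, Map<String, List<Dep>> graph,
                               Map<String, Long> releasedAt, long timestamp,
                               Set<String> scopes, boolean filterOptional,
                               boolean filterScopes) {
        Set<String> result = new LinkedHashSet<>();
        Set<String> seen = new HashSet<>();
        seen.add(root);
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (Dep dep : graph.getOrDefault(current, List.of())) {
                if (filterOptional && dep.optional) continue;
                if (filterScopes && !scopes.contains(dep.scope)) continue;
                // Assumption: artifacts without a known release time are kept.
                long released = releasedAt.getOrDefault(dep.target, Long.MIN_VALUE);
                // -1 disables timestamp filtering, as described in the list above.
                if (timestamp != -1 && released > timestamp) continue;
                if (seen.add(dep.target)) {
                    result.add(dep.target);
                    queue.add(dep.target);
                }
            }
        }
        return result;
    }

    /** Made-up sample graph: a -> b, c and b -> d (optional), e. */
    static Map<String, List<Dep>> sampleGraph() {
        return Map.of(
            "a", List.of(new Dep("b", "compile", false), new Dep("c", "test", false)),
            "b", List.of(new Dep("d", "compile", true), new Dep("e", "compile", false)));
    }

    /** Made-up release timestamps for the sample artifacts. */
    static Map<String, Long> sampleReleases() {
        return Map.of("b", 10L, "c", 10L, "d", 10L, "e", 50L);
    }

    public static void main(String[] args) {
        // Only compile-scoped, non-optional artifacts released at or before 20 remain.
        System.out.println(resolve("a", sampleGraph(), sampleReleases(), 20L,
            Set.of("compile"), true, true)); // prints [b]
    }
}
```

In this sketch a pruned edge also prunes everything reachable only through it, which is one reasonable reading of "removed from the dependency graph"; the actual resolver may behave differently.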
The same class contains a method for resolving all (transitive) dependents, resolveFullDependentsSet(...), which takes the same arguments as dependency resolution:
var graphResolver = new GraphMavenResolver();
var dependentsSet = graphResolver.resolveFullDependentsSet(group, artifact, version, timestamp, scopes, dbContext, filterOptional, filterScopes, filterExclusions);
NB! None of the filtering options have been optimized yet, and they take a lot of time to execute. For faster results, avoid filtering.
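Conceptually, resolving dependents walks the same graph in the opposite direction: instead of following "depends on" edges, it follows them backwards. The sketch below illustrates that idea on a plain map; the class name, the map representation, and the sample artifacts are hypothetical and do not reflect how GraphMavenResolver stores or traverses the graph.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class DependentsSketch {
    /** Turns "artifact -> its dependencies" into "artifact -> its direct dependents". */
    static Map<String, Set<String>> invert(Map<String, Set<String>> dependencies) {
        Map<String, Set<String>> dependents = new HashMap<>();
        for (Map.Entry<String, Set<String>> entry : dependencies.entrySet()) {
            for (String dependency : entry.getValue()) {
                dependents.computeIfAbsent(dependency, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return dependents;
    }

    /** Collects every artifact that depends on the given one, directly or transitively. */
    static Set<String> transitiveDependents(String artifact, Map<String, Set<String>> dependencies) {
        Map<String, Set<String>> dependents = invert(dependencies);
        Set<String> result = new TreeSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(artifact);
        while (!stack.isEmpty()) {
            for (String dependent : dependents.getOrDefault(stack.pop(), Set.of())) {
                // Skip the start artifact itself and anything already visited.
                if (!dependent.equals(artifact) && result.add(dependent)) {
                    stack.push(dependent);
                }
            }
        }
        return result;
    }

    /** Made-up sample: app -> lib -> core, and tool -> core. */
    static Map<String, Set<String>> sample() {
        return Map.of("app", Set.of("lib"), "lib", Set.of("core"), "tool", Set.of("core"));
    }

    public static void main(String[] args) {
        System.out.println(transitiveDependents("core", sample())); // prints [app, lib, tool]
    }
}
```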
If the database is not available, online resolution can be used to resolve dependencies. It downloads the POM file of the given artifact, runs mvn dependency:list, and parses the output. Here is how it can be used:
var mavenResolver = new MavenResolver();
var dependencySet = mavenResolver.resolveFullDependencySetOnline(groupId, artifactId, version);
Let's assume that we have a dependency set D = {A, B, C}. If we have the isolated call graphs for A, B, and C, then we can Stitch them with respect to D.
Precomputed FASTEN call graphs (RCGs, ERCGs, etc.) all have a section that lists the calls going outside of the package. This part of the graph is called externalCalls. These are the calls that cannot be fully resolved in isolation because not enough is known about the libraries they target. Once we have a dependency set, the context for resolving such calls is available.
There is another section in the FASTEN precomputed graphs, called resolved calls. This part is empty when CGs are generated in isolation and is filled in once Stitching is done.
The Stitching algorithm finds the valid targets for externalCalls within RCGs with respect to a provided dependency set. For example, once the dependency set D and the RCGs of A, B, and C are available, Stitching can find out where the target of each externalCall belongs with respect to D.
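The idea of resolving externalCalls against a dependency set can be sketched as follows. This is a deliberately simplified illustration, not the FASTEN Stitching algorithm: PackageGraph and its fields are hypothetical stand-ins for an RCG, methods are plain strings, and targets are matched by exact signature only, whereas the real merger (mergeWithCHA) also takes the class hierarchy into account.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class StitchingSketch {
    /** Hypothetical, simplified stand-in for the isolated call graph of one package. */
    static final class PackageGraph {
        final String name;
        final Set<String> definedMethods;
        // caller -> callee signatures that are not defined inside this package
        final Map<String, Set<String>> externalCalls;

        PackageGraph(String name, Set<String> definedMethods, Map<String, Set<String>> externalCalls) {
            this.name = name;
            this.definedMethods = definedMethods;
            this.externalCalls = externalCalls;
        }
    }

    /** Resolves the external calls of one graph against all packages in the dependency set. */
    static Map<String, Set<String>> stitch(PackageGraph graph, List<PackageGraph> dependencySet) {
        // The "context": which package(s) of the dependency set define each method.
        Map<String, Set<String>> definedIn = new HashMap<>();
        for (PackageGraph pkg : dependencySet) {
            for (String method : pkg.definedMethods) {
                definedIn.computeIfAbsent(method, k -> new TreeSet<>()).add(pkg.name);
            }
        }
        // Fill the "resolved" section: caller -> "owningPackage:callee".
        Map<String, Set<String>> resolved = new TreeMap<>();
        for (Map.Entry<String, Set<String>> call : graph.externalCalls.entrySet()) {
            for (String callee : call.getValue()) {
                for (String owner : definedIn.getOrDefault(callee, Set.of())) {
                    resolved.computeIfAbsent(call.getKey(), k -> new TreeSet<>()).add(owner + ":" + callee);
                }
            }
        }
        return resolved;
    }

    /** Made-up dependency set D = [A, B, C]; x.Missing.call() is defined nowhere in D. */
    static List<PackageGraph> sample() {
        PackageGraph a = new PackageGraph("A", Set.of("a.Main.run()"),
            Map.of("a.Main.run()", Set.of("b.Helper.help()", "x.Missing.call()")));
        PackageGraph b = new PackageGraph("B", Set.of("b.Helper.help()"),
            Map.of("b.Helper.help()", Set.of("c.Util.util()")));
        PackageGraph c = new PackageGraph("C", Set.of("c.Util.util()"), Map.of());
        return List.of(a, b, c);
    }

    public static void main(String[] args) {
        List<PackageGraph> d = sample();
        // Stitch only A with respect to D; calls to targets outside D stay unresolved.
        System.out.println(stitch(d.get(0), d));
    }
}
```

Building the index once and then resolving one graph at a time mirrors the point that the context depends on the whole dependency set while each graph is Stitched individually.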
Note that each RCG needs to be Stitched separately.
For example, to find all possible edges for every package in D, first create a context by instantiating a merger with D (var merger = new Merger(D)) and then Stitch A, B, and C one by one.
The merger.mergeWithCHA(ERCG artifact) method is provided for this.
This makes it possible to Stitch only the packages that are needed instead of everything in the dependency set.
There are currently two options for Stitching Revision Call Graphs. The first works with RCG objects, which are already stored as serialized JSON on the FASTEN servers. The second uses the information available in the Graph and Metadata databases. The Java implementation is available in the develop branch.
To Stitch an ExtendedRevisionJavaCallGraph object A (from the example above) with respect to D (a List<ExtendedRevisionJavaCallGraph>), first instantiate the merger (LocalMerger) with D and then merge the desired CG object (in this case A) as follows:
var localMerger = new LocalMerger(D);
var mergedA = localMerger.mergeWithCHA(A);
mergedA is an ExtendedRevisionJavaCallGraph object similar to A, but its resolved section is filled with the edges that can occur between A and its libraries in the context of D.
For the database-based option, a merger instance (DatabaseMerger) is likewise needed to perform Stitching. This merger also needs a dbContext and a rocksDao to work:
var dbContext = PostgresConnector.getDSLContext("DBUrl","DBUser");
var rocksDao = RocksDBConnector.createReadOnlyRocksDBAccessObject("GraphDBDir");
As with LocalMerger, the dependency set is passed to the merger and the artifact to merge is passed to the merge method:
var databaseMerger = new DatabaseMerger(depSet, dbContext, rocksDao);
var mergedDirectedGraph = databaseMerger.mergeWithCHA(artifact);
Note that DatabaseMerger works with Maven coordinates instead of ERCG objects. The depSet is therefore a List<String> in which each String has the form groupId:artifactId:version (e.g. org.digidoc4j:digidoc4j:1.0.7.beta.2). artifact is also a String specifying the Maven coordinate of the artifact to resolve. The output of DatabaseMerger is a DirectedGraph containing the GlobalIDs of the nodes stored in the GraphDB.
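Since the coordinates are passed as plain strings, it can help to validate them before handing them to the merger. The helper below is a hypothetical sketch and not part of FASTEN; it only checks that a string has the three non-empty groupId:artifactId:version parts described above.

```java
import java.util.ArrayList;
import java.util.List;

final class MavenCoordinateSketch {
    /** Splits "groupId:artifactId:version" into its three parts or rejects the input. */
    static String[] parse(String coordinate) {
        // Limit -1 keeps trailing empty parts, so "g:a:" is detected as invalid.
        String[] parts = coordinate.split(":", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected groupId:artifactId:version but got: " + coordinate);
        }
        for (String part : parts) {
            if (part.trim().isEmpty()) {
                throw new IllegalArgumentException("Empty component in coordinate: " + coordinate);
            }
        }
        return parts;
    }

    /** Returns a copy of the list after checking that every entry is a valid coordinate. */
    static List<String> validated(List<String> depSet) {
        List<String> checked = new ArrayList<>();
        for (String coordinate : depSet) {
            parse(coordinate);
            checked.add(coordinate);
        }
        return checked;
    }

    public static void main(String[] args) {
        String[] parts = parse("org.digidoc4j:digidoc4j:1.0.7.beta.2");
        System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]);
    }
}
```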
Currently, both resolution and stitching for Java are available in the develop branch.