@@ -1009,64 +965,19 @@ GraphX comes with static and dynamic implementations of PageRank as methods on t
GraphX also includes an example social network dataset that we can run PageRank on. A set of users is given in `data/graphx/users.txt`, and a set of relationships between users is given in `data/graphx/followers.txt`. We compute the PageRank of each user as follows:
-{% highlight scala %}
-// Load the edges as a graph
-val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")
-// Run PageRank
-val ranks = graph.pageRank(0.0001).vertices
-// Join the ranks with the usernames
-val users = sc.textFile("data/graphx/users.txt").map { line =>
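
To make the role of the tolerance argument in `graph.pageRank(0.0001)` more concrete, here is a minimal plain-Scala sketch of a tolerance-based PageRank loop. It is not GraphX's implementation: the object name `PageRankSketch`, the toy edge list, and the reset probability of 0.15 are assumptions made for illustration. The sketch also assumes every vertex has at least one outgoing edge.

```scala
object PageRankSketch {
  // Hypothetical toy edge list (follower -> followed); NOT the contents of followers.txt.
  val edges: Seq[(Long, Long)] = Seq((1L, 2L), (2L, 3L), (3L, 1L), (4L, 1L))

  // Iterate until no rank changes by more than `tol` between two rounds.
  // Assumption: every vertex has at least one outgoing edge.
  def pageRank(edges: Seq[(Long, Long)], tol: Double, resetProb: Double = 0.15): Map[Long, Double] = {
    val vertices = edges.flatMap { case (s, d) => Seq(s, d) }.distinct
    val outDegree = edges.groupBy(_._1).map { case (v, es) => v -> es.size }
    var ranks = vertices.map(v => v -> 1.0).toMap
    var delta = Double.MaxValue
    while (delta > tol) {
      // Each edge carries rank(src) / outDegree(src) to its destination.
      val contribs = edges
        .map { case (s, d) => d -> ranks(s) / outDegree(s) }
        .groupBy(_._1)
        .map { case (v, cs) => v -> cs.map(_._2).sum }
      val next = vertices.map { v =>
        v -> (resetProb + (1.0 - resetProb) * contribs.getOrElse(v, 0.0))
      }.toMap
      // Largest change of any single rank in this round.
      delta = vertices.map(v => math.abs(next(v) - ranks(v))).max
      ranks = next
    }
    ranks
  }
}
```

On the toy graph, vertex 1 is followed by two users and ends up with the highest rank, while vertex 4 has no followers and keeps only the reset probability.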
The connected components algorithm labels each connected component of the graph with the ID of its lowest-numbered vertex. For example, in a social network, connected components can approximate clusters. GraphX contains an implementation of the algorithm in the [`ConnectedComponents` object][ConnectedComponents], and we compute the connected components of the example social network dataset from the [PageRank section](#pagerank) as follows:
-{% highlight scala %}
-// Load the graph as in the PageRank example
-val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt")
-// Find the connected components
-val cc = graph.connectedComponents().vertices
-// Join the connected components with the usernames
-val users = sc.textFile("data/graphx/users.txt").map { line =>
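
The labeling rule described above (each component takes the ID of its lowest-numbered vertex) can be sketched in plain Scala without Spark. This is an illustrative sketch rather than the `ConnectedComponents` implementation: the object name `ConnectedComponentsSketch` is hypothetical, edges are treated as undirected, and vertices that appear in no edge are not covered.

```scala
object ConnectedComponentsSketch {
  // Returns vertex ID -> component label, where the label is the
  // lowest vertex ID in that vertex's component.
  // Assumption: edges are treated as undirected.
  def connectedComponents(edges: Seq[(Long, Long)]): Map[Long, Long] = {
    val vertices = edges.flatMap { case (s, d) => Seq(s, d) }.distinct
    // Every vertex starts out labeled with its own ID.
    var labels = vertices.map(v => v -> v).toMap
    var changed = true
    while (changed) {
      changed = false
      for ((s, d) <- edges) {
        // Both endpoints of an edge adopt the smaller of their two labels.
        val low = math.min(labels(s), labels(d))
        if (labels(s) != low || labels(d) != low) {
          labels = labels + (s -> low) + (d -> low)
          changed = true
        }
      }
    }
    labels
  }
}
```

The loop stops once a full pass over the edges changes no label, at which point both endpoints of every edge agree on the same lowest ID.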
A vertex is part of a triangle when it has two adjacent vertices with an edge between them. GraphX implements a triangle counting algorithm in the [`TriangleCount` object][TriangleCount] that determines the number of triangles passing through each vertex, providing a measure of clustering. We compute the triangle count of the social network dataset from the [PageRank section](#pagerank). *Note that `TriangleCount` requires the edges to be in canonical orientation (`srcId < dstId`) and the graph to be partitioned using [`Graph.partitionBy`][Graph.partitionBy].*
-{% highlight scala %}
-// Load the edges in canonical order and partition the graph for triangle count
-val graph = GraphLoader.edgeListFile(sc, "data/graphx/followers.txt", true).partitionBy(PartitionStrategy.RandomVertexCut)
-// Find the triangle count for each vertex
-val triCounts = graph.triangleCount().vertices
-// Join the triangle counts with the usernames
-val users = sc.textFile("data/graphx/users.txt").map { line =>
-  val fields = line.split(",")
-  (fields(0).toLong, fields(1))
-}
-val triCountByUsername = users.join(triCounts).map { case (id, (username, tc)) =>
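
As a rough illustration of what "the number of triangles passing through each vertex" means, the following plain-Scala sketch counts triangles per vertex with neighbor-set intersections. It is not the `TriangleCount` implementation; the object name `TriangleCountSketch` is hypothetical, and the canonicalization step (orienting each edge so that the source ID is smaller than the destination ID and dropping duplicates) only mirrors the requirement stated in the note above.

```scala
object TriangleCountSketch {
  // Returns vertex ID -> number of triangles passing through that vertex.
  def triangleCounts(edges: Seq[(Long, Long)]): Map[Long, Int] = {
    // Canonicalize: drop self-loops, orient each edge so that src < dst, deduplicate.
    val canonical = edges.collect {
      case (s, d) if s != d => if (s < d) (s, d) else (d, s)
    }.distinct
    // Undirected neighbor set of every vertex.
    val neighbors: Map[Long, Set[Long]] = canonical
      .flatMap { case (s, d) => Seq(s -> d, d -> s) }
      .groupBy(_._1)
      .map { case (v, pairs) => v -> pairs.map(_._2).toSet }
    neighbors.map { case (v, ns) =>
      // A shared neighbor of v and u closes a triangle; each triangle through v
      // is found twice, once from each of its other two corners.
      val twice = ns.toSeq.map(u => (ns intersect neighbors(u)).size).sum
      v -> twice / 2
    }
  }
}
```

For example, on the edges `(1,2)`, `(2,3)`, `(3,1)`, `(3,4)`, vertices 1, 2, and 3 each lie on one triangle and vertex 4 lies on none.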