Graph partitioning format
by Rebecca Dengate
Hello,
I am outputting a very simple wiki graph using:
    hadoop jar target/graphbuilder-0.0.1-SNAPSHOT-hadoop-job.jar \
        com.intel.hadoop.graphbuilder.demoapps.wikipedia.linkgraph.LinkGraphEnd2End \
        2 /user/hduser/wiki-input /user/hduser/wiki-output
The wiki file contains two 3x3 grids of pages. Each page links to its
neighbour on the right, to the page below it, and to the page in the
same position in the next grid (e.g. AfghanistanHistory links to
AfghanistanGeography, AfghanistanCommunications, and AlbaniaHistory).
That makes 33 edges in total: 12 within each grid plus 9 from the first
grid to the second. I can email the example wiki file if it is of
interest.
Wiki-grids (where the number in brackets is the gvid assigned by
GraphBuilder):
AccessibleComputing (13)  AfghanistanHistory (0)              AfghanistanGeography (7)
AfghanistanPeople (1)     AfghanistanCommunications (12)      AfghanistanTransportations (2)
AfghanistanMilitary (15)  AfghanistanTransnationalIssues (4)  AssistiveTechnology (8)

AmoeboidTaxa (9)          AlbaniaHistory (6)                  AlbaniaPeople (11)
AsWeMayThink (17)         AlbaniaGovernment (10)              AlbaniaEconomy (14)
AfroAsiaticLanguages (3)  ArtificalLanguages (5)              AbacuS (16)
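
As a sanity check on the edge count, here is a minimal Python sketch that rebuilds the link list from the two grids. It is independent of GraphBuilder, and the helper name grid_edges is made up for illustration. It assumes that only the first grid links forward to the second, and that each page links to the page in the same position, which is what the AfghanistanHistory example suggests.

```python
# Page names copied from the grids above; rows run left to right, top to bottom.
GRID_A = [
    ["AccessibleComputing", "AfghanistanHistory", "AfghanistanGeography"],
    ["AfghanistanPeople", "AfghanistanCommunications", "AfghanistanTransportations"],
    ["AfghanistanMilitary", "AfghanistanTransnationalIssues", "AssistiveTechnology"],
]
GRID_B = [
    ["AmoeboidTaxa", "AlbaniaHistory", "AlbaniaPeople"],
    ["AsWeMayThink", "AlbaniaGovernment", "AlbaniaEconomy"],
    ["AfroAsiaticLanguages", "ArtificalLanguages", "AbacuS"],
]


def grid_edges(first, second):
    """Return (source, target) links: right, below, and first -> second grid."""
    edges = []
    for grid in (first, second):
        for r, row in enumerate(grid):
            for c, page in enumerate(row):
                if c + 1 < len(row):       # neighbour on the right
                    edges.append((page, row[c + 1]))
                if r + 1 < len(grid):      # neighbour below
                    edges.append((page, grid[r + 1][c]))
    # Assumption: only the first grid links forward, to the same position.
    for r, row in enumerate(first):
        for c, page in enumerate(row):
            edges.append((page, second[r][c]))
    return edges


EDGES = grid_edges(GRID_A, GRID_B)
print(len(EDGES))  # 33
```

Under those assumptions the count comes out as 12 + 12 + 9 = 33, matching the number of edges in graph_raw.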
The vertex and edge lists created in graph_raw are correct, but the
graph_partitioned output looks a little odd.
graph_partitioned/edges/partition0/subpart0:
{"source":"0","targets":["12","6"]}
{"source":"11","targets":["14"]}
{"source":"12","targets":["4","2"]}
{"source":"13","targets":["9","0"]}
{"source":"15","targets":["4"]}
{"source":"2","targets":["8","14"]}
{"source":"4","targets":["8"]}
{"source":"6","targets":["10","11"]}
{"source":"7","targets":["11"]}
{"source":"9","targets":["17","6"]}
graph_partitioned/vrecords/partition0:
{"vdata":{},"mirrors":[],"inEdges":1,"gvid":"0","outEdges":1,"owner":0}
{"vdata":{},"mirrors":[1],"inEdges":3,"gvid":"10","outEdges":1,"owner":0}
{"vdata":{},"mirrors":[0],"inEdges":2,"gvid":"12","outEdges":3,"owner":1}
{"vdata":{},"mirrors":[0],"inEdges":0,"gvid":"13","outEdges":2,"owner":1}
{"vdata":{},"mirrors":[1],"inEdges":0,"gvid":"15","outEdges":2,"owner":0}
{"vdata":{},"mirrors":[],"inEdges":0,"gvid":"16","outEdges":0,"owner":0}
{"vdata":{},"mirrors":[],"inEdges":1,"gvid":"17","outEdges":0,"owner":0}
{"vdata":{},"mirrors":[1],"inEdges":2,"gvid":"2","outEdges":2,"owner":0}
{"vdata":{},"mirrors":[],"inEdges":0,"gvid":"3","outEdges":0,"owner":0}
{"vdata":{},"mirrors":[0],"inEdges":1,"gvid":"7","outEdges":2,"owner":1}
{"vdata":{},"mirrors":[],"inEdges":1,"gvid":"9","outEdges":0,"owner":0}
The vertices owned by 0 don't match the edges in subpart0. I'm assuming
that the edges in subpart0 are the ones operated on in partition0, which
has access to the vertices in vrecords/partition0. Note that some gvids
(4 and 8, for example) don't appear in vrecords/partition0 but are used
as source or destination vertices in the subpart0 edge set. Gvids 16 and
3 appear in vrecords/partition0 but don't appear in the edge list.
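
The mismatch can be checked mechanically. The following Python sketch is only a cross-check on the pasted records, not anything GraphBuilder provides, and the variable names are illustrative. It parses the two listings above and compares the gvids used in the edge list with the gvids present in vrecords/partition0.

```python
import json

# Records copied verbatim from the listings above.
SUBPART0 = """\
{"source":"0","targets":["12","6"]}
{"source":"11","targets":["14"]}
{"source":"12","targets":["4","2"]}
{"source":"13","targets":["9","0"]}
{"source":"15","targets":["4"]}
{"source":"2","targets":["8","14"]}
{"source":"4","targets":["8"]}
{"source":"6","targets":["10","11"]}
{"source":"7","targets":["11"]}
{"source":"9","targets":["17","6"]}
"""
VRECORDS0 = """\
{"vdata":{},"mirrors":[],"inEdges":1,"gvid":"0","outEdges":1,"owner":0}
{"vdata":{},"mirrors":[1],"inEdges":3,"gvid":"10","outEdges":1,"owner":0}
{"vdata":{},"mirrors":[0],"inEdges":2,"gvid":"12","outEdges":3,"owner":1}
{"vdata":{},"mirrors":[0],"inEdges":0,"gvid":"13","outEdges":2,"owner":1}
{"vdata":{},"mirrors":[1],"inEdges":0,"gvid":"15","outEdges":2,"owner":0}
{"vdata":{},"mirrors":[],"inEdges":0,"gvid":"16","outEdges":0,"owner":0}
{"vdata":{},"mirrors":[],"inEdges":1,"gvid":"17","outEdges":0,"owner":0}
{"vdata":{},"mirrors":[1],"inEdges":2,"gvid":"2","outEdges":2,"owner":0}
{"vdata":{},"mirrors":[],"inEdges":0,"gvid":"3","outEdges":0,"owner":0}
{"vdata":{},"mirrors":[0],"inEdges":1,"gvid":"7","outEdges":2,"owner":1}
{"vdata":{},"mirrors":[],"inEdges":1,"gvid":"9","outEdges":0,"owner":0}
"""

edges = [json.loads(line) for line in SUBPART0.splitlines()]
vrecords = [json.loads(line) for line in VRECORDS0.splitlines()]

# Every gvid that occurs as a source or a target in the edge file.
in_edges = set()
for edge in edges:
    in_edges.add(edge["source"])
    in_edges.update(edge["targets"])

# Every gvid that has a vertex record in this partition.
in_vrecords = {record["gvid"] for record in vrecords}

missing = sorted(in_edges - in_vrecords, key=int)
unused = sorted(in_vrecords - in_edges, key=int)
print(missing)  # ['4', '6', '8', '11', '14']
print(unused)   # ['3', '16']
```

On the records above it reports 4, 6, 8, 11 and 14 as used by edges but missing from vrecords/partition0, and 3 and 16 as present in vrecords/partition0 but unused by any edge in subpart0.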
Also, the inEdges and outEdges are not as I expected for some vertices.
For example, gvid 0 has inEdges 1 and outEdges 1, but surely it should
have three outEdges (as it connects to 7, 12 and 6)?
Have I misunderstood the format?
Thanks for your help.
Cheers,
Rebecca