miniVite

Science

MiniVITE is a proxy app for the Vite program. Vite itself is the distributed-memory version of Grappolo.

The program performs community detection over a graph, which is used in graph applications to reveal natural divisions that exist in networks without size or element constraints. One obvious example for the application of community detection is Facebook, where millions of users connect through “friend” links, and rich social networks are organized via these connections. It also has applications in the analysis of sequences in biological data. You can find a more detailed description of the science and the algorithms used here.

Code

Compilation

MiniVITE requires a C++ 11 compiler with OpenMP support, and an MPI implementation. No other libraries are required.

The build system is a simple Makefile file that needs to be edited to specify the compiler and flags. There are several compile-time options that are selected by pre-processor macros. In particular, it’s possible to select the communication approach:

  • Non-blocking point to point communication (the default).
  • Standard point to point communication.
  • Collective communication.
  • Remote memory access (RMA).

Check the README for details.

Execution

To run miniVITE with a randomly generated graph just specify the number of vertices in the graph with the -n flag

mpirun -n 4 ./dspl -n 500000

to run with a graph from a file run with -f

mpirun -n 4 ./dspl -f graph.bin

Validation

Since the execution pattern changes with the number of processors, it’s not expected for the results to be identical between runs of different sizes.

The quantity to evaluate the results of a calculation is modularity, which is printed by the code as a result. The modularity measures the quality of the result as a number between -1 and 1, with values closer to 1 indicating a better the quality of the community assignment (for a fixed graph).

To check the results, the modularity value should not decrease significantly from a reference run for the same graph.

Performance analysis

MiniVITE main performance limitation is the communication (from the reported calculations, between 50% to 80% of the time is spent in communication). Of this cost, a large fraction is a reduction operation, and the rest is the update of community information between nodes that can be done using different MPI communications strategies (see the Compilation section).

Similarity to parent application

Unlike Vite, miniVITE does not implement any heuristics to improve the performance of the Louvain method. miniVite is a baseline parallel version, implementing only the first phase of Louvain method. These are several heuristics, including distance-1 coloring and vertex following, that are used to parallelize the Louvain method efficiently.