Programs are rarely built from a single source code file. Normally, code is organized into several source files, reaching hundreds or even thousands of files for large programs. Static code analysis tools work by analyzing the source code files. When a function defined in a source file calls another function defined in a different source file, static analysis will inspect the bodies of both the caller function and the called function (and do the same for any other functions called from there). This is called interprocedural analysis across source code files. Obviously, for it to work, static code analysis tools must have access to all of the source code files where the functions are defined.

Obvious as it is, this is not always simple. Think, for instance, of the way programs are built in the C programming language: each source file is compiled individually, and calls to functions defined in other source files are resolved during the linking process. All a source file needs to know about a called function is its declaration. A function declaration serves as a function signature and is normally located in a header file. By including these header files, the compiler has enough information to compile a source file that calls one of those functions. The actual function definition containing the function body is located in some other source file, and the compiler is not concerned with it: that file is compiled separately, and it is not until link time that undefined function errors may arise.
As stated before, static code analysis tools need access to the function definitions in order for interprocedural analysis to succeed. Thus, unlike compilers, they need to analyze several source code files together to see the full picture. Otherwise, the analysis will be incomplete.
This blog post shows an example of code for which static analysis will be incomplete unless multiple files are analyzed together. More specifically, you will see how Parallelware Analyzer’s pwloops won’t find any parallelization opportunity for a matrix multiplication loop unless you instruct it to also analyze the source file where a function called from that loop is defined. You will learn how to do this in Parallelware Analyzer through a configuration file.
A matrix multiplication example organized into multiple files
Create a main.c source file with the following contents:
```c
#include <stdio.h>
#include "matmul.h"

// C (m x n) = A (m x p) * B (p x n)
void matmul(size_t m, size_t n, size_t p, double *A, double *B, double *C) {
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            for (size_t k = 0; k < p; k++) {
                C[i * n + j] += multiply(i, j, k, n, p, A, B);
            }
        }
    }
}

int main(int argc, char *argv[]) {
    size_t param_n = 500;
    size_t rows = param_n, cols = param_n;

    // Allocate input and output matrices (dynamic one-dimensional arrays)
    double *in1_mat = (double *)malloc(rows * cols * sizeof(double));
    double *in2_mat = (double *)malloc(rows * cols * sizeof(double));
    double *out_mat = (double *)malloc(rows * cols * sizeof(double));
    if (!in1_mat || !in2_mat || !out_mat) {
        printf("Error: not enough memory to run the test using n = %zu\n", param_n);
        return 1;
    }

    for (size_t i = 0; i < rows * cols; i++) {
        out_mat[i] = 0;
        in1_mat[i] = rand() % 10;
        in2_mat[i] = rand() % 10;
    }

    matmul(rows, cols, cols, in1_mat, in2_mat, out_mat);

    // Compute and print the checksum
    double checksum = 0.0;
    for (size_t i = 0; i < rows * cols; i++)
        checksum += out_mat[i];
    printf("checksum = %.0f\n", checksum);

    // Release allocated resources
    free(in1_mat);
    free(in2_mat);
    free(out_mat);
    return 0;
}
```
As you can see, this file includes a local header file called matmul.h. Create it now with the following contents:
```c
#include <stdlib.h>

double multiply(size_t i, size_t j, size_t k, size_t n, size_t p, double *A, double *B);
```
The header file only declares the function signature (its name, return type and parameters). The actual function definition (its body) lives in another source file called matmul.c. Create it with the following contents:
```c
#include "matmul.h"

double multiply(size_t i, size_t j, size_t k, size_t n, size_t p, double *A, double *B) {
    return A[i * p + k] * B[k * n + j];
}
```
Building the matrix multiplication example
Now you have the three source files that form the matrix multiplication example: main.c, matmul.h and matmul.c. Try to build the program by compiling only the main.c file:
```shell
$ gcc main.c
/usr/bin/ld: /tmp/ccHb0MKL.o: in function `matmul':
main.c:(.text+0x74): undefined reference to `multiply'
collect2: error: ld returned 1 exit status
```
The build fails because the linker could not find the definition of the multiply function called from matmul. Try again adding the missing source file:
```shell
$ gcc main.c matmul.c
$ ./a.out
checksum = 2535710866
```
Now the program has been successfully built. Since we haven’t specified an output binary name, the binary is called a.out.
Using Parallelware Analyzer to search for parallelization opportunities
Parallelware Analyzer consists of several command-line tools covering different parts of the parallelization workflow. Today, we will use pwloops, which provides information about the opportunities for parallelization found in the code. However, it is recommended to always start by invoking the pwreport tool, which provides a high-level overview of your code as well as hints on what to do next. Give it a try by analyzing the main.c file:
```shell
$ pwreport main.c
1 file successfully analyzed and 0 failures in 65 ms

CODE COVERAGE
  Analyzable files:     1 / 1  (100 %)
  Analyzable functions: 0 / 8  (0 %)
  Analyzable loops:     0 / 5  (0 %)
  Parallelized SLOCs:   0 / 73 (0 %)

SUMMARY
  Total defects: 0
  Total recommendations: 2
  Total opportunities: 0
  Total data races: 0
  Total data-race-free: 0

SUGGESTIONS
  5 loops could not be analyzed, get more information with pwloops:
      pwloops --non-analyzable main.c

  2 recommendations were found in your code, get more information with pwcheck:
      pwcheck --only-recommendations main.c
```
Note that although the main.c file was successfully analyzed, no useful information was reported. Parallelware Analyzer knows that there are 8 functions and 5 loops in the code, but not a single one of them could be analyzed. Indeed, if you invoke pwloops, all you will get is a listing of loops with empty analysis columns, as shown below (notice that the Analyzable column is empty for every loop):
```shell
$ pwloops main.c
Loop               Analyzable Compute patterns Opportunity Auto-Parallelizable Parallelized
------------------ ---------- ---------------- ----------- ------------------- ------------
main.c:matmul:7:4
main.c:matmul:8:8
main.c:matmul:9:12
main.c:main:29:4
main.c:main:39:4
…
```
Take a look again at the previous pwreport output and try the suggestion it provided:
```shell
$ pwloops --non-analyzable main.c
main.c:9:12
   9:            for (size_t k = 0; k < p; k++) {
  10:                C[i * n + j] += multiply(i, j, k, n, p, A, B);
  11:            }
  ISSUES
    10:32: Call to an undefined function 'multiply' can not be safely deemed as parallelizable
…
```
As you can see, pwloops reports the issues preventing the analysis from succeeding. All three nested loops within the matmul function share the same issue: the multiply function called from them is undefined, so Parallelware Analyzer cannot fully analyze the loops. In order for a loop to be classified as an opportunity for parallelization, Parallelware Analyzer must be sure that parallelizing it is safe. Without access to all the involved source code files, this is not possible.
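To see why the tool must be conservative here, consider a hypothetical variant of the called function with a hidden side effect (this function is an illustration, not part of the example project). If the analyzer could not see this body, it could not rule out a data race in the calling loops:

```c
#include <stddef.h>

// Hypothetical variant with the same shape as multiply, but with a side
// effect: it updates shared mutable state on every call. Running the calling
// loops in parallel would race on the 'calls' counter.
static long calls = 0;

double multiply_counted(size_t i, size_t j, size_t k, size_t n, size_t p,
                        double *A, double *B) {
    calls++;   // write to shared state: not safe under multithreading
    return A[i * p + k] * B[k * n + j];
}
```

Seen from the caller, multiply_counted and a pure multiply look identical: only the definition reveals the difference, which is exactly why the analyzer refuses to guess.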
Specifying source file dependencies through a configuration file
Parallelware Analyzer supports specifying source file dependencies using a configuration file. Let’s create a very simple one to state that when analyzing main.c, Parallelware Analyzer should also analyze matmul.c in order to have access to the multiply function definition. Create a pw.json file with the following contents:
```json
{
    "version": 2,
    "analyses": [
        {
            "match": "main.c",
            "dependencies": [ "matmul.c" ]
        }
    ]
}
```
You can specify the file dependencies of all the files in a project by adding a similar entry for each one of them. Alternatively, you can take advantage of the gitignore pattern-matching syntax supported in the Parallelware Analyzer configuration file. The example below shows the use of a wildcard in the match field so that main.c and matmul.c are always analyzed together:
```json
{
    "version": 2,
    "analyses": [
        {
            "match": "*",
            "dependencies": [ "main.c", "matmul.c" ]
        }
    ]
}
```
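Since gitignore-style patterns are supported, larger projects can scope entries by directory. A hypothetical example (the src/ layout below is an assumption, not part of this tutorial):

```json
{
    "version": 2,
    "analyses": [
        {
            "match": "src/*.c",
            "dependencies": [ "src/matmul.c" ]
        }
    ]
}
```

This would make the analysis of any C file under src/ also pull in src/matmul.c.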
Choose either one of the configuration files and pass it to the Parallelware Analyzer command-line tools using the --config parameter. Invoke pwreport again, this time with the configuration file:
```shell
$ pwreport main.c --config pw.json
1 file successfully analyzed and 0 failures in 44 ms

CODE COVERAGE
  Analyzable files:     1 / 1   (100 %)
  Analyzable functions: 2 / 15  (13.33 %)
  Analyzable loops:     3 / 5   (60 %)
  Parallelized SLOCs:   0 / 107 (0 %)

SUMMARY
  Total defects: 0
  Total recommendations: 2
  Total opportunities: 1
  Total data races: 0
  Total data-race-free: 0

SUGGESTIONS
  2 loops could not be analyzed, get more information with pwloops:
      pwloops --non-analyzable main.c --config pw.json

  2 recommendations were found in your code, get more information with pwcheck:
      pwcheck --only-recommendations main.c --config pw.json

  1 opportunity for parallelization was found in your code, get more information with pwloops:
      pwloops main.c --config pw.json
```
Now that it has access to the multiply function definition, pwreport is able to report useful information, notably 1 parallelization opportunity. Next, follow the suggestion and invoke pwloops to get detailed information about the opportunities for parallelization:
```shell
$ pwloops main.c --config pw.json
Loop               Analyzable Compute patterns Opportunity Auto-Parallelizable ..
------------------ ---------- ---------------- ----------- ------------------- ..
main.c:matmul:7:4  x          sparse           multi       x
main.c:matmul:8:8  x          forall
main.c:matmul:9:12 x          n/a
main.c:main:29:4
main.c:main:39:4

Loop                : loop name following the syntax <file>:<function>:<line>:<column>
Analyzable          : all C/C++/Fortran language features present in the loop are supported by Parallelware
Compute patterns    : compute patterns found in the loop ('forall', 'scalar' or 'sparse' reduction, 'recurrence')
Opportunity         : whether the loop is a parallelization opportunity and for which paradigms ('multi' for multi-threading or 'simd' for vectorization)
Auto-Parallelizable : loop can be parallelized by Parallelware
Parallelized        : loop is already parallelized, for instance with OpenMP or OpenACC directives

SUGGESTIONS
  Get more details about the data scoping of each variable within a loop, e.g.:
      pwloops --datascoping --loop main.c:main:39:4 main.c --config pw.json

  Print the code annotated with opportunities, e.g.:
      pwloops --code --function main.c:main main.c --config pw.json

  Parallelize an auto-parallelizable loop, e.g.:
      pwdirectives main.c:matmul:7:4 -o <output_file>

  Find out what prevented the analysis of a loop, e.g.:
      pwloops --non-analyzable --loop main.c:main:29:4 main.c --config pw.json

1 file successfully analyzed and 0 failures in 41 ms
```
Now you get a summary of the computational pattern found in each loop and the kind of parallelization opportunity it qualifies for. You also got suggestions on which analyses might be of interest as next steps. Feel free to give them a try and keep exploring Parallelware Analyzer!
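If you go on to act on the reported 'multi' opportunity for the outer loop (main.c:matmul:7:4), the result might look like the sketch below. This is only an illustration using a plain OpenMP directive, not the exact output of pwdirectives:

```c
#include <stddef.h>

double multiply(size_t i, size_t j, size_t k, size_t n, size_t p,
                double *A, double *B) {
    return A[i * p + k] * B[k * n + j];
}

// C (m x n) = A (m x p) * B (p x n), with the outer loop parallelized.
// Each thread works on distinct rows i, so the writes to C do not conflict.
void matmul(size_t m, size_t n, size_t p, double *A, double *B, double *C) {
    #pragma omp parallel for
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            for (size_t k = 0; k < p; k++) {
                C[i * n + j] += multiply(i, j, k, n, p, A, B);
            }
        }
    }
}
```

Compiling with -fopenmp (on gcc) enables the directive; without it the pragma is ignored and the code runs sequentially with identical results.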
Summary
The Parallelware static code analysis technology behind Parallelware Analyzer needs access to all the relevant parts of the code involved in a computation in order to report useful information about it. In many cases, a function is defined in one file and called from a different one, thus requiring interprocedural analysis across files: Parallelware Analyzer must analyze both source code files together to see the whole picture of what is going on. This is an intrinsic challenge for all static code analysis tools. This blog post has shown how Parallelware Analyzer can be set up to handle source code file dependencies through a very simple JSON configuration file. For more information, you can find the configuration file reference in the docs subfolder of your Parallelware Analyzer installation.
More information
- Parallelware Analyzer NPB Quickstart
- Fixing defects in parallel code: an OpenMP example
- Defects and recommendations for parallelism
- Checks reference index
- Parallel Patterns reference
- Parallelware Analyzer product page
Join the Early Access of Parallelware Analyzer
Register for free access to the latest versions and support until the official release.