Programs are rarely built from a single source code file. Normally, code is organized into several source files, reaching hundreds or even thousands of files for large programs. Static code analysis tools work by analyzing the source code files. When a function defined in a source file calls another function defined in a different source file, static analysis will inspect the bodies of both the caller function and the called function (and do the same for any other functions called from there). This is called interprocedural analysis across source code files. Obviously, for it to work, static code analysis tools must have access to all of the source code files where the functions are defined.

Obvious as it is, this is not always simple. Consider, for instance, the way programs are built in the C programming language: each source file is compiled individually, and calls to functions defined in other source files are resolved during the linking process. All a source file needs to know about the functions it calls is their declarations. A function declaration provides the function's signature and is normally located in a header file. By including these header files, the compiler has enough information to compile a source file that calls one of those functions. The actual function definition containing the function body will be located in some other source file, and the compiler is not concerned with it: that file is compiled separately, and any undefined-function errors only surface at link time.
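As a minimal sketch of this declaration/definition split (the file and function names below are made up for illustration):

/* add.h -- declaration only: the signature is all the compiler needs */
int add(int x, int y);

/* caller.c -- compiles against the declaration alone */
#include "add.h"
int twice(int x) { return add(x, x); }

/* add.c -- the definition; the call to 'add' is resolved by the linker */
#include "add.h"
int add(int x, int y) { return x + y; }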
As stated before, static code analysis tools need access to the function definitions in order for interprocedural analysis to succeed. Thus, unlike compilers, they need to analyze several source code files together to see the full picture. Otherwise, the analysis will be incomplete.
This blog post shows an example of code for which static analysis is incomplete unless multiple files are analyzed together. More specifically, you will see how Parallelware Analyzer's pwloops won't find any parallelization opportunity for a matrix multiplication loop unless you instruct it to also analyze the source file where a function called from that loop is defined. You will learn how to do this in Parallelware Analyzer through a configuration file.
A matrix multiplication example organized into multiple files
Create a main.c source file with the following contents:
#include <stdio.h>

#include "matmul.h"

// C (m x n) = A (m x p) * B (p x n)
void matmul(size_t m, size_t n, size_t p, double *A, double *B, double *C) {
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            for (size_t k = 0; k < p; k++) {
                C[i * n + j] += multiply(i, j, k, n, p, A, B);
            }
        }
    }
}

int main(int argc, char *argv[]) {
    size_t param_n = 500;
    size_t rows = param_n, cols = param_n;

    // Allocates input and output matrices (dynamic one-dimensional arrays)
    double *in1_mat = (double *)malloc(rows * cols * sizeof(double));
    double *in2_mat = (double *)malloc(rows * cols * sizeof(double));
    double *out_mat = (double *)malloc(rows * cols * sizeof(double));
    if (!in1_mat || !in2_mat || !out_mat) {
        printf("Error: not enough memory to run the test using n = %zu\n", param_n);
        return 1;
    }
    for (size_t i = 0; i < rows * cols; i++) {
        out_mat[i] = 0;
        in1_mat[i] = rand() % 10;
        in2_mat[i] = rand() % 10;
    }

    matmul(rows, cols, cols, in1_mat, in2_mat, out_mat);

    // Compute and print the checksum
    double checksum = 0.0;
    for (size_t i = 0; i < rows * cols; i++)
        checksum += out_mat[i];
    printf("checksum = %.0f\n", checksum);

    // Release allocated resources
    free(in1_mat);
    free(in2_mat);
    free(out_mat);
    return 0;
}
As you can see, this file includes a local header file called matmul.h. Create it now with the following contents:
#include <stdlib.h>
double multiply(size_t i, size_t j, size_t k, size_t n, size_t p, double *A, double *B);
The header file only declares the function signature (its name, return type and parameters). The actual function definition (its body) lives in another source file, called matmul.c. Create it with the following contents:
#include "matmul.h"
double multiply(size_t i, size_t j, size_t k, size_t n, size_t p, double *A, double *B)
{
return A[i * p + k] * B[k * n + j];
}
Building the matrix multiplication example
Now you have the three source files forming our matrix multiplication example: main.c, matmul.h and matmul.c. Try to build the program by compiling only the main.c file:
$ gcc main.c
/usr/bin/ld: /tmp/ccHb0MKL.o: in function `matmul':
main.c:(.text+0x74): undefined reference to `multiply'
collect2: error: ld returned 1 exit status
The build fails because the linker could not find the definition of the multiply function called from matmul. Try again, adding the missing source file:
$ gcc main.c matmul.c
$ ./a.out
checksum = 2535710866
Now the program has been successfully built. Since we haven't specified an output binary name, the binary is called a.out.
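To see the compiler/linker split described earlier in action, you can also compile each file separately into object files and link them afterwards; only the final step resolves multiply:

$ gcc -c main.c        # compiles fine: the declaration in matmul.h suffices
$ gcc -c matmul.c
$ gcc main.o matmul.o  # the linker resolves the call to 'multiply' here
$ ./a.out
checksum = 2535710866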
Using Parallelware Analyzer to search for parallelization opportunities
Parallelware Analyzer consists of several command-line tools covering different parts of the parallelization workflow. Today, we will use pwloops, which provides information about the parallelization opportunities found in the code. However, it is recommended to always start by invoking the pwreport tool, which provides a high-level overview of your code as well as hints on what to do next. Give it a try by analyzing the main.c file:
$ pwreport main.c
CODE COVERAGE
Analyzable files: 1 / 1 (100.00 %)
Analyzable functions: 0 / 7 ( 0.00 %)
Analyzable loops: 1 / 5 ( 20.00 %)
Parallelized SLOCs: 0 / 0
METRICS SUMMARY
Total defects: 0
Total recommendations: 1
Total opportunities: 1
Total data races: 0
Total data-race-free: 0
SUGGESTIONS
4 loops could not be analyzed, get more information with pwloops:
pwloops --non-analyzable main.c
1 recommendation was found in your code, get more information with pwcheck:
pwcheck --only-recommendations main.c
1 opportunity for parallelization was found in your code, get more information with pwloops:
pwloops main.c
1 file successfully analyzed and 0 failures in 22 ms
Note that although the main.c file was successfully analyzed, not much information was reported. Parallelware Analyzer knows that there are 7 functions and 5 loops in the code, but almost none of them could be completely analyzed. Indeed, if you invoke pwloops, you will see that only one loop is analyzable, as shown below (notice the second column, which reports whether a loop is analyzable):
$ pwloops main.c
Loop Analyzable Compute patterns Opportunity Auto-Parallelizable Parallelized
-------------------- ---------- ---------------- ----------- ------------------- ------------
main.c
|- matmul:7:4
| `- matmul:8:8
| `- matmul:9:12
|- main:29:4
`- main:39:4 x scalar simd, multi x
…
Take another look at the previous pwreport output and try the suggestion it provided:
$ pwloops --non-analyzable main.c
…
main.c:9:12
9: for (size_t k = 0; k < p; k++) {
10: C[i * n + j] += multiply(i, j, k, n, p, A, B);
11: }
ISSUES
10:32: [ ERROR ] Potential pointer aliasing for variable 'A'
10:32: [ ERROR ] Potential pointer aliasing for variable 'B'
10:32: [ ERROR ] Call to an undefined function 'multiply' can not be safely deemed as parallelizable
…
As you can see, pwloops reports the issues preventing the analysis from succeeding. All three nested loops within the matmul function (lines 7, 8 and 9) share the same issue: the multiply function called from them is undefined and thus Parallelware Analyzer cannot completely analyze the loops. In order for a loop to be classified as a parallelization opportunity, Parallelware Analyzer must be sure that parallelizing it is safe. Without access to all the involved source code files, this is not possible.
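The pointer aliasing errors are a separate matter: in C, the C99 restrict qualifier is the standard way for the programmer to promise that pointer arguments never overlap. As a hedged sketch (whether and how a given analysis tool honors it is tool-specific), the matmul signature could be declared as:

/* Hypothetical variant: 'restrict' asserts that A, B and C never alias,
   removing one source of uncertainty for any static analyzer */
void matmul(size_t m, size_t n, size_t p,
            double *restrict A, double *restrict B, double *restrict C);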
Specifying source file dependencies through a configuration file
Parallelware Analyzer supports specifying source file dependencies using a configuration file. Let's create a very simple one stating that when analyzing main.c, Parallelware Analyzer should also analyze matmul.c in order to have access to the multiply function definition. Create a pw.json file with the following contents:
{
    "version": 2,
    "analyses": [
        {
            "match": "main.c",
            "dependencies": [
                "matmul.c"
            ]
        }
    ]
}
You can specify the file dependencies of all the files of a project by adding a similar entry for each one of them. Alternatively, you can take advantage of the gitignore pattern-matching syntax supported in the Parallelware Analyzer configuration file. The example below uses a wildcard in the match field so that main.c and matmul.c are always analyzed together:
{
    "version": 2,
    "analyses": [
        {
            "match": "*",
            "dependencies": [
                "main.c",
                "matmul.c"
            ]
        }
    ]
}
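Assuming the usual gitignore semantics for the match field, a larger project could scope a rule to a whole source tree; the src/ paths below are hypothetical and only illustrate the pattern syntax:

{
    "version": 2,
    "analyses": [
        {
            "match": "src/*.c",
            "dependencies": [
                "src/matmul.c"
            ]
        }
    ]
}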
Choose either of the first two configuration files and pass it to the Parallelware Analyzer command-line tools using the --config parameter. Invoke pwreport again, this time with the configuration file:
$ pwreport main.c --config pw.json
1 file successfully analyzed and 0 failures in 44 ms
CODE COVERAGE
Analyzable files: 1 / 1 (100 %)
Analyzable functions: 2 / 15 (13.33 %)
Analyzable loops: 3 / 5 (60 %)
Parallelized SLOCs: 0 / 107 (0 %)
SUMMARY
Total defects: 0
Total recommendations: 2
Total opportunities: 1
Total data races: 0
Total data-race-free: 0
SUGGESTIONS
2 loops could not be analyzed, get more information with pwloops:
pwloops --non-analyzable main.c --config pw.json
2 recommendations were found in your code, get more information with pwcheck:
pwcheck --only-recommendations main.c --config pw.json
1 opportunity for parallelization was found in your code, get more information with pwloops:
pwloops main.c --config pw.json
Now that it has access to the multiply function definition, pwreport is able to report useful information, most notably 1 parallelization opportunity. Next, copy the suggestion to invoke pwloops to get detailed information about those opportunities for parallelization:
$ pwloops main.c --config pw.json
Loop Analyzable Compute patterns Opportunity Auto-Parallelizable ..
------------------ ---------- ---------------- ----------- ------------------- ..
main.c:matmul:7:4 x sparse multi x
main.c:matmul:8:8 x forall
main.c:matmul:9:12 x n/a
main.c:main:29:4
main.c:main:39:4
Loop : loop name following the syntax <file>:<function>:<line>:<column>
Analyzable : all C/C++/Fortran language features present in the loop are supported by Parallelware
Compute patterns : compute patterns found in the loop ('forall', 'scalar' or 'sparse' reduction, 'recurrence')
Opportunity : whether the loop is a parallelization opportunity and for which paradigms ('multi' for multi-threading or 'simd' for vectorization)
Auto-Parallelizable : loop can be parallelized by Parallelware
Parallelized : loop is already parallelized, for instance with OpenMP or OpenACC directives
SUGGESTIONS
Get more details about the data scoping of each variable within a loop, e.g.:
pwloops --datascoping --loop main.c:main:39:4 main.c --config pw.json
Print the code annotated with opportunities, e.g.:
pwloops --code --function main.c:main main.c --config pw.json
Parallelize an auto-parallelizable loop, e.g.:
pwdirectives main.c:matmul:7:4 -o <output_file>
Find out what prevented the analysis of a loop, e.g.:
pwloops --non-analyzable --loop main.c:main:29:4 main.c --config pw.json
1 file successfully analyzed and 0 failures in 41 ms
Now you get a summary of the compute pattern found in each loop and the kinds of parallelization opportunities it qualifies for. You have also received suggestions on which analyses might interest you as next steps. Feel free to give them a try and keep exploring Parallelware Analyzer!
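For reference, the 'multi' opportunity reported on main.c:matmul:7:4 corresponds to multithreading the outermost loop. The sketch below shows one way that could look with OpenMP; it illustrates the technique and is not necessarily the exact code that pwdirectives generates:

// Each thread works on distinct rows of C, so no two threads ever
// write to the same element (illustrative sketch only)
#pragma omp parallel for
for (size_t i = 0; i < m; i++) {
    for (size_t j = 0; j < n; j++) {
        for (size_t k = 0; k < p; k++) {
            C[i * n + j] += multiply(i, j, k, n, p, A, B);
        }
    }
}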
Summary
The Parallelware static code analysis technology exploited by Parallelware Analyzer needs access to all the relevant parts of the code involved in a computation in order to be able to report useful information about it. In many cases, a function is defined in a file and called from a different one, thus requiring interprocedural analysis across files: Parallelware Analyzer must analyze both source code files together to see the whole picture of what is going on. This is an intrinsic problem of all static code analysis tools. In order to overcome it, this blog post has shown how Parallelware Analyzer can be easily set up to successfully analyze source code file dependencies through a very simple JSON configuration file. For more information, you can find the configuration file reference in the docs subfolder of your Parallelware Analyzer installation.
More information
- Parallelware Analyzer NPB Quickstart
- Fixing defects in parallel code: an OpenMP example
- Defects and recommendations for parallelism
- Checks reference index
- Parallel Patterns reference
- Parallelware Analyzer product page
Join Parallelware Analyzer Early Access
Enter the program to have access to all versions of Parallelware Analyzer until the official release.