This is the eighth video of our OpenACC programming course, based on Appentra’s unique, productivity-oriented approach to learning best practices for parallel programming with OpenACC. Learning outcomes include how to decompose codes into parallel patterns and a practical, step-by-step, pattern-based process for parallelizing any code. In this video we will learn about the first pattern, a common code component in scientific software: the forall.
Course Index
- Course overview
- What is OpenACC?
- How OpenACC works.
- Building and running an OpenACC code.
- The OpenACC parallelization process.
- Using your first OpenACC directives.
- Using patterns to parallelize.
- Identifying a forall pattern.
- Implementing a parallel forall.
Video transcript
Identifying a forall pattern
We will now learn about the first pattern that is a common code component in scientific software: the forall.
Identification of code patterns
The forall is one of four code patterns that you will learn about in this course. For each pattern, you will learn how to identify it from the control flow and from the access patterns of the key variables that help define it.
Forall
A forall is the simplest type of pattern and is intrinsically parallel: the operations can be split amongst processing elements and executed in parallel without changing the accuracy of the result.
Here we see a simplified forall pattern, very similar to what we saw in the DAXPY code example.
for (j = 0; j < n; j++) {
    A[j] = B[j];
}
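To make "split amongst processing elements" concrete, here is a minimal sketch in C. It assumes the iterations are divided into contiguous chunks, one per worker, and it only simulates the workers sequentially. The names `copy_chunk` and `copy_in_chunks` are hypothetical and not part of the course material.

```c
#include <stddef.h>

/* One worker's share of the forall: copy elements [begin, end) from B to A. */
static void copy_chunk(double *A, const double *B, size_t begin, size_t end) {
    for (size_t j = begin; j < end; j++) {
        A[j] = B[j];
    }
}

/* Divide the n iterations into `workers` contiguous chunks. The chunks are
   visited last-to-first to emphasize that their order does not matter:
   no chunk reads or writes an element that belongs to another chunk. */
void copy_in_chunks(double *A, const double *B, size_t n, size_t workers) {
    if (workers == 0) {
        workers = 1;  /* assumption: treat 0 workers as a single worker */
    }
    size_t chunk = (n + workers - 1) / workers;  /* ceiling division */
    for (size_t w = workers; w-- > 0; ) {
        size_t begin = w * chunk;
        size_t end = begin + chunk;
        if (begin > n) begin = n;  /* clamp chunks past the end of the array */
        if (end > n) end = n;
        copy_chunk(A, B, begin, end);
    }
}
```

Because every chunk touches a disjoint range of `A`, the chunks could in principle run at the same time on different processing elements and still give the same array as the original loop.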
The key code components to identify are:
- Look for a loop where array elements are updated. The loop may update all or part of the array, but the array accesses must be regular, whether the index increments one at a time or by a larger step.
- Each iteration updates a different element of the array.
- Finally, once the loop iterations are finished, the result is held in array A, also called the ‘output variable’ of the computation.
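The criteria above can be illustrated with a short sketch in C that contrasts a loop that fits the forall pattern with one that does not. Both functions and their names (`scale_strided`, `running_sum`) are illustrative assumptions rather than code from the course.

```c
#include <stddef.h>

/* Fits the forall pattern: the access is regular (a fixed step), and
   iteration j writes only A[j] and reads only B[j]. A is the output variable. */
void scale_strided(double *A, const double *B, size_t n, size_t step) {
    if (step == 0) {
        return;  /* guard against a loop that would never advance */
    }
    for (size_t j = 0; j < n; j += step) {
        A[j] = 2.0 * B[j];
    }
}

/* Does NOT fit the forall pattern: iteration j reads A[j - 1], which was
   written by the previous iteration, so the iterations are not independent. */
void running_sum(double *A, const double *B, size_t n) {
    if (n == 0) {
        return;
    }
    A[0] = B[0];
    for (size_t j = 1; j < n; j++) {
        A[j] = A[j - 1] + B[j];
    }
}
```

In `scale_strided` the iterations can be executed in any order, which is what makes the loop a forall. In `running_sum` each element depends on the one before it, so the loop needs a different treatment even though it also updates one array element per iteration.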
To see an example of this in a real code, let's return to the DAXPY code example as it was before we parallelized it in the previous lecture (in Parallelware Trainer). Here we can see the routine we parallelized previously. It contains the key features of a forall pattern: a regular for loop in which each iteration is independent of all other iterations, because the calculation on line 11 does not depend on any previous or future loop iteration.
for (int i = 0; i < n; i++) {
    D[i] = a * X[i] + Y[i];
}
Once the pattern is identified, it can be parallelized and accelerated with OpenACC using the parallel loop solution. This will be discussed in the next video.
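As a preview, the following is a minimal sketch of what the parallel loop solution might look like for the DAXPY loop. The function wrapper `daxpy`, its argument order, and the data clauses are assumptions made for illustration. The course's own implementation may differ.

```c
/* DAXPY as a forall: D[i] = a * X[i] + Y[i] for every i in [0, n). */
void daxpy(int n, double a, const double *X, const double *Y, double *D) {
    /* Assumed data clauses: X and Y are only read on the device,
       D is only written and copied back to the host afterwards. */
    #pragma acc parallel loop copyin(X[0:n], Y[0:n]) copyout(D[0:n])
    for (int i = 0; i < n; i++) {
        D[i] = a * X[i] + Y[i];
    }
}
```

When the code is built with OpenACC enabled (for example `nvc -acc` or `gcc -fopenacc`), the directive asks the compiler to distribute the loop iterations across the accelerator. Without such a flag the directive is ignored and the loop runs serially, producing the same result.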