Cause:
A function call inside the loop is preventing auto-vectorization.
Example:
Program foo
    implicit none
    integer, parameter :: nx = 100000000
    real(8) :: x, xp, sumx
    integer :: i
    interface
       real(8) function bar(x, xp)
          real(8), intent(in) :: x, xp
       end
    end interface

    sumx = 0.
    xp = 1.

    do i = 1,nx
       x = 1.D-8*real(i,8)
       sumx = sumx + bar(x,xp)
    enddo
    print *, 'Sum =',sumx
end

real(8) function bar(x, xp)
    implicit none
    real(8), intent(in) :: x, xp
    bar = 1. - 2.*(x-xp) + 3.*(x-xp)**2 - 1.5*(x-xp)**3 + 0.2*(x-xp)**4
    bar = bar / sqrt(x**2 + xp**2)
end
LOOP BEGIN at foo.f90(18,5)
remark #15543: loop was not vectorized: loop with function call not considered an optimization candidate. [ foo.f90(17,22) ]
LOOP END
Resolution:
real(8) function bar(x, xp)
!$OMP DECLARE SIMD (bar) UNIFORM(xp)
    implicit none
    real(8), intent(in) :: x, xp
    bar = 1. - 2.*(x-xp) + 3.*(x-xp)**2 - 1.5*(x-xp)**3 + 0.2*(x-xp)**4
    bar = bar / sqrt(x**2 + xp**2)
end
...
remark #15301: FUNCTION WAS VECTORIZED [ bar.f90(1,18) ]
remark #15344: loop was not vectorized: vector dependence prevents vectorization. First dependence is shown below. Use level 5 report for details
remark #15346: vector dependence: assumed OUTPUT dependence between line 17 and line 18
LOOP END
Program foo
    implicit none
    integer, parameter :: nx = 100000000
    real(8) :: x, xp, sumx
    integer :: i
    interface
       real(8) function bar(x, xp)
!$OMP DECLARE SIMD (bar) UNIFORM(xp)
          real(8), intent(in) :: x, xp
       end
    end interface

    sumx = 0.
    xp = 1.

!$OMP SIMD private(x) reduction(+:sumx)
    do i = 1,nx
       x = 1.D-8*real(i,8)
       sumx = sumx + bar(x,xp)
    enddo
    print *, 'Sum =',sumx
end
...
LOOP BEGIN at foo.f90(17,5)
remark #15301: OpenMP SIMD LOOP WAS VECTORIZED
LOOP END
The loop is now vectorized successfully; running and timing the program shows a speedup.
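In the version above, the DECLARE SIMD directive has to appear twice: once in bar itself and once in the interface block in foo. As a variation, the sketch below puts bar in a module, so that the caller gets the explicit interface through a use statement and no interface block is needed. It assumes that the compiler carries the DECLARE SIMD information through the module file; check the vectorization report to confirm this for your compiler. The names bar_mod and foo_mod are illustrative and are not part of the original example, and nx and the step size have been changed so that the program runs quickly.

```fortran
! Sketch only: bar_mod and foo_mod are hypothetical names.
module bar_mod
    implicit none
contains
    real(8) function bar(x, xp)
        ! Request a SIMD version of bar; xp is the same for every SIMD lane.
        !$OMP DECLARE SIMD (bar) UNIFORM(xp)
        real(8), intent(in) :: x, xp
        bar = 1. - 2.*(x-xp) + 3.*(x-xp)**2 - 1.5*(x-xp)**3 + 0.2*(x-xp)**4
        bar = bar / sqrt(x**2 + xp**2)
    end function bar
end module bar_mod

program foo_mod
    ! The explicit interface of bar comes from the module,
    ! so no interface block is written here.
    use bar_mod, only: bar
    implicit none
    integer, parameter :: nx = 1000000   ! reduced from the original for a quick run
    real(8) :: x, xp, sumx
    integer :: i

    sumx = 0.
    xp = 1.

    ! x is a per-iteration temporary and sumx is a sum reduction.
    !$OMP SIMD private(x) reduction(+:sumx)
    do i = 1, nx
        x = 1.D-6*real(i,8)              ! step adjusted to match the reduced nx
        sumx = sumx + bar(x,xp)
    enddo
    print *, 'Sum =', sumx
end program foo_mod
```

If OpenMP SIMD support is not enabled at compile time, the directives are treated as comments and the program still runs correctly in scalar form.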
Note that if the DECLARE SIMD directive is omitted, the !$OMP SIMD directive will still cause the remaining parts of the loop in foo to be vectorized, but the call to bar() will be serialized, so any performance gain is likely to be small. In either case, the private and reduction clauses of this directive are mandatory; without them, the compiler will assume no loop-carried dependencies and results may be incorrect.
remark #15300: LOOP WAS VECTORIZED
LOOP END
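The role of the private and reduction clauses described in the note above can be illustrated with a small stand-alone program. This is a minimal sketch, not part of the original example: the program name demo_simd_clauses, the variables ssum and sref, and the loop body x*x are all chosen for illustration. It computes the same sum with a plain loop and with a loop under !$OMP SIMD, then compares the two.

```fortran
! Sketch only: all names here are illustrative.
program demo_simd_clauses
    implicit none
    integer, parameter :: n = 1000
    real(8) :: x, ssum, sref
    integer :: i

    ! Reference: plain loop, no directive.
    sref = 0.d0
    do i = 1, n
        x = 1.d-3*real(i,8)
        sref = sref + x*x
    enddo

    ! Same loop under !$OMP SIMD.
    ! private(x):        each SIMD lane works on its own copy of the temporary x.
    ! reduction(+:ssum): each lane accumulates a partial sum; the partial sums
    !                    are combined into ssum when the loop ends.
    ssum = 0.d0
    !$OMP SIMD private(x) reduction(+:ssum)
    do i = 1, n
        x = 1.d-3*real(i,8)
        ssum = ssum + x*x
    enddo

    ! A vectorized reduction may add the terms in a different order,
    ! so the comparison uses a relative tolerance rather than equality.
    print *, 'Serial sum =', sref
    print *, 'SIMD sum   =', ssum
    if (abs(ssum - sref) > 1.d-9*abs(sref)) then
        print *, 'MISMATCH'
    else
        print *, 'OK'
    endif
end program demo_simd_clauses
```

Because the two sums can differ in the last few bits, an exact equality test between a vectorized and a scalar reduction is not a reliable check.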