Cause:
A summation or other reduction can be vectorized by keeping a separate partial sum for each vector lane and adding the partial sums together at the end. This changes the order in which the individual contributions are added, and hence the effect of rounding to floating-point precision, so the final result may differ slightly from that of a scalar floating-point reduction, though it is not necessarily any less accurate. For some purposes, identical floating-point results are needed, independent of optimization level. The switches -fp-model precise and -fp-model source (equivalent for Fortran) achieve this by disabling those optimizations that might lead to slight variations in floating-point rounding. (See http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/ for more detail.) Vectorization of floating-point reductions is one such optimization: it is disabled under switches such as -fp-model precise, -fp-model source or -fp-model strict, and the compiler reports "modifying order of operation not allowed under given switches".
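The reordering described above can be imitated by hand. The following is a minimal sketch, not what the compiler actually generates: the program name, the array length n and the lane count of 4 are assumptions chosen for illustration. It adds the same data once in index order and once with one partial sum per lane, then prints both results. To see the scalar ordering preserved in the first loop, build it with a value-safe setting such as -fp-model source; otherwise the compiler may itself reorder that loop.

```fortran
program lane_sum_demo
   implicit none
   integer, parameter :: n = 1000000   ! array length (arbitrary, for illustration)
   integer, parameter :: lanes = 4     ! assumed vector width (hypothetical)
   real    :: x(n), partial(lanes), s_scalar, s_lanes
   integer :: i, j, nvec

   call random_number(x)

   ! Scalar reduction: contributions are added strictly in index order.
   s_scalar = 0.
   do i = 1, n
      s_scalar = s_scalar + x(i)
   enddo

   ! Hand-written model of a vectorized reduction:
   ! lane j accumulates x(j), x(j+lanes), x(j+2*lanes), ...
   nvec = n - mod(n, lanes)            ! largest multiple of lanes not exceeding n
   partial = 0.
   do i = 1, nvec, lanes
      do j = 1, lanes
         partial(j) = partial(j) + x(i + j - 1)
      enddo
   enddo

   ! Add the per-lane partial sums together at the end.
   s_lanes = 0.
   do j = 1, lanes
      s_lanes = s_lanes + partial(j)
   enddo

   ! Remainder elements that did not fill a whole vector.
   do i = nvec + 1, n
      s_lanes = s_lanes + x(i)
   enddo

   print *, 'index order:     ', s_scalar
   print *, 'lane-wise order: ', s_lanes
   print *, 'difference:      ', s_scalar - s_lanes
end program lane_sum_demo
```

Any difference printed comes only from the changed order of the additions, since both loops add exactly the same values. Neither result is the "correct" one; each carries its own accumulated rounding error.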
Example:
real function sum(x,n)
   implicit none
   real, intent(in), dimension(n) :: x
   integer, intent(in )           :: n
   integer                        :: i

   sum=0.
   do i=1,n
      sum = sum + x(i)
   enddo
end function sum
Resolution:
real function sum(x,n)
   implicit none
   real, intent(in), dimension(n) :: x
   integer, intent(in )           :: n
   integer                        :: i

   sum=0.
   !dir$ simd reduction(+:sum)   ! or !$omp simd reduction(+:sum)
   do i=1,n
      sum = sum + x(i)
   enddo
end function sum
> ifort -c -vec-report2 -fp-model source d_15033_2.f90
d_15033_2.f90(9): (col. 4) remark: SIMD LOOP WAS VECTORIZED