Diagnostic 15134: vectorization support: reference xxxx has unaligned access (Fortran)

Cause:

The vectorizer cannot safely use aligned loads or stores for this data access, either because the data are not aligned to an n-byte boundary in memory, or because the compiler does not know the alignment. The compiler must use unaligned memory accesses, which may be less efficient. The value of n depends on the targeted instruction set and corresponds to the width of the vector instructions: 16 for Intel® SSE, 32 for Intel® AVX and 64 for Intel® AVX-512 instructions.
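Whether a particular array happens to start on an n-byte boundary can be checked at run-time. The following is a minimal sketch, not part of the original example: the program name, the array a and its size are hypothetical, and it assumes a compiler that supports the Fortran 2003 iso_c_binding module. It reinterprets the address of the first array element as an integer and tests it against each of the three vector widths.

```fortran
program check_alignment
  use, intrinsic :: iso_c_binding, only: c_float, c_intptr_t, c_loc
  implicit none
  real(c_float), target :: a(64)                     ! hypothetical test array
  integer(c_intptr_t)   :: addr
  integer               :: k
  integer, parameter    :: widths(3) = [16, 32, 64]  ! SSE, AVX, AVX-512

  a = 0.0_c_float

  ! Reinterpret the C address of the first element as an integer.
  addr = transfer(c_loc(a), addr)

  ! An address is aligned to n bytes when it is an exact multiple of n.
  do k = 1, size(widths)
    print '(a,i0,a,l1)', 'aligned to ', widths(k), ' bytes: ', &
          mod(addr, int(widths(k), c_intptr_t)) == 0
  end do
end program check_alignment
```

The result depends on the compiler, its options and the platform. It only describes this one array in this one run, which is why the compiler cannot rely on it without an explicit assertion.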

Example:

subroutine d_15134(x,y,z,index,m1,m2)
  implicit none
  real, dimension(m1,m2), intent(in ) :: x,y
  real, dimension(m1,m2), intent(out) :: z
  integer, dimension(m2), intent(in ) :: index
  integer,                intent(in ) :: m1, m2
  integer                             :: i, j

!!dir$ assume_aligned x:32, y:32, z:32 
!!dir$ assume (mod(m1,8).eq.0)

  do j=1,m2
    do i=1,m1-1
      z(i,j) = x(i,index(j)) + x(i,j)*y(i+1,j)
    enddo
  enddo
   
end subroutine d_15134

> ifort -c -xavx -vec-report6 d_15134.f90

d_15134.f90(14): (col. 7) remark: vectorization support: reference z has unaligned access

d_15134.f90(14): (col. 7) remark: vectorization support: reference x has unaligned access

d_15134.f90(14): (col. 7) remark: vectorization support: reference y has unaligned access

d_15134.f90(14): (col. 7) remark: vectorization support: unaligned access used inside loop body

d_15134.f90(13): (col. 5) remark: vectorization support: unroll factor set to 2

d_15134.f90(13): (col. 5) remark: LOOP WAS VECTORIZED

d_15134.f90(12): (col. 3) remark: loop was not vectorized: not inner loop

With the Intel Compiler version 14.0, you must compile with -vec-report6 to get the alignment diagnostics. Without the aid of directives, the compiler knows neither the alignment of the arrays x, y and z nor the extent of their leading dimension, so it assumes that any memory access could be unaligned.

There are three main issues:

1. The compiler does not know the absolute alignment, for example of z in the above code sample. It can often correct for this at run-time by peeling off some loop iterations, so that the loop kernel starts at a point where accesses to the first array are aligned.
2. The compiler does not know the alignment of other arrays relative to the first, for example of x relative to z. If the compiler peels to align accesses to z, that may not align accesses to x. This can sometimes be worked around at run-time by generating two versions of the loop kernel, one where x and z have the same alignment, and one where they do not. But for larger numbers of arrays, the compiler cannot generate kernel versions corresponding to all the possible combinations of alignment.
3. Even if array accesses in the inner loop are aligned for the first iteration of the outer loop over j, they will not be aligned for subsequent values of j (i.e., other columns of the matrix will not be aligned), unless the size in bytes of the first dimension (the column length) of x, y and z is a multiple of the vector width n. For single precision, this means that m1 needs to be a multiple of 4 for Intel SSE, 8 for Intel AVX or 16 for Intel AVX-512 instructions.
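The third issue comes down to simple arithmetic on column offsets. The following sketch is only an illustration: the program and its report routine are hypothetical, and it assumes 4-byte single-precision elements, a 32-byte vector width (Intel AVX) and a first column that is itself aligned. It prints the byte offset of the start of each column for two values of m1.

```fortran
program column_alignment
  implicit none
  integer, parameter :: n = 32           ! vector width in bytes (Intel AVX)
  integer, parameter :: elem_bytes = 4   ! assumed size of a default real

  call report(8)    ! 8 elements * 4 bytes = 32, a multiple of n
  call report(10)   ! 10 elements * 4 bytes = 40, not a multiple of n

contains

  subroutine report(m1)
    integer, intent(in) :: m1
    integer :: j, offset

    print '(a,i0)', 'm1 = ', m1
    do j = 1, 4
      ! Byte offset of element (1,j) from element (1,1) in column-major order.
      offset = (j - 1) * m1 * elem_bytes
      print '(a,i0,a,i0,a,l1)', '  column ', j, ': byte offset ', offset, &
            ', aligned: ', mod(offset, n) == 0
    end do
  end subroutine report

end program column_alignment
```

With m1 = 8, every column starts at a multiple of 32 bytes. With m1 = 10, the offsets are 0, 40, 80 and 120 bytes, so only the first column is aligned.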

Resolution:

If the arrays x, y and z are aligned in the routine in which they are first declared, e.g. by using a switch such as -align array32byte or an ATTRIBUTES ALIGN directive, then other directives can be used to assert that alignment to the compiler in the routines where the arrays are used. In the above example, the first directive asserts that the arrays x, y and z are always aligned on (at least) 32-byte boundaries in memory. The second directive asserts that m1, the extent of the first array dimension, is a multiple of 8. If these directives are uncommented, the inner loop is vectorized using mostly aligned memory accesses:
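The two halves of this approach can be sketched together as follows. This is an illustration under stated assumptions, not the article's own test program. The calling program caller_15134, the array sizes and the name idx are hypothetical. The subroutine is a copy of the example with both directives uncommented, the unused dummy argument mm left out, and the declarations of m1 and m2 moved ahead of their use. The !dir$ lines are Intel-specific; other compilers treat them as ordinary comments.

```fortran
program caller_15134
  implicit none
  integer, parameter :: m1 = 16, m2 = 4   ! m1 is a multiple of 8
  real    :: x(m1,m2), y(m1,m2), z(m1,m2)
  integer :: idx(m2), i, j
  ! Request 32-byte alignment where the arrays are first declared.
  !dir$ attributes align : 32 :: x, y, z

  x = 1.0
  y = 2.0
  z = 0.0
  idx = [(m2 + 1 - j, j = 1, m2)]         ! any valid column indices will do

  call d_15134(x, y, z, idx, m1, m2)

  ! Each computed element is 1.0 + 1.0*2.0; row m1 is not written.
  do j = 1, m2
    do i = 1, m1 - 1
      if (z(i,j) /= 3.0) error stop 'unexpected result'
    end do
  end do
  print *, 'ok'
end program caller_15134

subroutine d_15134(x,y,z,index,m1,m2)
  implicit none
  integer,                intent(in ) :: m1, m2
  real, dimension(m1,m2), intent(in ) :: x,y
  real, dimension(m1,m2), intent(out) :: z
  integer, dimension(m2), intent(in ) :: index
  integer                             :: i, j

  ! Assert what the caller guarantees: 32-byte alignment, m1 a multiple of 8.
!dir$ assume_aligned x:32, y:32, z:32
!dir$ assume (mod(m1,8).eq.0)

  do j=1,m2
    do i=1,m1-1
      z(i,j) = x(i,index(j)) + x(i,j)*y(i+1,j)
    enddo
  enddo

end subroutine d_15134
```

The assertions in the subroutine are only valid because every caller declares the arrays with the matching alignment and passes an m1 that is a multiple of 8. Because the line numbers differ from the original listing, the positions in any vectorization report for this sketch will not match those shown in the article.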

> ifort -c -xavx -vec-report6 d_15134.f90

d_15134.f90(14): (col. 7) remark: vectorization support: reference z has aligned access

d_15134.f90(14): (col. 7) remark: vectorization support: reference x has aligned access

d_15134.f90(14): (col. 7) remark: vectorization support: reference y has unaligned access

d_15134.f90(14): (col. 7) remark: vectorization support: unaligned access used inside loop body

d_15134.f90(13): (col. 5) remark: vectorization support: unroll factor set to 2

d_15134.f90(13): (col. 5) remark: LOOP WAS VECTORIZED

d_15134.f90(12): (col. 3) remark: loop was not vectorized: not inner loop

Because the accesses to array y are offset by one element compared to accesses to x and z, the access to y must remain unaligned when the accesses to x and z are aligned. Had that not been the case, the two directives above could have been replaced by a single directive, !DIR$ VECTOR ALIGNED, which asserts that all memory accesses in the loop are aligned (here, to a 32-byte boundary, since we are compiling with -xavx). Care must be taken when using such alignment directives: invalid assertions of alignment may lead to poor performance, incorrect results or a run-time error, depending on the context. However, careful alignment of data, together with ensuring that the compiler knows the alignment, can lead to improved performance.
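For a loop in which no access is offset, the single-directive form might look like the sketch below. It is a hypothetical variant of the example, not code from the original article: the routine scale_add and the calling program are invented for illustration, y is accessed at (i,j) rather than (i+1,j), and the indirect access through index is left out. The arrays are aligned to 64 bytes and m1 is 16, so that the assertion holds whichever of the three instruction sets is targeted. The !dir$ lines are Intel-specific and are treated as comments by other compilers.

```fortran
program vector_aligned_demo
  implicit none
  integer, parameter :: m1 = 16, m2 = 4   ! 16 elements * 4 bytes = 64 bytes
  real :: x(m1,m2), y(m1,m2), z(m1,m2)
  ! 64-byte alignment covers Intel SSE, AVX and AVX-512 vector widths.
  !dir$ attributes align : 64 :: x, y, z

  x = 1.0
  y = 2.0
  call scale_add(x, y, z, m1, m2)

  ! Every element is 1.0 + 1.0*2.0.
  if (any(z /= 3.0)) error stop 'unexpected result'
  print *, 'ok'
end program vector_aligned_demo

subroutine scale_add(x, y, z, m1, m2)
  implicit none
  integer,                intent(in ) :: m1, m2
  real, dimension(m1,m2), intent(in ) :: x, y
  real, dimension(m1,m2), intent(out) :: z
  integer                             :: i, j

  do j = 1, m2
    ! Assert that every access in the following loop is aligned.
    ! This is only valid if all callers pass suitably aligned arrays
    ! and a column length that is a multiple of the vector width.
    !dir$ vector aligned
    do i = 1, m1
      z(i,j) = x(i,j) + x(i,j)*y(i,j)
    end do
  end do
end subroutine scale_add
```

The directive is placed immediately before the inner loop, which is the loop being vectorized. If a caller ever passed an unaligned array or a column length that broke the assumption, the assertion would be invalid, with the consequences described above.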
