Channel: Intel Developer Zone Articles

Asynchronous Offload - Fortran Code Examples


This document provides information about asynchronous data transfer, asynchronous computation and memory management without data transfer. This document includes code examples of common usage scenarios. The examples in this article are in Fortran only.

Introduction

Two Fortran directives are used for data transfer and for waiting on completion.
The directive for data transfer only, with an optional signal clause that makes the transfer asynchronous, is:

!dir$ offload_transfer <clauses> [ signal(<tag>) ]

The directive to wait for completion of asynchronous activity is:

!dir$ offload_wait <clauses> wait(<tag>)

The offload directive also takes optional signal and wait clauses:

!dir$ offload <clauses> [ signal(<tag>) ] [ wait(<tag>) ]
<statement>

The offload_transfer and offload_wait directives are stand-alone and do not apply to the subsequent code block.

Data Transfer

The offload_transfer directive is a stand-alone directive, meaning that it does not apply to a following statement. It takes a target clause and either all in clauses or all out clauses. Without a signal clause, offload_transfer initiates and completes a synchronous data transfer. With a signal clause, it only initiates the data transfer and returns immediately. The offload_transfer directive can also take a wait clause. A later directive with a wait clause is used to wait for completion of the transfer.
Expressions in the signal and wait clauses are address-sized values that serve as tags on the asynchronous operation.

! Example 1
! Synchronous data transfer CPU -> MIC
! Next statement executed after data transfer is completed
!dir$ offload_transfer target(mic:0) in(a,b,c)

! Example 2
! Initiate asynchronous data transfer CPU -> MIC
!dir$ offload_transfer target(mic:0) in(a,b,c) signal(s)

The offload_wait directive is also a stand-alone directive that does not require a following statement. It takes a target clause and a wait clause. Execution continues past the directive only after the asynchronous activity associated with the tag has completed.

! Example 3
! Wait for the activity signaled by s to be completed. Variable s is the tag.
!dir$ offload_wait target(mic:0) wait(s)

Memory Management

The offload_transfer directive can be used for memory allocation and deallocation on the coprocessor without any data transfer by using the nocopy clause. This is typically done outside of a loop to amortize the cost of allocation.

! Example 4
! Shorthand macros for the modifiers (the source must be run through the Fortran preprocessor)
#define ALLOC alloc_if(.TRUE.) free_if(.FALSE.)
#define FREE alloc_if(.FALSE.) free_if(.TRUE.)
#define REUSE alloc_if(.FALSE.) free_if(.FALSE.)
! Allocate memory on the coprocessor  (without also transferring data)
!dir$ offload_transfer target(mic:0) nocopy(p,q: ALLOC)
do …
    ! Use of allocated memory on the coprocessor for offloads
    !dir$ offload target(mic:0) in(p: REUSE) out(q: REUSE)
    ! computation using p and q
enddo
…
! Free memory on the coprocessor (without also transferring data)
!dir$ offload_transfer target(mic:0) nocopy(p,q: FREE)
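
The allocate, reuse, and free pattern of Example 4 can be written out as a compilable unit. The following is a minimal sketch under stated assumptions, not a definitive implementation: the module name, the routine name run_steps, the array sizes, and the doubling computation are all hypothetical, the modifiers are spelled out instead of using the macros, and the block form offload begin / end offload is used in place of the single-statement form so that a plain array assignment can be offloaded. A compiler that does not recognize these directives treats them as ordinary comments, so the same source then runs entirely on the host.

```fortran
module persistent_buffers
    implicit none
contains
    ! Doubles a buffer on the coprocessor nsteps times and returns the
    ! sum of all results. Names and sizes are hypothetical.
    subroutine run_steps(n, nsteps, total)
        integer, intent(in) :: n, nsteps
        real, intent(out) :: total
        real, allocatable :: p(:), q(:)
        integer :: step

        allocate(p(n), q(n))
        total = 0.0

        ! Allocate p and q on the coprocessor once; no data is transferred
        !dir$ offload_transfer target(mic:0) nocopy(p, q: alloc_if(.true.) free_if(.false.))

        do step = 1, nsteps
            p = real(step)
            ! Reuse the existing allocations: copy p in, compute, copy q out
            !dir$ offload begin target(mic:0) in(p: alloc_if(.false.) free_if(.false.)) out(q: alloc_if(.false.) free_if(.false.))
            q = 2.0 * p
            !dir$ end offload
            total = total + sum(q)
        end do

        ! Free p and q on the coprocessor; no data is transferred
        !dir$ offload_transfer target(mic:0) nocopy(p, q: alloc_if(.false.) free_if(.true.))

        deallocate(p, q)
    end subroutine run_steps
end module persistent_buffers
```

A driver would call run_steps(n, nsteps, total) once. The allocation and deallocation on the coprocessor then happen once per call rather than once per loop iteration.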

Send Input Data Asynchronously

The most typical usage initiates the data transfer, executes some CPU activity, then starts the offload computation that will use the transferred data. The data is placed in the same variables listed in the transfer initiation. Those variables must be accessible by the time the offload directive begins execution.

! Example 5
! Initiate asynchronous data transfer CPU -> MIC
!dir$ offload_transfer target(mic:0) in(p,q,r) signal(s)
…
…
! Do the offload only after data has arrived
!dir$ offload target(mic:0) wait(s)
! offload computation
… = p
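
The sequence described in this section (start the transfer, do host work, then offload with a wait clause) might look as follows as a complete unit. This is a sketch under assumptions: the module and routine names, the array contents, and the host loop are hypothetical, and the offload uses the block form offload begin / end offload with nocopy so that the data already sent is not transferred a second time. The tag variable is given a value only as a precaution. Without offload support the directives are plain comments and everything runs on the host.

```fortran
module async_input
    implicit none
contains
    ! Sends p to the coprocessor asynchronously, overlaps host work,
    ! then sums p on the coprocessor. All names are hypothetical.
    subroutine sum_on_target(n, host_work, target_sum)
        integer, intent(in) :: n
        integer, intent(out) :: host_work
        real, intent(out) :: target_sum
        real, allocatable :: p(:)
        real :: acc
        integer :: s, k

        allocate(p(n))
        p = 1.5
        acc = 0.0
        s = 1   ! tag variable; the value is set only as a precaution

        ! Start copying p to the coprocessor and continue immediately;
        ! the coprocessor buffer is kept for the later offload
        !dir$ offload_transfer target(mic:0) in(p: alloc_if(.true.) free_if(.false.)) signal(s)

        ! Host work that overlaps with the transfer; it must not modify p
        host_work = 0
        do k = 1, n
            host_work = host_work + k
        end do

        ! Start the offload only after the transfer tagged s has completed;
        ! p is already on the coprocessor, so it is not copied again
        !dir$ offload begin target(mic:0) wait(s) nocopy(p: alloc_if(.false.) free_if(.true.)) out(acc)
        acc = sum(p)
        !dir$ end offload

        target_sum = acc
        deallocate(p)
    end subroutine sum_on_target
end module async_input
```

The host loop stands in for any work that does not touch p. The longer that work takes, the more of the transfer time is hidden.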

Receive Output Asynchronously

In asynchronous offload, an offload computation produces results that are transferred back to the host at a later time. The offload directive finishes the work but does not immediately copy the data back. Instead, an asynchronous offload_transfer initiates the copy. Later, when the results are needed, an offload_wait blocks until the transfer has completed and the data is available on the host.

! Example 6a
! Perform the offload computation but don't copy back results immediately
!dir$ offload target(mic:0) nocopy(p)
! offload computation
…
! Initiate asynchronous data transfer MIC -> CPU
!dir$ offload_transfer target(mic:0) out(p) signal(s)
…
…
! Wait for data to arrive
!dir$ offload_wait target(mic:0) wait(s)
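
Example 6a can be expanded into a compilable sketch along the following lines. The module and routine names, the input array x, and the squaring computation are hypothetical and only serve to make the result checkable. The block form offload begin / end offload is an assumption made for brevity, and the tag variable is assigned a value only as a precaution. Where the directives are not recognized they are comments, and the routine computes the same result on the host.

```fortran
module async_output
    implicit none
contains
    ! Computes squares on the coprocessor and fetches them back
    ! asynchronously. All names are hypothetical.
    subroutine squares_from_target(n, squares)
        integer, intent(in) :: n
        real, intent(out) :: squares(n)
        real, allocatable :: x(:), p(:)
        integer :: s, i

        allocate(x(n), p(n))
        x = [(real(i), i = 1, n)]
        s = 1   ! tag variable; the value is set only as a precaution

        ! Compute on the coprocessor; p is allocated there and kept,
        ! but nothing is copied back yet
        !dir$ offload begin target(mic:0) in(x) nocopy(p: alloc_if(.true.) free_if(.false.))
        p = x * x
        !dir$ end offload

        ! Start copying p back to the host and continue immediately;
        ! the coprocessor buffer is freed once the copy is done
        !dir$ offload_transfer target(mic:0) out(p: alloc_if(.false.) free_if(.true.)) signal(s)

        ! Unrelated host work can go here; it must not read p yet

        ! Block until the transfer tagged s has completed
        !dir$ offload_wait target(mic:0) wait(s)

        squares = p
        deallocate(x, p)
    end subroutine squares_from_target
end module async_output
```

The host copy of p is only safe to read after the offload_wait, which is why the assignment to squares comes last.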

Asynchronous Computation

The host initiates an offload to be performed asynchronously and can proceed to the next statement as soon as the computation has started. Later in the code, an offload_wait directive is used to wait for completion of the offload activity.

! Example 6b
integer :: signal_var
integer, allocatable, dimension(:) :: p
do …
    ! Initiate asynchronous computation
    !dir$ offload … in(p) signal(signal_var)
    call mic_compute()
    ! Host work that overlaps with the offload
    call concurrent_cpu_activity()
    ! Wait for the offload to complete
    !dir$ offload_wait … wait(signal_var)
enddo
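
A self-contained variant of Example 6b, written as a sketch under assumptions, is shown below. The names overlap, p, h, and acc are hypothetical, the two sums stand in for the coprocessor and host workloads, and the offload uses the block form offload begin / end offload rather than a call. The tag variable is assigned a value only as a precaution. If the directives are not recognized, they are comments and both sums are computed on the host one after the other.

```fortran
module async_compute
    implicit none
contains
    ! Runs one sum on the coprocessor while the host computes another.
    ! All names are hypothetical.
    subroutine overlap(n, target_sum, host_sum)
        integer, intent(in) :: n
        real, intent(out) :: target_sum, host_sum
        real, allocatable :: p(:), h(:)
        real :: acc
        integer :: sig

        allocate(p(n), h(n))
        p = 2.0
        h = 3.0
        acc = 0.0
        sig = 1   ! tag variable; the value is set only as a precaution

        ! Start the offload and return to the host immediately
        !dir$ offload begin target(mic:0) in(p) out(acc) signal(sig)
        acc = sum(p)
        !dir$ end offload

        ! Host work that overlaps with the offload; it must not touch p or acc
        host_sum = sum(h)

        ! Block until the offload tagged sig has completed
        !dir$ offload_wait target(mic:0) wait(sig)

        target_sum = acc
        deallocate(p, h)
    end subroutine overlap
end module async_compute
```

The result in acc is read only after the offload_wait, because until then the offload and its output transfer may still be in flight.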

Testing Signals

Some scenarios require testing to determine whether the computation signaled with a given tag is finished. Use the Offload_signaled function (non-blocking mechanism) to check if an offload has completed.

! Example 7
program prog
use mic_lib
implicit none

integer, parameter :: mic_no = 0   ! target coprocessor number (value assumed for this example)
integer :: c                       ! signal tag
! Initiate asynchronous computation
!dir$ offload target(mic:mic_no) signal(c)
    ! offload computation statement
    ...
! Test if computation has been completed
if (Offload_signaled(mic_no, c) /= 0) then
    …
endif
end program prog

Double-buffering

Use the offload, offload_transfer and offload_wait directives to implement a double-buffering algorithm. The examples below assume that the buffers have already been allocated on the target device, and show asynchronous data transfers and the use of signal and wait clauses to control asynchronous offloads.

! Example 8 - Double-buffering Input
subroutine do_async_in()
    integer :: i
    !dir$ offload_transfer target(mic:0) in(in1: REUSE) signal(sig1)
    do i=1, iter
        if (MOD(i, 2) == 1) then
            ! Odd numbered iterations
            !dir$ offload_transfer target(mic:0) if(i /= iter) in(in2: REUSE) signal(sig2)
            !dir$ offload target(mic:0) nocopy(in1) wait(sig1) out(out1: REUSE)
            call compute(in1, out1)
        else
            ! Even numbered iterations
            !dir$ offload_transfer target(mic:0) if(i /= iter) in(in1: REUSE) signal(sig1)
            !dir$ offload target(mic:0) nocopy(in2) wait(sig2) out(out2: REUSE)
            call compute(in2, out2)
        endif
    enddo
end subroutine

! Example 8 - Double-buffering Output
subroutine do_async_out()
    integer :: i
    do i=1, iter
        if (MOD(i, 2) == 1) then ! odd numbered iterations
            if (i < iter) then ! all iterations except the last
                !dir$ offload target(mic:0) in(in1: REUSE) nocopy(out1)
                call compute(in1, out1)
                !dir$ offload_transfer target(mic:0) out(out1: REUSE) signal(sig1)
            endif
            if (i > 1) then ! all iterations except the first
                !dir$ offload_wait target(mic:0) wait(sig2)
                call use_result(out2)
            endif
        else ! even numbered iterations
            if (i < iter) then ! all iterations except the last
                !dir$ offload target(mic:0) in(in2: REUSE) nocopy(out2)
                call compute(in2, out2)
                !dir$ offload_transfer target(mic:0) out(out2: REUSE) signal(sig2)
            endif
            if (i > 1) then ! all iterations except the first
                !dir$ offload_wait target(mic:0) wait(sig1)
                call use_result(out1)
            endif
        endif
    enddo
end subroutine
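
The double-buffering subroutines above leave out the declarations and the allocation of the buffers on the coprocessor. The following sketch of the input case fills in those parts under stated assumptions: the module name, the routine process_blocks, the block contents, and the doubling computation are hypothetical, the modifiers are written out instead of using the REUSE macro, and the block form offload begin / end offload replaces the call to compute. The tag variables get distinct values only as a precaution. If the directives are not recognized, they are comments and the blocks are processed sequentially on the host.

```fortran
module double_buffer_input
    implicit none
contains
    ! Processes nblocks blocks of n elements, overlapping the transfer of
    ! the next block with the computation on the current one.
    ! All names and the block contents are hypothetical.
    subroutine process_blocks(n, nblocks, total)
        integer, intent(in) :: n, nblocks
        real, intent(out) :: total
        real, allocatable :: in1(:), in2(:), out1(:), out2(:)
        integer :: sig1, sig2, i

        allocate(in1(n), in2(n), out1(n), out2(n))
        sig1 = 1   ! tag variables; distinct values set only as a precaution
        sig2 = 2
        total = 0.0

        ! Allocate all buffers on the coprocessor once; no data is transferred
        !dir$ offload_transfer target(mic:0) nocopy(in1, in2, out1, out2: alloc_if(.true.) free_if(.false.))

        ! Fill the first block and start sending it
        in1 = 1.0
        !dir$ offload_transfer target(mic:0) in(in1: alloc_if(.false.) free_if(.false.)) signal(sig1)

        do i = 1, nblocks
            if (mod(i, 2) == 1) then
                ! Odd block: start sending the next block into in2
                if (i /= nblocks) in2 = real(i + 1)
                !dir$ offload_transfer target(mic:0) if(i /= nblocks) in(in2: alloc_if(.false.) free_if(.false.)) signal(sig2)
                ! Compute on in1 once its transfer has completed
                !dir$ offload begin target(mic:0) nocopy(in1) wait(sig1) out(out1: alloc_if(.false.) free_if(.false.))
                out1 = 2.0 * in1
                !dir$ end offload
                total = total + sum(out1)
            else
                ! Even block: same steps with the buffer roles swapped
                if (i /= nblocks) in1 = real(i + 1)
                !dir$ offload_transfer target(mic:0) if(i /= nblocks) in(in1: alloc_if(.false.) free_if(.false.)) signal(sig1)
                !dir$ offload begin target(mic:0) nocopy(in2) wait(sig2) out(out2: alloc_if(.false.) free_if(.false.))
                out2 = 2.0 * in2
                !dir$ end offload
                total = total + sum(out2)
            end if
        end do

        ! Free all buffers on the coprocessor; no data is transferred
        !dir$ offload_transfer target(mic:0) nocopy(in1, in2, out1, out2: alloc_if(.false.) free_if(.true.))

        deallocate(in1, in2, out1, out2)
    end subroutine process_blocks
end module double_buffer_input
```

Each host buffer is refilled only after the offload that waited on its tag has run, so a transfer in flight is never overwritten.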

Summary

Asynchronous offload allows data transfer and computation to overlap. This method does not require the use of additional threads on the host and is useful for pipelined operations. Refer to the following sample code installed with the Intel® Fortran Compiler for more details (default installation directory):

  • Linux*:  /opt/intel/composer_xe_2015/Samples/en_US/Fortran/mic_samples/LEO_Fortran_intro
  • Windows*:  C:\Program Files (x86)\Intel\Composer XE 2015\Samples\en_US\Fortran
