Channel: Intel Developer Zone Articles

Analyze and Optimize Windows* Game Applications Using Intel® INDE Graphics Performance Analyzers (GPA)

Download  Intel INDE Graphics Performance Analyzer.pdf

Intel® INDE Graphics Performance Analyzers (GPA) are tools that help game developers use the full performance potential of their gaming platform. GPA visualizes performance data from your application so that you can understand system-level and individual frame performance issues, and it lets you run “what-if” experiments to estimate the performance gains from potential optimizations. The GPA tools are available as part of the Intel® INDE tool suite or as a standalone download.

This article describes the GPA tools and walks through a sample game application for Windows*, showing individual frame performance issues and optimizing with the Graphics Frame Analyzer for DirectX*.

Graphics Monitor

Graphics Monitor is used to view, graph, and configure metrics in-game. You can also take trace and frame captures, and enable graphics pipeline overrides and experiments in real time.

The sample game application used throughout this article (CitiRacer.exe) is included with the GPA installation.

Once you download and install GPA (see link above), click Analyze Application as shown below, and the Analyze Application window opens.


1. Graphics monitor


2. Analyze Application window to launch the game

Click the Run button, and you can start analyzing the application. The application automatically loads and displays the FPS (frames per second) as shown below. Press CTRL + F1 three times to see the screenshot shown below with different settings and metrics displayed.


3. Game running with all the metrics shown

Next, we will capture one particular frame and analyze it with the Graphics Frame Analyzer for DirectX, which is installed with Intel GPA. You can capture the frame by pressing CTRL + SHIFT + C or by using the System Analyzer tool described below.

System Analyzer

System Analyzer provides access to system-wide metrics for your game, including CPU, GPU, API, and the graphics driver. The metrics available vary depending on your platform, but you will find a large collection of useful metrics to help quantify key aspects of your application's use of system resources. In the System Analyzer you can also perform various "what-if" experiments to diagnose at a high level where your game's performance bottlenecks are concentrated.

If the System Analyzer finds that your game is CPU-bound, perform additional fine-tuning of your application using Platform Analyzer.

If the System Analyzer finds that your game is GPU-bound, use the Graphics Frame Analyzer for DirectX*/OpenGL* to drill down within a single graphics frame and pinpoint specific rendering problems, such as texture bandwidth, pixel shader performance, level-of-detail issues, or other bottlenecks within the rendering pipeline.

Open System Analyzer, installed as part of Intel INDE.


4. Connecting using the System Analyzer

If the application you are analyzing is running on the same machine where Intel INDE is installed, click Connect. If the application is running on a remote machine, enter the IP address of that machine and then click Connect.

You will see the application in the System Analyzer as shown below.


5. Click the application to open the System Analyzer

Once the next screen opens, you can drag and drop the metrics that you are interested in. In this example, we are monitoring the Aggregated CPU Load, GPU Duration, GPU Busy, and GPU Frequency metrics. Hold the CTRL key to select and drag multiple metrics at once. Click the Camera button to capture a frame where GPU usage is high and FPS is low. We will analyze this captured frame using the Graphics Frame Analyzer for DirectX.


6. Capturing a frame using the System Analyzer

Analyzing a frame using the Intel® INDE Graphics Frame Analyzer for DirectX*

Once you open the Frame Analyzer, the captured frames will be automatically loaded. Select the latest frame that you captured and want to analyze and click Open.


7. Opening the captured frame with the Graphics Frame Analyzer for DirectX*

Now let’s start analyzing this particular frame that we captured.


8. Captured frame when opened with the Graphics Frame Analyzer for DirectX*

RT0, RT1, RT2, and RT3 on the left-hand side are the render targets generated during this frame. Different games can use a different number of render targets to build a frame; this frame uses four.

The graph below shows the draw calls made during that frame. GPA calls them “ergs,” a name borrowed from the erg, the scientific unit of work and energy.


9. Graphical view of the ergs with GPU duration on X and Y Axes

You can filter the metrics that are shown. By default, both the X and Y axes show GPU duration; you can change the metric on either axis using the dropdown. This view shows at a glance how long each erg takes on the GPU and quickly points to the ergs that might need optimization.

Right-click on RT1 and choose “Select ergs in this render target,” which highlights all the ergs used to generate this render target. You can analyze metrics on how long it took to generate the render target. An example of a render target is shown below.


10. Selecting all the ergs in the render target

Let’s dive further into this render target. Click the erg with the longest GPU duration to see the details of just that erg. If you click the Geometry tab, you can see the geometry that is rendered, as shown below. If you click the Shaders tab, it shows the vertex and fragment shaders for this erg.


11. Geometry rendered for the selected erg
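The step of picking the erg with the longest GPU duration inside one render target can be sketched in a few lines of Python. This is only an illustration of the selection logic, not something GPA exposes: the `Erg` record, its field names, and the sample durations are all hypothetical stand-ins for values you would read off the Frame Analyzer graph.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Erg:
    """Hypothetical record for one erg; field names are assumptions."""
    index: int
    render_target: str
    gpu_duration_us: float


def slowest_erg(ergs: List[Erg], render_target: str) -> Erg:
    """Return the erg with the longest GPU duration in one render target."""
    candidates = [e for e in ergs if e.render_target == render_target]
    if not candidates:
        raise ValueError(f"no ergs found for {render_target}")
    return max(candidates, key=lambda e: e.gpu_duration_us)


# Made-up sample data, not taken from the CitiRacer capture.
sample_ergs = [
    Erg(0, "RT0", 120.0),
    Erg(1, "RT1", 950.0),
    Erg(2, "RT1", 3400.0),
    Erg(3, "RT2", 80.0),
]

worst = slowest_erg(sample_ergs, "RT1")
print(worst.index, worst.gpu_duration_us)
```

In the tool itself, the same result comes from selecting the ergs of RT1 and clicking the tallest bar.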

Let’s explore the tabs at the bottom of the screen. “Selected” means the erg you have selected. If we select “Highlighted,” it shows the highlighted erg that corresponds to the Geometry we see on the right-hand side.

“Other” refers to all the other ergs of the render target. Selecting “Hidden” hides them completely. “Draw only to last selected” draws only the ergs of this render target up to the erg you have selected. If you clear this option, all the ergs for this render target are shown.


12. How to highlight the selected erg


13. Highlighted erg shown in blue color

If you click the Texture tab, you can see which textures are bound to this erg. Not all of the bound textures are necessarily used by this erg; some may have been used by a previous erg. In general, though, the Texture tab shows which textures are used and how big they are. It is a good way to find uncompressed images that may increase GPU duration, so you can go back and compress those textures.


14. Textures bound with this erg
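To get a feel for why an uncompressed texture stands out in the Texture tab, the sketch below compares the memory footprint of the top mip level of an uncompressed 32-bit texture with two common DirectX block-compressed formats. The article does not name any formats; BC1 and BC3 are assumptions chosen for illustration (they store each 4x4 pixel block in 8 and 16 bytes respectively), and the function names are hypothetical.

```python
# Bytes used per 4x4 pixel block by two DirectX block-compression formats.
BYTES_PER_4X4_BLOCK = {"BC1": 8, "BC3": 16}


def uncompressed_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Size of one mip level of an uncompressed texture (default: 32-bit RGBA)."""
    return width * height * bytes_per_pixel


def block_compressed_bytes(width: int, height: int, fmt: str) -> int:
    """Size of one mip level in a 4x4 block-compressed format."""
    blocks_x = (width + 3) // 4  # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * BYTES_PER_4X4_BLOCK[fmt]


raw = uncompressed_bytes(1024, 1024)
bc1 = block_compressed_bytes(1024, 1024, "BC1")
bc3 = block_compressed_bytes(1024, 1024, "BC3")
print(raw, bc1, bc3)  # 4194304 524288 1048576
```

A smaller footprint means less texture bandwidth per sample, which is one reason compressing a large texture can lower the GPU duration of the ergs that use it. The actual gain depends on the scene and should be checked in the Frame Analyzer.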

Experiments

Now let’s talk about the Experiments tab. It allows you to override what the GPU does and look at your net results. In this example, the entire frame runs at 27 ms or 37 FPS as shown in the top right corner box (indicated by the arrow). You can toggle between FPS and GPU duration by clicking that box.


15. Click the top-right toggle button to switch between GPU duration and FPS readings
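The two readings in that toggle are the same measurement expressed in different units. A minimal sketch of the conversion, using hypothetical helper names, shows how a frame time of 27 ms corresponds to roughly 37 FPS:

```python
def fps_from_frame_ms(frame_ms: float) -> float:
    """Convert a per-frame duration in milliseconds to frames per second."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    return 1000.0 / frame_ms


def frame_ms_from_fps(fps: float) -> float:
    """Convert frames per second to a per-frame duration in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


print(round(fps_from_frame_ms(27.0)))  # 37
```

Because the relationship is reciprocal, the same saving in milliseconds is worth more FPS at high frame rates than at low ones, which is why GPU duration is usually the more convenient unit when comparing experiments.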

Now, if you click the Frame Overview tab, you’ll see stats for the entire frame. The Details tab provides the stats for only the erg you selected. In this example, the Frame Overview tab lists the different metrics you can experiment with, as shown below.


16. Frame Overview tab that gives the stats for the entire frame

Let’s click the Experiments tab and try completely disabling this erg so that this erg does not even render.


17. Experiment tab: Before disabling the erg


18. Experiment tab: After disabling the erg

If you go to the Frame Overview, you can see the difference in the GPU duration, execution units, etc. We can look at the general overall performance to see how much difference there is between the old and new values. In the example shown below, the delta value for GPU duration is -8 ms, the new value of the GPU duration is around 18 ms, and the percentage decrease in the GPU duration is around 35%.


19. Frame Overview and difference in GPU duration after disabling the selected erg
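The delta and percentage columns in the Frame Overview follow from the old and new values of a metric. The sketch below shows that arithmetic under the assumption that the delta is computed as new minus old and the percentage is relative to the old value; the function name and the sample numbers are illustrative only and are not the readings from the screenshot.

```python
from typing import Tuple


def compare_metric(old: float, new: float) -> Tuple[float, float]:
    """Return (delta, percent_change) for a metric before and after an experiment.

    Assumes delta = new - old and percent change is relative to the old value.
    """
    if old == 0:
        raise ValueError("old value must be non-zero")
    delta = new - old
    return delta, 100.0 * delta / old


# Illustrative values: a frame that drops from 20 ms to 15 ms of GPU time.
delta_ms, percent = compare_metric(20.0, 15.0)
print(delta_ms, percent)  # -5.0 -25.0
```

A negative delta means the experiment reduced the metric, which for GPU duration is an improvement.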

Values that indicate a significant problem are marked in red. Most of these ergs are draw calls. If nothing is highlighted when you select an erg, the erg may be a clear call. Clear calls are sometimes unnecessary: if everything in the render target is rendered correctly without the clear call, you can try disabling it and see whether the GPU duration improves.

The API Log tab shows the API calls behind the ergs you have selected, which also tells you whether an erg is a draw call or a clear call.

You can also filter by primitive count to see how many primitives, such as triangles, are being rendered. You can set the X-axis to GPU duration and the Y-axis to primitive count, as shown below. Then you can look at the Geometry tab to inspect the ergs with the most primitives.


20. Selecting primitive count on Y-Axis


21. Primitive count for the selected render target
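Reading GPU duration against primitive count amounts to asking which ergs submit the most geometry and how much GPU time each primitive costs. A rough sketch of that comparison follows; the row layout, function names, and numbers are all hypothetical and only stand in for values read from the graph.

```python
from typing import List, Tuple

# (erg index, primitive count, GPU duration in microseconds); made-up values.
ErgRow = Tuple[int, int, float]


def heaviest_by_primitives(rows: List[ErgRow], top_n: int = 2) -> List[ErgRow]:
    """Return the top_n ergs with the highest primitive count."""
    return sorted(rows, key=lambda row: row[1], reverse=True)[:top_n]


def us_per_primitive(row: ErgRow) -> float:
    """GPU microseconds spent per primitive for one erg."""
    _, primitives, duration_us = row
    if primitives == 0:
        raise ValueError("erg has no primitives (possibly a clear call)")
    return duration_us / primitives


rows = [(0, 1200, 300.0), (1, 48000, 2400.0), (2, 2, 900.0), (3, 15000, 600.0)]

print(heaviest_by_primitives(rows))
print(us_per_primitive(rows[2]))  # 450.0
```

In this made-up data, erg 2 draws only two primitives yet takes 900 µs, which would suggest that its cost comes from somewhere other than geometry and that reducing its primitive count would not help.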

You can also sort by render target to see how long each render target takes. It is worth experimenting to see what the hardware is doing: disable or change things, then check the deltas in the Frame Overview to see whether performance improves or degrades.
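Sorting by render target is equivalent to summing the GPU duration of the ergs in each render target and ranking the totals. A minimal sketch, assuming hypothetical (render target, duration) pairs rather than real capture data:

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def duration_by_render_target(
    ergs: List[Tuple[str, float]]
) -> List[Tuple[str, float]]:
    """Sum GPU duration per render target, most expensive first."""
    totals: Dict[str, float] = defaultdict(float)
    for render_target, duration_us in ergs:
        totals[render_target] += duration_us
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up (render target, GPU duration in microseconds) pairs.
erg_durations = [("RT0", 120.0), ("RT1", 950.0), ("RT1", 3400.0), ("RT2", 80.0)]

print(duration_by_render_target(erg_durations))
# [('RT1', 4350.0), ('RT0', 120.0), ('RT2', 80.0)]
```

The render target at the top of such a ranking is usually the most productive place to start running experiments.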

About Author

Praveen Kundurthy works in the Software & Services Group at Intel Corporation. He has a master's degree in Computer Engineering. His main interests are mobile technologies, Windows, and game development.

Notices

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request.

Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm.

Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

© 2015 Intel Corporation.

