Sparse Instructions Analysis Tool
Intel® SDE provides a special analysis tool for sparse instructions. Currently it applies only to Intel® AVX and Intel® AVX-512 gather instructions.
This tool was written to help analyze the Intel® compiler support for gather instructions. For each gather instruction, it calculates the vector of distances between each pair of consecutive elements (those with set mask bits). Based on this vector, it supports three modes of analysis.
The first analysis type is a dump of each gather instruction with its current distances vector.
The second analysis type is a dump of every distances vector per instruction, together with how many times that distances vector was executed.
The third analysis type calculates a histogram of how many executions happened with a constant distance between all elements (and what that distance is), and how many times the distances vector had random distances.
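The distances vector and the constant/random distinction can be illustrated with a minimal Python sketch. This is not the tool's implementation: the function names are hypothetical, the distance is assumed to be the signed byte difference between consecutive active element addresses, and the handling of fewer than two active elements is a guess, since the documentation does not cover it.

```python
def distances_vector(addresses, mask):
    """Distances between consecutive elements whose mask bit is set.

    Assumption: a distance is the signed byte difference between the
    addresses of two consecutive active elements.
    """
    active = [addr for addr, bit in zip(addresses, mask) if bit]
    return [nxt - cur for cur, nxt in zip(active, active[1:])]


def classify(distances):
    """Return the common distance if all distances are equal, else "random".

    Assumption: an empty vector (fewer than two active elements) is
    reported as "random"; the real tool may treat it differently.
    """
    if distances and all(d == distances[0] for d in distances):
        return distances[0]
    return "random"


# Five dword elements at indices 0, 2, 4, 6, 8 with scale 4.
regular = distances_vector([0, 8, 16, 24, 32], [1, 1, 1, 1, 1])
# Four elements at irregular addresses.
irregular = distances_vector([0, 20, 32, 60], [1, 1, 1, 1])

print(regular, classify(regular))
print(irregular, classify(irregular))
```

Masked-off elements are skipped before the differences are taken, so the vector has one entry fewer than the number of active elements.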
The first analysis type, the sparse instructions trace, is activated by specifying -sparse_dump. Example of the output file:
TID 0 0x40121e vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]
TID 0 0x40121e vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]
TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28]
TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28]
TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28]
TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28]
TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28]
TID 0 0x401332 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]
TID 0 0x401332 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]
TID 0 0x401332 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]
TID 0 0x401332 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]
TID 0 0x4013bc vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [4, 4, 4, 4]
TID 0 0x4013bc vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [4, 4, 4, 4]
TID 0 0x4013bc vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [4, 4, 4, 4]
TID 0 0x401446 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 4]
TID 0 0x401446 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 4]
TID 0 0x401446 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 4]
The second analysis type, the sparse vector trace, is activated by specifying -sparse_vector. Example of the output file:
TID 0 0x40121e vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8] 5
TID 0 0x401332 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8] 4
TID 0 0x401446 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 4] 7
TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28] 5
TID 0 0x4013bc vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [4, 4, 4, 4] 3
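The relationship between the dump format and the vector format can be sketched by tallying identical dump lines. The Python below is only an illustration: the line layout is inferred from the sample output above, and `parse_dump_line` and `tally` are hypothetical helper names, not part of Intel® SDE.

```python
import re
from collections import Counter

# Assumed layout, inferred from the sample output:
#   TID <tid> <address> <disassembly> [<d0>, <d1>, ...]
DUMP_LINE = re.compile(r"^TID (\d+) (0x[0-9a-fA-F]+) (.+) \[([-\d, ]*)\]$")


def parse_dump_line(line):
    """Split one dump line into (tid, address, disassembly, distances)."""
    match = DUMP_LINE.match(line.strip())
    if match is None:
        return None  # not a dump record
    tid, address, disassembly, raw = match.groups()
    distances = tuple(int(item) for item in raw.split(",") if item.strip())
    return int(tid), address, disassembly, distances


def tally(lines):
    """Count how often each (instruction, distances vector) pair occurs."""
    counts = Counter()
    for line in lines:
        record = parse_dump_line(line)
        if record is not None:
            counts[record] += 1
    return counts


sample = [
    "TID 0 0x40121e vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]",
    "TID 0 0x40121e vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [8, 8, 8, 8]",
    "TID 0 0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] [20, 12, 28]",
]

for (tid, address, disassembly, distances), count in tally(sample).items():
    print(f"TID {tid} {address} {disassembly} {list(distances)} {count}")
```

Each printed line mirrors the vector format: the dump record followed by the number of times it was seen.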
The third analysis type, the sparse analysis, is activated by specifying -sparse_analysis. Example of the output file:
====================
TID 0
====================
0x40121e vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] {8: 5, random: 0} num_elem: {5: 5, } cache lines: {1: 5, }
0x4012a8 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] {random: 5} num_elem: {4: 5, } cache lines: {1: 5, }
0x401332 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] {8: 4, random: 0} num_elem: {5: 4, } cache lines: {1: 4, }
0x4013bc vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] {4: 3, random: 0} num_elem: {5: 3, } cache lines: {1: 3, }
0x401446 vgatherdps zmm1, k1, dword ptr [rbp+zmm0*4-0x10f0] {random: 7} num_elem: {4: 7, } cache lines: {1: 7, }
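To show how such a histogram relates to the distances vectors, the following Python sketch reduces (address, distances, count) records to per-instruction summaries. It rests on assumptions: `summarize` is a hypothetical name, and the rule that num_elem equals the number of distances plus one is inferred from the sample output (four distances alongside num_elem 5). The cache lines column is omitted because it would need the actual element addresses, which the distances alone do not provide.

```python
def summarize(records):
    """Build a per-address histogram of constant strides and random vectors.

    records: iterable of (address, distances, count) tuples, for example
    taken from the -sparse_vector output.
    """
    summary = {}
    for address, distances, count in records:
        entry = summary.setdefault(
            address, {"strides": {}, "random": 0, "num_elem": {}}
        )
        if distances and len(set(distances)) == 1:
            # Every distance is the same: a constant stride.
            stride = distances[0]
            entry["strides"][stride] = entry["strides"].get(stride, 0) + count
        else:
            entry["random"] += count
        # Assumption: n distances correspond to n + 1 active elements.
        num_elem = len(distances) + 1
        entry["num_elem"][num_elem] = entry["num_elem"].get(num_elem, 0) + count
    return summary


# Records taken from the sparse vector sample output.
records = [
    ("0x40121e", (8, 8, 8, 8), 5),
    ("0x401332", (8, 8, 8, 8), 4),
    ("0x401446", (20, 12, 4), 7),
    ("0x4012a8", (20, 12, 28), 5),
    ("0x4013bc", (4, 4, 4, 4), 3),
]

for address, entry in sorted(summarize(records).items()):
    print(address, entry)
```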
The sparse analysis tool knobs are:
- -sparse_analysis
Activate analysis of memory access distances in sparse instructions [default 0]
- -sparse_analysis_out
Output file name for the sparse analysis tool [default sde-sparse-analysis-out.txt]
- -sparse_dump
Activate full dump of memory access distances in sparse instructions [default 0]
- -sparse_dump_out
Output file name for the sparse dump tool [default sde-sparse-dump-out.txt]
- -sparse_vector
Activate statistics on distances vectors in sparse instructions [default 0]
- -sparse_vector_constant
Display only constant distances vectors (ignore random vectors) [default 0]
- -sparse_vector_debug_hash
Testing the efficiency of the hash function [default 0]
- -sparse_vector_full
Display full details on each distances vector [default 1]
- -sparse_vector_out
Output file name for the sparse vector tool [default sde-sparse-vector-out.txt]
- -sparse_vector_random
Display only random distances vectors (ignore constant vectors) [default 0]
- -sparse_vector_top
Number of most used vectors to display [default 0]
Note
You can activate more than one analysis type at the same time. In this case, use a different output file name for each analysis.
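As a rough illustration of combining analyses, the Python sketch below assembles a command line that enables several of the knobs listed above, each with its own output file. The helper `build_sde_command` is hypothetical, as are the output file names and `./app`. The `sde` binary name and the `--` separator before the application are assumptions about a typical invocation and may differ in your installation.

```python
# Knob pairs taken from the knob list: (activation knob, output file knob).
KNOBS = {
    "analysis": ("-sparse_analysis", "-sparse_analysis_out"),
    "dump": ("-sparse_dump", "-sparse_dump_out"),
    "vector": ("-sparse_vector", "-sparse_vector_out"),
}


def build_sde_command(app_argv, analyses, sde_binary="sde"):
    """Return an argv list enabling each requested analysis.

    analyses maps an analysis name ("analysis", "dump", "vector") to its
    output file name. Reusing a file name is rejected, following the note
    that each analysis should write to a different file.
    """
    command = [sde_binary]  # assumption: the launcher is named "sde"
    used_files = set()
    for name, out_file in analyses.items():
        enable_knob, out_knob = KNOBS[name]
        if out_file in used_files:
            raise ValueError(f"output file reused: {out_file}")
        used_files.add(out_file)
        command += [enable_knob, out_knob, out_file]
    # Assumption: "--" separates the tool knobs from the application.
    return command + ["--"] + list(app_argv)


# Hypothetical application and output file names.
print(" ".join(build_sde_command(
    ["./app"], {"dump": "dump.txt", "vector": "vector.txt"}
)))
```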