alias: spi, block id: 6
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SPI/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: ['spi']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/3][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/ipblocks_SPI/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:07:18.697158 127974632742720 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.186734 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:07:18.697761 127974632742720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:07:18.891378 127974632742720 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:07:18.980145 127974632742720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282384 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.002509 127974632742720 generateRocpd.cpp:583] writing SQL database for process 2525887 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:07:19.003294 127974632742720 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SPI/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525887_results.db (UUID=0001fa7b-c44b-744b-878e-2ac8bd3cdcda)
   |-> [rocprofiler-sdk] W20260526 19:07:19.085118 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007690 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.086217 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001081 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.087796 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001563 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.098062 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008282 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.175349 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.077271 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.177778 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002415 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.177796 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.186766 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008963 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.186779 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.186786 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.186792 127974632742720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.186892 127974632742720 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000091 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.187117 127974632742720 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.184609 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.190084 127974632742720 simple_timer.cpp:55] [rocprofv3] output generation ::     0.208172 sec
   |-> [rocprofiler-sdk] W20260526 19:07:19.190143 127974632742720 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.209959 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SPI/MI200/out/pmc_1/2525887_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/3][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SPI/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:07:20.691958 124238095073088 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.184554 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:07:20.692550 124238095073088 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:20.888279 124238095073088 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:07:20.984910 124238095073088 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.292360 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.006811 124238095073088 generateRocpd.cpp:583] writing SQL database for process 2525903 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:07:21.007601 124238095073088 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SPI/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525903_results.db (UUID=0001fa7b-cc18-7c18-a91b-31139e6768c5)
   |-> [rocprofiler-sdk] W20260526 19:07:21.087088 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007714 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.088181 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001076 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.089755 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001559 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.099844 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008119 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.149394 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.049536 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.151446 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002038 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.151463 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.159979 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008509 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.159993 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.159999 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.160006 124238095073088 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.160136 124238095073088 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.160359 124238095073088 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.153548 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.163227 124238095073088 simple_timer.cpp:55] [rocprofv3] output generation ::     0.176982 sec
   |-> [rocprofiler-sdk] W20260526 19:07:21.163278 124238095073088 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.178329 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SPI/MI200/out/pmc_1/2525903_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/3][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SPI/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:07:22.652805 134692972478272 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.184798 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:07:22.653421 134692972478272 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:22.847221 134692972478272 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:07:22.931178 134692972478272 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277758 sec
   |-> [rocprofiler-sdk] W20260526 19:07:22.953821 134692972478272 generateRocpd.cpp:583] writing SQL database for process 2525913 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:07:22.954601 134692972478272 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SPI/MI200/out/pmc_1/smc4124-25-mi210-3c48/2525913_results.db (UUID=0001fa7b-d3c0-73c0-b24e-cc31c4807756)
   |-> [rocprofiler-sdk] W20260526 19:07:23.034685 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007722 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.035798 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001097 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.037374 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001561 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.047460 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008148 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.076364 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.028889 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.078364 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.001985 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.078380 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.086583 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008195 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.086598 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.086605 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.086611 134692972478272 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.086715 134692972478272 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.086897 134692972478272 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.133077 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.089805 134692972478272 simple_timer.cpp:55] [rocprofv3] output generation ::     0.156896 sec
   |-> [rocprofiler-sdk] W20260526 19:07:23.089855 134692972478272 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.158637 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SPI/MI200/out/pmc_1/2525913_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
