alias: cu_ins, block id: 10
alias: cu_pipe, block id: 11
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_SQ/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: ['cu_ins', 'cu_pipe']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/7][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:18.549248 139298121277248 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.187040 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:18.549868 139298121277248 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:18.743465 139298121277248 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:18.832094 139298121277248 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282226 sec
   |-> [rocprofiler-sdk] W20260526 19:03:18.854376 139298121277248 generateRocpd.cpp:583] writing SQL database for process 2524890 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:18.855180 139298121277248 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524890_results.db (UUID=0001fa78-1a37-7a37-a2e1-188d7406f77a)
   |-> [rocprofiler-sdk] W20260526 19:03:18.937285 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008272 sec
   |-> [rocprofiler-sdk] W20260526 19:03:18.938478 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001176 sec
   |-> [rocprofiler-sdk] W20260526 19:03:18.940058 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001566 sec
   |-> [rocprofiler-sdk] W20260526 19:03:18.950450 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008434 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.020432 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.069967 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.022550 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002103 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.022568 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.031234 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008659 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.031248 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.031254 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.031260 139298121277248 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.031362 139298121277248 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000093 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.031566 139298121277248 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.177190 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.034573 139298121277248 simple_timer.cpp:55] [rocprofv3] output generation ::     0.200898 sec
   |-> [rocprofiler-sdk] W20260526 19:03:19.034635 139298121277248 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.202500 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524890_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/7][Approximate profiling time left: 10 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:20.532986 131846554853184 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.183734 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:20.533583 131846554853184 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:20.728935 131846554853184 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:20.811830 131846554853184 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.278247 sec
   |-> [rocprofiler-sdk] W20260526 19:03:20.834497 131846554853184 generateRocpd.cpp:583] writing SQL database for process 2524898 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:20.835298 131846554853184 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524898_results.db (UUID=0001fa78-21fa-71fa-8aae-7a4ec90d3693)
   |-> [rocprofiler-sdk] W20260526 19:03:20.917448 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007751 sec
   |-> [rocprofiler-sdk] W20260526 19:03:20.918668 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001202 sec
   |-> [rocprofiler-sdk] W20260526 19:03:20.920358 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001675 sec
   |-> [rocprofiler-sdk] W20260526 19:03:20.930792 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008370 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.001326 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.070519 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.003600 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002258 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.003617 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.012408 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008784 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.012422 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.012429 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.012435 131846554853184 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.012538 131846554853184 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.012761 131846554853184 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.178264 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.015852 131846554853184 simple_timer.cpp:55] [rocprofv3] output generation ::     0.202306 sec
   |-> [rocprofiler-sdk] W20260526 19:03:21.015914 131846554853184 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.204036 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524898_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/7][Approximate profiling time left: 7 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:22.503646 129711382994752 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.185966 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:22.504288 129711382994752 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.697456 129711382994752 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:22.782587 129711382994752 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.278300 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.805214 129711382994752 generateRocpd.cpp:583] writing SQL database for process 2524919 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:22.806008 129711382994752 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524919_results.db (UUID=0001fa78-29aa-79aa-b58a-4aedfae0553e)
   |-> [rocprofiler-sdk] W20260526 19:03:22.888566 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007829 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.889773 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001190 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.891369 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001581 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.901772 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008430 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.960397 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.058610 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.962728 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002314 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.962745 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.971248 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008495 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.971267 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.971273 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.971280 129711382994752 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.971412 129711382994752 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.971644 129711382994752 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.166431 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.974478 129711382994752 simple_timer.cpp:55] [rocprofv3] output generation ::     0.190019 sec
   |-> [rocprofiler-sdk] W20260526 19:03:22.974527 129711382994752 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.191891 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524919_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/7][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:24.486185 125391296487232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.187315 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:24.486799 125391296487232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.679875 125391296487232 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:24.775443 125391296487232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.288644 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.797536 125391296487232 generateRocpd.cpp:583] writing SQL database for process 2524928 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:24.798345 125391296487232 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524928_results.db (UUID=0001fa78-3167-7167-9c2b-7ad9cdc2db64)
   |-> [rocprofiler-sdk] W20260526 19:03:24.881735 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007930 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.882978 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001227 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.884678 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001685 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.895573 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008661 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.952780 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.057192 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.955178 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002383 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.955196 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.964342 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009139 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.964356 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.964362 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.964369 125391296487232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.964470 125391296487232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000093 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.964668 125391296487232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.167132 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.967656 125391296487232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.190807 sec
   |-> [rocprofiler-sdk] W20260526 19:03:24.967723 125391296487232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.192232 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524928_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/7][Approximate profiling time left: 3 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:26.479815 134463798599488 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.182649 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:26.480404 134463798599488 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.675946 134463798599488 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:26.755591 134463798599488 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275187 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.777724 134463798599488 generateRocpd.cpp:583] writing SQL database for process 2524937 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:26.778530 134463798599488 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524937_results.db (UUID=0001fa78-3936-7936-aa75-0d8e9fb3deee)
   |-> [rocprofiler-sdk] W20260526 19:03:26.858791 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007815 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.859981 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001173 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.861580 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001582 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.872010 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008459 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.887761 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.015735 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.889984 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002207 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.890004 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.899000 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008981 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.899016 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.899044 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.899055 134463798599488 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.899154 134463798599488 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000092 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.899328 134463798599488 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.121604 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.902143 134463798599488 simple_timer.cpp:55] [rocprofv3] output generation ::     0.145142 sec
   |-> [rocprofiler-sdk] W20260526 19:03:26.902193 134463798599488 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.146544 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524937_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/7][Approximate profiling time left: 1 second]...
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:28.410286 127264151662400 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.182621 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:28.410889 127264151662400 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.604006 127264151662400 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:28.685759 127264151662400 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.274871 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.708362 127264151662400 generateRocpd.cpp:583] writing SQL database for process 2524946 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:28.709176 127264151662400 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524946_results.db (UUID=0001fa78-40c0-70c0-a990-045eecc6f66e)
   |-> [rocprofiler-sdk] W20260526 19:03:28.789289 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007716 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.790483 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001179 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.792175 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001676 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.802890 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008582 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.875026 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.072121 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.877428 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002356 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.877445 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.885886 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008433 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.885900 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.885907 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.885913 127264151662400 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.886017 127264151662400 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.886228 127264151662400 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.177867 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.889186 127264151662400 simple_timer.cpp:55] [rocprofv3] output generation ::     0.201632 sec
   |-> [rocprofiler-sdk] W20260526 19:03:28.889250 127264151662400 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.203439 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524946_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/7][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/ipblocks_SQ/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:03:30.419749 139873146011456 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.181935 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:03:30.420362 139873146011456 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.612522 139873146011456 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:03:30.693505 139873146011456 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.273143 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.715769 139873146011456 generateRocpd.cpp:583] writing SQL database for process 2524954 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:03:30.716551 139873146011456 generateRocpd.cpp:606] Opened result file: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524954_results.db (UUID=0001fa78-489a-789a-bf46-c0687c9aaf48)
   |-> [rocprofiler-sdk] W20260526 19:03:30.797956 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007787 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.799202 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001230 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.800764 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001547 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.811058 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008325 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.874325 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063251 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.876565 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002225 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.876583 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.884952 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008362 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.884967 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.884973 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.884980 139873146011456 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.885088 139873146011456 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000100 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.885307 139873146011456 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.169538 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.888158 139873146011456 simple_timer.cpp:55] [rocprofv3] output generation ::     0.193224 sec
   |-> [rocprofiler-sdk] W20260526 19:03:30.888218 139873146011456 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.194670 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/ipblocks_SQ/MI200/out/pmc_1/2524954_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
