Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/kernel_inv_str/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: ['vecPaste']
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/13][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:26.010897 139246086741824 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.192206 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:26.011548 139246086741824 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.208055 139246086741824 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:26.297307 139246086741824 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.285759 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.319713 139246086741824 generateRocpd.cpp:583] writing SQL database for process 2524072 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:26.320526 139246086741824 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524072_results.db (UUID=0001fa76-6297-7297-aaaa-934506759089)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.404513 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008143 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.405712 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001183 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.407678 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001951 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.417974 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008339 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.743967 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.325979 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.746324 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002332 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.746342 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.755420 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009071 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.755436 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.755442 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.755449 139246086741824 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.755560 139246086741824 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.755767 139246086741824 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.436055 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.758762 139246086741824 simple_timer.cpp:55] [rocprofv3] output generation ::     0.459782 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:26.758864 139246086741824 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.461507 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524072_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/13][Approximate profiling time left: 25 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:28.307109 126272420798272 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189730 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:28.307697 126272420798272 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:28.503780 126272420798272 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:28.595019 126272420798272 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.287323 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:28.617841 126272420798272 generateRocpd.cpp:583] writing SQL database for process 2524080 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:28.618650 126272420798272 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524080_results.db (UUID=0001fa76-6b92-7b92-bbac-0dc64f238a9d)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:28.702238 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008074 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:28.703476 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001221 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:28.705615 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002123 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:28.716161 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008463 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.034395 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.318216 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.036800 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002373 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.036818 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.046547 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009722 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.046562 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.046568 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.046575 126272420798272 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.046682 126272420798272 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000097 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.046895 126272420798272 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.429055 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.049890 126272420798272 simple_timer.cpp:55] [rocprofv3] output generation ::     0.453088 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:29.049991 126272420798272 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.454897 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524080_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/13][Approximate profiling time left: 23 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:30.605904 126842030350144 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189875 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:30.606544 126842030350144 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:30.799908 126842030350144 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:30.890666 126842030350144 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.284122 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:30.912940 126842030350144 generateRocpd.cpp:583] writing SQL database for process 2524088 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:30.913746 126842030350144 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524088_results.db (UUID=0001fa76-748c-748c-a180-b123398ea51a)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:30.997295 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008080 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:30.998516 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001203 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.000113 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001582 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.010504 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008385 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.308721 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.298202 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.311049 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002312 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.311066 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.320242 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009169 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.320257 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.320264 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.320271 126842030350144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.320394 126842030350144 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000114 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.320639 126842030350144 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.407699 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.323655 126842030350144 simple_timer.cpp:55] [rocprofv3] output generation ::     0.431479 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:31.323747 126842030350144 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.433027 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524088_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/13][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:32.843338 134341825752896 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191876 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:32.843948 134341825752896 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.040363 134341825752896 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:33.126766 134341825752896 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282818 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.149077 134341825752896 generateRocpd.cpp:583] writing SQL database for process 2524097 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:33.149855 134341825752896 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524097_results.db (UUID=0001fa76-7d48-7d48-947f-23d8f8eca12d)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.232652 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008008 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.233851 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001183 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.236005 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002139 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.246631 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008490 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.533715 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.287070 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.536067 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002335 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.536085 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.546090 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009997 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.546104 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.546110 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.546117 134341825752896 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.546228 134341825752896 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000103 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.546449 134341825752896 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.397373 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.549509 134341825752896 simple_timer.cpp:55] [rocprofv3] output generation ::     0.421289 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:33.549603 134341825752896 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.422797 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524097_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/13][Approximate profiling time left: 18 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:35.090227 136247251451712 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190865 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:35.090853 136247251451712 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.286264 136247251451712 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:35.372159 136247251451712 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.281306 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.394549 136247251451712 generateRocpd.cpp:583] writing SQL database for process 2524106 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:35.395341 136247251451712 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524106_results.db (UUID=0001fa76-8610-7610-b3c6-bdd68e34117e)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.476566 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007787 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.477742 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001159 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.479348 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001591 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.489588 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008266 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.769383 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.279779 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.771775 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002354 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.771794 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.780207 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008406 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.780221 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.780228 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.780234 136247251451712 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.780342 136247251451712 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000100 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.780550 136247251451712 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.386002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.783452 136247251451712 simple_timer.cpp:55] [rocprofv3] output generation ::     0.409542 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:35.783539 136247251451712 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.411340 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524106_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/13][Approximate profiling time left: 15 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:37.278546 125599688384320 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.184305 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:37.279165 125599688384320 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.471771 125599688384320 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:37.554941 125599688384320 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275776 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.577487 125599688384320 generateRocpd.cpp:583] writing SQL database for process 2524114 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:37.578290 125599688384320 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524114_results.db (UUID=0001fa76-8ea3-7ea3-a686-92a2cf4575db)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.661256 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007793 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.662455 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001183 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.664163 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001693 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.674968 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008644 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.683600 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.008617 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.685741 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002127 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.685759 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.694199 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008433 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.694214 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.694220 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.694226 125599688384320 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.694327 125599688384320 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000092 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.694525 125599688384320 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.117039 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.697294 125599688384320 simple_timer.cpp:55] [rocprofv3] output generation ::     0.140584 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:37.697341 125599688384320 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.142348 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524114_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/13][Approximate profiling time left: 13 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:39.214569 126677890850624 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190314 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:39.215244 126677890850624 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:39.408433 126677890850624 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:39.492925 126677890850624 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277681 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:39.514949 126677890850624 generateRocpd.cpp:583] writing SQL database for process 2524122 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:39.515755 126677890850624 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524122_results.db (UUID=0001fa76-962d-762d-b247-340f321b11ee)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:39.597543 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008141 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:39.598731 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001171 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:39.600867 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002121 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:39.611722 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008452 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.019197 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.407457 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.021549 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002330 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.021567 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.029899 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008325 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.029913 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.029920 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.029927 126677890850624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.030063 126677890850624 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000104 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.030269 126677890850624 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.515320 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.033154 126677890850624 simple_timer.cpp:55] [rocprofv3] output generation ::     0.538897 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:40.033269 126677890850624 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.540296 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524122_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/13][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:41.583014 140681037750080 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189878 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:41.583624 140681037750080 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:41.778777 140681037750080 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:41.866402 140681037750080 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282779 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:41.889140 140681037750080 generateRocpd.cpp:583] writing SQL database for process 2524130 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:41.889910 140681037750080 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524130_results.db (UUID=0001fa76-9f6e-7f6e-90a3-eea0574a6360)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:41.973327 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008232 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:41.974527 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001184 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:41.976688 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002146 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:41.987256 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008382 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.388561 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.401289 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.390879 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002291 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.390896 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.400122 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009219 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.400137 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.400143 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.400150 140681037750080 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.400255 140681037750080 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000097 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.400468 140681037750080 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.511328 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.403358 140681037750080 simple_timer.cpp:55] [rocprofv3] output generation ::     0.535318 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:42.403467 140681037750080 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.537018 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524130_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/13][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:43.969757 129554613083968 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.196181 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:43.970357 129554613083968 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.163588 129554613083968 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:44.251275 129554613083968 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.280918 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.273562 129554613083968 generateRocpd.cpp:583] writing SQL database for process 2524138 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:44.274370 129554613083968 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524138_results.db (UUID=0001fa76-a8ba-78ba-9950-5f0cb711e721)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.357270 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008394 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.358476 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001189 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.360458 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001965 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.371026 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008612 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.958071 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.587022 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.960532 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002435 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.960550 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.969873 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009315 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.969888 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.969894 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.969901 129554613083968 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.970048 129554613083968 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000139 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.970330 129554613083968 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.696768 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.973349 129554613083968 simple_timer.cpp:55] [rocprofv3] output generation ::     0.720840 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:44.973495 129554613083968 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.722168 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524138_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/13][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:46.541431 133703731257152 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.188362 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:46.542014 133703731257152 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:46.734341 133703731257152 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:46.818865 133703731257152 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276851 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:46.841066 133703731257152 generateRocpd.cpp:583] writing SQL database for process 2524146 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:46.841869 133703731257152 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524146_results.db (UUID=0001fa76-b2ce-72ce-b489-c76f81de9269)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:46.925479 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008117 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:46.926714 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001219 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:46.928863 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002135 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:46.939347 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008380 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.284095 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.344734 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.286399 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002280 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.286416 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.295767 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009344 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.295782 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.295788 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.295795 133703731257152 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.295903 133703731257152 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.296110 133703731257152 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.455045 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.299082 133703731257152 simple_timer.cpp:55] [rocprofv3] output generation ::     0.478788 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:47.299176 133703731257152 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.480261 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524146_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/13][Approximate profiling time left: 4 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:48.847770 131472959299392 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191966 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:48.848386 131472959299392 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.042337 131472959299392 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:49.136125 131472959299392 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.287739 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.158994 131472959299392 generateRocpd.cpp:583] writing SQL database for process 2524155 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:49.159798 131472959299392 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524155_results.db (UUID=0001fa76-bbcc-7bcc-9ca1-8a347016af15)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.243513 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008114 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.244727 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001198 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.246681 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001939 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.257035 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008403 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.587451 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.330402 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.589898 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002423 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.589916 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.598377 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008455 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.598392 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.598398 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.598405 131472959299392 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.598507 131472959299392 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.598724 131472959299392 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.439730 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.601707 131472959299392 simple_timer.cpp:55] [rocprofv3] output generation ::     0.463783 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:49.601816 131472959299392 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.465641 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524155_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/13][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:51.171241 133239432167232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.196967 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:51.171882 133239432167232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:51.367054 133239432167232 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:51.470887 133239432167232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.299005 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:51.492903 133239432167232 generateRocpd.cpp:583] writing SQL database for process 2524164 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 19:01:51.493686 133239432167232 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524164_results.db (UUID=0001fa76-c4db-74db-878a-5012dc8f786a)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:51.576899 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008176 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:51.578118 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001203 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:51.580053 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001920 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:51.590507 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008430 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.117984 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.527463 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.120366 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002355 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.120384 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.129287 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008896 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.129302 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.129309 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.129316 133239432167232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.129475 133239432167232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000121 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.129744 133239432167232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.636840 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.132747 133239432167232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.660436 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:52.132872 133239432167232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.661945 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524164_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 13/13][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/kernel_inv_str/MI200/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:53.653890 128526494211904 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189877 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 19:01:53.654502 128526494211904 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 19:01:53.848882 128526494211904 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:01:53.934188 128526494211904 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279686 sec
   |-> [rocprofiler-sdk] W20260526 19:01:53.956871 128526494211904 generateRocpd.cpp:583] writing SQL database for process 2524173 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:01:53.957680 128526494211904 generateRocpd.cpp:606] Opened result file: tests/workloads/kernel_inv_str/MI200/out/pmc_1/smc4124-25-mi210-3c48/2524173_results.db (UUID=0001fa76-ce94-7e94-ba62-c3a0f91243a4)
   |-> [rocprofiler-sdk] W20260526 19:01:54.041376 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008044 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.042611 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001218 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.044328 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001701 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.055011 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008498 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.374652 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.319626 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.377011 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002331 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.377035 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000010 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.386574 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009529 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.386589 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.386596 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.386602 128526494211904 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.386705 128526494211904 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000096 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.386894 128526494211904 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.430023 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.389897 128526494211904 simple_timer.cpp:55] [rocprofv3] output generation ::     0.453915 sec
   |-> [rocprofiler-sdk] W20260526 19:01:54.389981 128526494211904 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.455752 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/kernel_inv_str/MI200/out/pmc_1/2524173_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Checking for roofline.csv in tests/workloads/kernel_inv_str/MI200
[roofline] Benchmark execution failed: 'L1'. Skipping roofline.
