Following counters might not be supported by rocprof: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MFMA_MOPS_BF16
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_0/MI100
Target: MI100
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: ['1']
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:17:58.559812 132742365671232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304727 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:17:58.569732 132742365671232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:17:58.781400 132742365671232 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:17:58.915227 132742365671232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345495 sec
   |-> [rocprofiler-sdk] W20260526 17:17:58.954308 132742365671232 generateRocpd.cpp:582] writing SQL database for process 2390543 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:17:58.955573 132742365671232 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390543_results.db (UUID=0000432e-f317-7317-9dec-b41b548de381)
   |-> [rocprofiler-sdk] W20260526 17:17:59.046826 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014503 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.048038 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001178 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.050557 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002491 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.055705 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003182 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.125251 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.069516 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.128170 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002888 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.128199 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.144338 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016124 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.144370 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.144382 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.144394 132742365671232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.144618 132742365671232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000205 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.145190 132742365671232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.190882 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.151136 132742365671232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.233424 sec
   |-> [rocprofiler-sdk] W20260526 17:17:59.151231 132742365671232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.235953 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390543_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:01.430925 132784366845760 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306099 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:01.439695 132784366845760 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.650820 132784366845760 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:01.781345 132784366845760 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341650 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.820570 132784366845760 generateRocpd.cpp:582] writing SQL database for process 2390555 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:01.821798 132784366845760 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390555_results.db (UUID=0000432e-fe4d-7e4d-a0b5-52a8ed5c3250)
   |-> [rocprofiler-sdk] W20260526 17:18:01.912884 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014541 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.914025 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001109 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.916220 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002166 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.921342 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003198 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.984602 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063230 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.987550 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002916 sec
   |-> [rocprofiler-sdk] W20260526 17:18:01.987579 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.003151 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015557 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.003180 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.003192 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.003204 132784366845760 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.003404 132784366845760 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000183 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.003818 132784366845760 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183248 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.009643 132784366845760 simple_timer.cpp:55] [rocprofv3] output generation ::     0.225867 sec
   |-> [rocprofiler-sdk] W20260526 17:18:02.009723 132784366845760 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.228328 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390555_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 27 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:04.266952 130801155186496 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303379 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:04.276752 130801155186496 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.488509 130801155186496 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:04.621559 130801155186496 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344807 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.660803 130801155186496 generateRocpd.cpp:582] writing SQL database for process 2390567 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:04.662068 130801155186496 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390567_results.db (UUID=0000432f-0964-7964-b541-5876c5df58bc)
   |-> [rocprofiler-sdk] W20260526 17:18:04.753807 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014386 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.754946 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001107 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.757132 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002143 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.762216 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003166 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.830532 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068287 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.833445 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002883 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.833474 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.848921 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015432 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.848949 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.848961 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.848987 130801155186496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.849182 130801155186496 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000180 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.849619 130801155186496 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.188817 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.855407 130801155186496 simple_timer.cpp:55] [rocprofv3] output generation ::     0.231357 sec
   |-> [rocprofiler-sdk] W20260526 17:18:04.855491 130801155186496 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.233882 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390567_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 24 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:07.133337 129433831694144 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306670 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:07.143036 129433831694144 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.354732 129433831694144 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:07.486182 129433831694144 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343147 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.525391 129433831694144 generateRocpd.cpp:582] writing SQL database for process 2390579 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:07.526630 129433831694144 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390579_results.db (UUID=0000432f-1492-7492-b9ee-1002929433a5)
   |-> [rocprofiler-sdk] W20260526 17:18:07.616967 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014563 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.618101 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001084 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.620307 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002178 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.625356 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003131 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.698644 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.073260 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.701440 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002763 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.701469 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.717284 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015797 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.717312 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.717328 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.717342 129433831694144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.717540 129433831694144 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000186 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.717956 129433831694144 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.192565 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.724068 129433831694144 simple_timer.cpp:55] [rocprofv3] output generation ::     0.235485 sec
   |-> [rocprofiler-sdk] W20260526 17:18:07.724149 129433831694144 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.237917 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390579_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:09.990522 127794332544832 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303663 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:10.000163 127794332544832 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.211253 127794332544832 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:10.342172 127794332544832 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342010 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.381124 127794332544832 generateRocpd.cpp:582] writing SQL database for process 2390591 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:10.382367 127794332544832 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390591_results.db (UUID=0000432f-1fbf-7fbf-bfb4-7ff8b3722df2)
   |-> [rocprofiler-sdk] W20260526 17:18:10.474147 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014537 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.475326 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001146 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.477561 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002206 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.482706 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003135 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.545771 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063036 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.548543 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002742 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.548573 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.564309 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015721 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.564337 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.564349 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.564361 127794332544832 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.564565 127794332544832 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000186 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.564988 127794332544832 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183865 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.570708 127794332544832 simple_timer.cpp:55] [rocprofv3] output generation ::     0.226098 sec
   |-> [rocprofiler-sdk] W20260526 17:18:10.570784 127794332544832 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.228561 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390591_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:12.834064 128996637941568 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306179 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:12.843782 128996637941568 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.055801 128996637941568 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:13.189447 128996637941568 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345665 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.228909 128996637941568 generateRocpd.cpp:582] writing SQL database for process 2390602 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:13.230153 128996637941568 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390602_results.db (UUID=0000432f-2ad8-7ad8-9770-17436bcc2fdb)
   |-> [rocprofiler-sdk] W20260526 17:18:13.320928 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014454 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.322070 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001110 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.324540 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002441 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.329572 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003098 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.384189 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.054589 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.387029 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002809 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.387058 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.402334 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015261 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.402361 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.402373 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.402385 128996637941568 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.402592 128996637941568 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000187 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.403015 128996637941568 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.174106 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.408582 128996637941568 simple_timer.cpp:55] [rocprofv3] output generation ::     0.216727 sec
   |-> [rocprofiler-sdk] W20260526 17:18:13.408664 128996637941568 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.219160 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390602_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:15.669065 138875414789952 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303407 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:15.678916 138875414789952 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:15.889629 138875414789952 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:16.019055 138875414789952 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340139 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.058048 138875414789952 generateRocpd.cpp:582] writing SQL database for process 2390613 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:16.059295 138875414789952 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390613_results.db (UUID=0000432f-35ee-75ee-9da2-4cb5a4b25c15)
   |-> [rocprofiler-sdk] W20260526 17:18:16.150484 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014304 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.151613 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001096 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.153790 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002149 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.158981 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003191 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.209937 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.050927 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.212837 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002871 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.212867 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.228563 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015682 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.228592 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.228604 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.228616 138875414789952 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.228826 138875414789952 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000189 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.229236 138875414789952 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.171189 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.234911 138875414789952 simple_timer.cpp:55] [rocprofv3] output generation ::     0.213430 sec
   |-> [rocprofiler-sdk] W20260526 17:18:16.234994 138875414789952 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.215889 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390613_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:18.496315 140106786938688 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303600 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:18.506319 140106786938688 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:18.717791 140106786938688 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:18.848175 140106786938688 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341856 sec
   |-> [rocprofiler-sdk] W20260526 17:18:18.887088 140106786938688 generateRocpd.cpp:582] writing SQL database for process 2390623 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:18.888351 140106786938688 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390623_results.db (UUID=0000432f-40f9-70f9-a088-17cd56026b25)
   |-> [rocprofiler-sdk] W20260526 17:18:18.979889 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014790 sec
   |-> [rocprofiler-sdk] W20260526 17:18:18.981092 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001173 sec
   |-> [rocprofiler-sdk] W20260526 17:18:18.983644 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002524 sec
   |-> [rocprofiler-sdk] W20260526 17:18:18.988738 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003122 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.083271 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094503 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.086122 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002820 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.086151 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.101888 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015722 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.101918 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.101930 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.101942 140106786938688 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.102164 140106786938688 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000208 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.102666 140106786938688 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.215579 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.108552 140106786938688 simple_timer.cpp:55] [rocprofv3] output generation ::     0.258010 sec
   |-> [rocprofiler-sdk] W20260526 17:18:19.108645 140106786938688 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.260420 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390623_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:21.383571 124202147913536 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303081 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:21.393361 124202147913536 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.604995 124202147913536 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:21.735613 124202147913536 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342252 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.774952 124202147913536 generateRocpd.cpp:582] writing SQL database for process 2390645 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:21.776197 124202147913536 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390645_results.db (UUID=0000432f-4c41-7c41-b794-b3df5772c7a5)
   |-> [rocprofiler-sdk] W20260526 17:18:21.866620 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014506 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.867722 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001072 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.870192 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002443 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.875242 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003112 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.968037 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.092765 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.970896 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002829 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.970926 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.986534 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015594 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.986562 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.986574 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.986585 124202147913536 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.986778 124202147913536 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000177 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.987229 124202147913536 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.212278 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.993058 124202147913536 simple_timer.cpp:55] [rocprofv3] output generation ::     0.255052 sec
   |-> [rocprofiler-sdk] W20260526 17:18:21.993142 124202147913536 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.257482 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390645_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:24.296799 125873561157440 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.312020 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:24.306547 125873561157440 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.517834 125873561157440 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:24.652212 125873561157440 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345666 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.684338 125873561157440 generateRocpd.cpp:582] writing SQL database for process 2390655 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:24.685312 125873561157440 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390655_results.db (UUID=0000432f-5799-7799-9e9d-b20ac7d53c89)
   |-> [rocprofiler-sdk] W20260526 17:18:24.761579 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.011376 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.762563 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000960 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.764627 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002044 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.768839 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002532 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.872935 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.104074 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.875377 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002418 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.875399 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.887801 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.012391 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.887821 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.887830 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.887839 125873561157440 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.887999 125873561157440 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000149 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.888348 125873561157440 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.204011 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.893074 125873561157440 simple_timer.cpp:55] [rocprofv3] output generation ::     0.238370 sec
   |-> [rocprofiler-sdk] W20260526 17:18:24.893148 125873561157440 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.240885 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390655_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:27.187478 136464833425216 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.309913 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:27.197315 136464833425216 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.403808 136464833425216 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:27.536152 136464833425216 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.338837 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.575282 136464833425216 generateRocpd.cpp:582] writing SQL database for process 2390666 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:27.576554 136464833425216 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390666_results.db (UUID=0000432f-62e6-72e6-a2f8-45a6a115665a)
   |-> [rocprofiler-sdk] W20260526 17:18:27.667211 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.015101 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.668352 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001110 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.670886 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002506 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.675947 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003093 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.801099 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.125111 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.803932 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002804 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.803961 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.820483 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016497 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.820510 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.820522 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.820534 136464833425216 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.820751 136464833425216 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000195 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.821208 136464833425216 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.245927 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.827109 136464833425216 simple_timer.cpp:55] [rocprofv3] output generation ::     0.288564 sec
   |-> [rocprofiler-sdk] W20260526 17:18:27.827194 136464833425216 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.290992 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390666_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:30.090838 139044428951360 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305127 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:30.100845 139044428951360 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.313483 139044428951360 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:30.446012 139044428951360 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345167 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.485410 139044428951360 generateRocpd.cpp:582] writing SQL database for process 2390676 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:30.486662 139044428951360 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0/MI100/out/pmc_1/dl385-20-mi100-3c48/2390676_results.db (UUID=0000432f-6e42-7e42-85c2-980cde97c8ec)
   |-> [rocprofiler-sdk] W20260526 17:18:30.577464 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014368 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.578588 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001094 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.581090 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002473 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.586103 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003136 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.663035 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076903 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.665905 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002839 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.665934 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.681650 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015702 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.681678 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.681690 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.681702 139044428951360 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.681909 139044428951360 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000189 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.682362 139044428951360 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.196952 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.688070 139044428951360 simple_timer.cpp:55] [rocprofv3] output generation ::     0.239636 sec
   |-> [rocprofiler-sdk] W20260526 17:18:30.688148 139044428951360 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.242088 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0/MI100/out/pmc_1/2390676_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
