Following counters might not be supported by rocprof: SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_FMA_F16
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_inv/MI100
Target: MI100
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:24.178318 127526576209728 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.313778 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:24.188427 127526576209728 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.403552 127526576209728 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:24.539095 127526576209728 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.350669 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.579120 127526576209728 generateRocpd.cpp:582] writing SQL database for process 2390946 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:24.580377 127526576209728 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2390946_results.db (UUID=00004330-4180-7180-b0d9-0e1761395c58)
   |-> [rocprofiler-sdk] W20260526 17:19:24.663633 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014033 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.664655 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000991 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.667067 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002383 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.671742 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002864 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.744062 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.072290 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.747109 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.003010 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.747139 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.763482 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016325 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.763513 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.763525 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.763537 127526576209728 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.763746 127526576209728 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000196 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.764327 127526576209728 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.185208 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.770327 127526576209728 simple_timer.cpp:55] [rocprofv3] output generation ::     0.228791 sec
   |-> [rocprofiler-sdk] W20260526 17:19:24.770415 127526576209728 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.231269 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2390946_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:27.060999 131590597095232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.307554 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:27.070096 131590597095232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.281277 131590597095232 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:27.416298 131590597095232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.346202 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.448158 131590597095232 generateRocpd.cpp:582] writing SQL database for process 2390956 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:27.449143 131590597095232 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2390956_results.db (UUID=00004330-4cc9-7cc9-a704-b59130dd8019)
   |-> [rocprofiler-sdk] W20260526 17:19:27.521938 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.010830 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.522895 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000933 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.524678 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001759 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.528791 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002433 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.577206 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.048394 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.579593 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002364 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.579616 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.591991 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.012364 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.592013 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.592022 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.592031 131590597095232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.592180 131590597095232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000137 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.592486 131590597095232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.144328 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.596781 131590597095232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.177975 sec
   |-> [rocprofiler-sdk] W20260526 17:19:27.596843 131590597095232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.180490 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2390956_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 27 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:29.871135 137058366930752 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306443 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:29.880285 137058366930752 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.091870 137058366930752 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:30.225519 137058366930752 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345234 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.264701 137058366930752 generateRocpd.cpp:582] writing SQL database for process 2390969 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:30.265934 137058366930752 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2390969_results.db (UUID=00004330-57c5-77c5-aae9-36433e5a2bab)
   |-> [rocprofiler-sdk] W20260526 17:19:30.355630 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014522 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.356727 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001066 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.358896 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002140 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.363931 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003109 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.432223 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068263 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.435074 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002818 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.435103 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.450807 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015690 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.450836 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.450847 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.450859 137058366930752 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.451083 137058366930752 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000203 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.451527 137058366930752 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.186828 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.457166 137058366930752 simple_timer.cpp:55] [rocprofv3] output generation ::     0.229280 sec
   |-> [rocprofiler-sdk] W20260526 17:19:30.457255 137058366930752 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.231684 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2390969_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 24 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:32.749609 128886021795648 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304651 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:32.759442 128886021795648 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:32.971246 128886021795648 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:33.104172 128886021795648 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344731 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.143527 128886021795648 generateRocpd.cpp:582] writing SQL database for process 2390981 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:33.144869 128886021795648 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2390981_results.db (UUID=00004330-6305-7305-9b03-1db63fb28e4d)
   |-> [rocprofiler-sdk] W20260526 17:19:33.227344 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013958 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.228407 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001032 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.230570 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002135 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.235289 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002910 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.307096 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.071772 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.309758 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002627 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.309787 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.325918 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016116 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.325947 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.325960 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.325984 128886021795648 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000013 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.326197 128886021795648 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000200 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.326729 128886021795648 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.183202 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.332576 128886021795648 simple_timer.cpp:55] [rocprofv3] output generation ::     0.226027 sec
   |-> [rocprofiler-sdk] W20260526 17:19:33.332661 128886021795648 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.228439 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2390981_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:35.625942 124689743880000 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306728 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:35.635424 124689743880000 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:35.846268 124689743880000 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:35.979916 124689743880000 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344492 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.019187 124689743880000 generateRocpd.cpp:582] writing SQL database for process 2390991 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:36.020408 124689743880000 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2390991_results.db (UUID=00004330-6e3f-7e3f-843e-78959e113cb8)
   |-> [rocprofiler-sdk] W20260526 17:19:36.110135 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014416 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.111280 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001113 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.113447 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002132 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.118432 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003102 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.181591 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063131 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.184390 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002766 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.184419 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.199985 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015551 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.200014 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.200026 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.200038 124689743880000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.200231 124689743880000 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000177 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.200696 124689743880000 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.181510 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.206414 124689743880000 simple_timer.cpp:55] [rocprofv3] output generation ::     0.224063 sec
   |-> [rocprofiler-sdk] W20260526 17:19:36.206500 124689743880000 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.226531 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2390991_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:38.496636 139748629434176 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.316537 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:38.506333 139748629434176 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:38.721483 139748629434176 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:38.856801 139748629434176 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.350468 sec
   |-> [rocprofiler-sdk] W20260526 17:19:38.896492 139748629434176 generateRocpd.cpp:582] writing SQL database for process 2391001 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:38.897736 139748629434176 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391001_results.db (UUID=00004330-796c-796c-8e2b-8bf8ad034876)
   |-> [rocprofiler-sdk] W20260526 17:19:38.979903 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013955 sec
   |-> [rocprofiler-sdk] W20260526 17:19:38.980947 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001015 sec
   |-> [rocprofiler-sdk] W20260526 17:19:38.983342 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002353 sec
   |-> [rocprofiler-sdk] W20260526 17:19:38.987992 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002887 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.043244 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.055218 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.045913 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002629 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.045945 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.061880 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015920 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.061907 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.061923 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.061937 139748629434176 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.062149 139748629434176 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000199 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.062640 139748629434176 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.166149 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.068538 139748629434176 simple_timer.cpp:55] [rocprofv3] output generation ::     0.209296 sec
   |-> [rocprofiler-sdk] W20260526 17:19:39.068613 139748629434176 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.211761 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391001_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:41.355210 128510004449088 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.308905 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:41.363639 128510004449088 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.578919 128510004449088 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:41.711085 128510004449088 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347446 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.751063 128510004449088 generateRocpd.cpp:582] writing SQL database for process 2391011 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:41.752321 128510004449088 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391011_results.db (UUID=00004330-849e-749e-8d50-57f3e4908758)
   |-> [rocprofiler-sdk] W20260526 17:19:41.833630 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013703 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.834646 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000985 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.836779 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002105 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.841693 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003076 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.894556 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.052835 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.897248 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002648 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.897281 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.913661 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016364 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.913703 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.913719 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.913744 128510004449088 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.914079 128510004449088 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000310 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.914649 128510004449088 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.163587 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.920713 128510004449088 simple_timer.cpp:55] [rocprofv3] output generation ::     0.207228 sec
   |-> [rocprofiler-sdk] W20260526 17:19:41.920800 128510004449088 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.209665 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391011_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:44.195383 135282148515648 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306862 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:44.205006 135282148515648 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.415314 135282148515648 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:44.548332 135282148515648 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343326 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.588723 135282148515648 generateRocpd.cpp:582] writing SQL database for process 2391022 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:44.589985 135282148515648 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391022_results.db (UUID=00004330-8fb9-7fb9-8dba-1a70fd9ee1db)
   |-> [rocprofiler-sdk] W20260526 17:19:44.672938 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014143 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.674007 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001039 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.676358 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002322 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.680953 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002793 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.775270 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094276 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.778012 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002713 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.778041 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.794185 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016129 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.794217 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.794229 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.794241 135282148515648 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.794459 135282148515648 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000204 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.795083 135282148515648 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.206360 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.801263 135282148515648 simple_timer.cpp:55] [rocprofv3] output generation ::     0.250512 sec
   |-> [rocprofiler-sdk] W20260526 17:19:44.801358 135282148515648 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.252975 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391022_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:47.114101 139339294641984 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.306270 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:47.122709 139339294641984 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.333707 139339294641984 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:47.466142 139339294641984 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343433 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.505534 139339294641984 generateRocpd.cpp:582] writing SQL database for process 2391033 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:47.506767 139339294641984 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391033_results.db (UUID=00004330-9b20-7b20-a2a3-554f601d11d4)
   |-> [rocprofiler-sdk] W20260526 17:19:47.590295 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013906 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.591345 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001021 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.593804 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002430 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.598496 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002830 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.691093 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.092571 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.693819 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002687 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.693851 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.710084 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016217 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.710116 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.710128 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.710140 139339294641984 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.710363 139339294641984 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000203 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.711033 139339294641984 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.205500 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.716933 139339294641984 simple_timer.cpp:55] [rocprofv3] output generation ::     0.248388 sec
   |-> [rocprofiler-sdk] W20260526 17:19:47.717032 139339294641984 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.250840 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391033_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:50.043214 139896128757568 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.311506 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:50.051705 139896128757568 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.263343 139896128757568 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:50.396990 139896128757568 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345285 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.436740 139896128757568 generateRocpd.cpp:582] writing SQL database for process 2391043 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:50.438015 139896128757568 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391043_results.db (UUID=00004330-a68c-768c-aaf1-864d244ba3c6)
   |-> [rocprofiler-sdk] W20260526 17:19:50.521433 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014227 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.522462 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.524867 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002376 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.529663 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002950 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.664491 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.134799 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.667090 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002569 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.667119 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.682737 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015603 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.682767 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.682779 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.682791 139896128757568 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.683039 139896128757568 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000228 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.683597 139896128757568 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.246858 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.689558 139896128757568 simple_timer.cpp:55] [rocprofv3] output generation ::     0.290144 sec
   |-> [rocprofiler-sdk] W20260526 17:19:50.689656 139896128757568 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.292618 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391043_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:53.013917 133219245186880 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.313744 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:53.023895 133219245186880 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.235236 133219245186880 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:53.371142 133219245186880 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.347247 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.410376 133219245186880 generateRocpd.cpp:582] writing SQL database for process 2391054 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:53.411619 133219245186880 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391054_results.db (UUID=00004330-b224-7224-a000-3fcefcca98fc)
   |-> [rocprofiler-sdk] W20260526 17:19:53.501523 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014994 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.502630 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001075 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.505096 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002438 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.510118 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003087 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.634371 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.124224 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.637128 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002726 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.637158 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.652961 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015788 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.652999 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.653012 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.653023 133219245186880 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.653238 133219245186880 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000197 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.653736 133219245186880 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.243361 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.659667 133219245186880 simple_timer.cpp:55] [rocprofv3] output generation ::     0.286228 sec
   |-> [rocprofiler-sdk] W20260526 17:19:53.659764 133219245186880 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.288570 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391054_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/dispatch_inv/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:55.947512 128362616549184 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303840 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:55.956800 128362616549184 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.164647 128362616549184 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:56.296083 128362616549184 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.339284 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.334530 128362616549184 generateRocpd.cpp:582] writing SQL database for process 2391064 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:56.335758 128362616549184 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_inv/MI100/out/pmc_1/dl385-20-mi100-3c48/2391064_results.db (UUID=00004330-bda4-7da4-8c13-24c4572f9f8c)
   |-> [rocprofiler-sdk] W20260526 17:19:56.417342 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014039 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.418370 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.000997 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.420715 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002317 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.425359 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002809 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.501933 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076545 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.504757 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002795 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.504786 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.520557 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015757 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.520585 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.520597 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.520609 128362616549184 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.520808 128362616549184 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000186 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.521255 128362616549184 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.186725 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.527059 128362616549184 simple_timer.cpp:55] [rocprofv3] output generation ::     0.228565 sec
   |-> [rocprofiler-sdk] W20260526 17:19:56.527140 128362616549184 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.231005 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_inv/MI100/out/pmc_1/2391064_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
