Following counters might not be supported by rocprof: SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_MUL_F16, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_FMA_F16, SQ_INSTS_VALU_ADD_F16
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/dispatch_0_1/MI100
Target: MI100
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: ['1:3']
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/12][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:37.089691 123638486232896 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305858 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:37.099600 123638486232896 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.315258 123638486232896 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_ADD_F16, SQ_INSTS_VALU_ADD_F32, SQ_INSTS_VALU_ADD_F64, SQ_INSTS_VALU_FMA_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:37.452335 123638486232896 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.352735 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.491404 123638486232896 generateRocpd.cpp:582] writing SQL database for process 2390737 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:37.492663 123638486232896 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390737_results.db (UUID=0000432f-8998-7998-9f85-bf0229f1a408)
   |-> [rocprofiler-sdk] W20260526 17:18:37.584037 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014559 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.585157 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001089 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.587767 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002582 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.592869 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003122 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.662229 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.069332 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.665104 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002841 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.665133 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.680772 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015625 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.680801 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.680813 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.680824 123638486232896 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.681042 123638486232896 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000200 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.681477 123638486232896 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.190074 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.687753 123638486232896 simple_timer.cpp:55] [rocprofv3] output generation ::     0.232937 sec
   |-> [rocprofiler-sdk] W20260526 17:18:37.687835 123638486232896 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.235450 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390737_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/12][Approximate profiling time left: 33 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:39.990933 137707911999296 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304622 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:40.000803 137707911999296 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.212268 137707911999296 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_FMA_F32, SQ_INSTS_VALU_FMA_F64, SQ_INSTS_VALU_MFMA_MOPS_BF16, SQ_INSTS_VALU_MFMA_MOPS_F16, SQ_INSTS_VALU_MFMA_MOPS_F32, SQ_INSTS_VALU_MFMA_MOPS_F64, SQ_INSTS_VALU_MFMA_MOPS_I8, SQ_INSTS_VALU_MUL_F16
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:40.342941 137707911999296 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342138 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.381591 137707911999296 generateRocpd.cpp:582] writing SQL database for process 2390749 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:40.382820 137707911999296 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390749_results.db (UUID=0000432f-94ee-74ee-a221-8691a81c8a4b)
   |-> [rocprofiler-sdk] W20260526 17:18:40.473618 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014440 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.474774 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001125 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.477012 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002210 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.482118 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003113 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.544899 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.062753 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.547851 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002922 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.547880 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.563367 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015472 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.563398 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.563410 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.563422 137707911999296 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.563628 137707911999296 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000188 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.564055 137707911999296 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.182465 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.569967 137707911999296 simple_timer.cpp:55] [rocprofv3] output generation ::     0.224610 sec
   |-> [rocprofiler-sdk] W20260526 17:18:40.570063 137707911999296 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.227059 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390749_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/12][Approximate profiling time left: 28 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:42.864535 135633830240064 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304117 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:42.874563 135633830240064 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.086682 135633830240064 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] [rocprofiler-compute] [create_counter_collection_profile] WARNING: Requested counters not available: SQ_INSTS_VALU_MUL_F32, SQ_INSTS_VALU_MUL_F64, SQ_INSTS_VALU_TRANS_F16, SQ_INSTS_VALU_TRANS_F32, SQ_INSTS_VALU_TRANS_F64
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:43.219362 135633830240064 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344799 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.258822 135633830240064 generateRocpd.cpp:582] writing SQL database for process 2390761 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:43.260099 135633830240064 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390761_results.db (UUID=0000432f-a028-7028-b3c6-368614f5b525)
   |-> [rocprofiler-sdk] W20260526 17:18:43.347442 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014576 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.348528 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001055 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.350702 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002146 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.355669 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003086 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.423816 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.068118 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.426682 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002838 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.426711 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.442598 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015873 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.442629 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.442640 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.442652 135633830240064 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.442868 135633830240064 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000201 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.443553 135633830240064 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.184731 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.449513 135633830240064 simple_timer.cpp:55] [rocprofv3] output generation ::     0.227735 sec
   |-> [rocprofiler-sdk] W20260526 17:18:43.449601 135633830240064 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.230190 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390761_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/12][Approximate profiling time left: 24 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:45.757752 131512715276096 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304021 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:45.767702 131512715276096 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:45.979231 131512715276096 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:46.110782 131512715276096 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343080 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.149929 131512715276096 generateRocpd.cpp:582] writing SQL database for process 2390773 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:46.151173 131512715276096 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390773_results.db (UUID=0000432f-ab76-7b76-88a6-0dfb43913fee)
   |-> [rocprofiler-sdk] W20260526 17:18:46.242607 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014538 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.243719 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001081 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.245952 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002204 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.251073 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003147 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.323455 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.072353 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.326239 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002755 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.326268 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.342309 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016026 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.342338 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.342350 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.342362 131512715276096 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.342575 131512715276096 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000198 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.343124 131512715276096 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.193196 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.349054 131512715276096 simple_timer.cpp:55] [rocprofv3] output generation ::     0.235880 sec
   |-> [rocprofiler-sdk] W20260526 17:18:46.349149 131512715276096 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.238317 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390773_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/12][Approximate profiling time left: 21 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:48.653209 131486684954432 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.304109 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:48.662922 131486684954432 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:48.874019 131486684954432 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:49.003699 131486684954432 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340777 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.042962 131486684954432 generateRocpd.cpp:582] writing SQL database for process 2390784 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:49.044213 131486684954432 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390784_results.db (UUID=0000432f-b6c5-76c5-be7c-dbfbfe45c100)
   |-> [rocprofiler-sdk] W20260526 17:18:49.135783 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014660 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.136949 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001135 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.139218 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002227 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.144354 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003170 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.207634 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.063251 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.210543 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002879 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.210573 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.226662 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016074 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.226694 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.226706 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.226718 131486684954432 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.226947 131486684954432 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000208 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.227588 131486684954432 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.184627 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.233520 131486684954432 simple_timer.cpp:55] [rocprofv3] output generation ::     0.227417 sec
   |-> [rocprofiler-sdk] W20260526 17:18:49.233621 131486684954432 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.229870 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390784_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/12][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:51.507399 133071372697408 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305582 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:51.517172 133071372697408 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:51.732184 133071372697408 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:51.862106 133071372697408 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344934 sec
   |-> [rocprofiler-sdk] W20260526 17:18:51.901351 133071372697408 generateRocpd.cpp:582] writing SQL database for process 2390794 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:51.902609 133071372697408 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390794_results.db (UUID=0000432f-c1ea-71ea-bbe9-ef0d212d15f5)
   |-> [rocprofiler-sdk] W20260526 17:18:51.991114 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014322 sec
   |-> [rocprofiler-sdk] W20260526 17:18:51.992264 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001119 sec
   |-> [rocprofiler-sdk] W20260526 17:18:51.994775 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002483 sec
   |-> [rocprofiler-sdk] W20260526 17:18:51.999816 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003119 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.053841 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.053996 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.056618 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002747 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.056647 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.072185 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015523 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.072216 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.072229 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.072241 133071372697408 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.072468 133071372697408 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000204 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.073028 133071372697408 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.171677 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.079074 133071372697408 simple_timer.cpp:55] [rocprofv3] output generation ::     0.214540 sec
   |-> [rocprofiler-sdk] W20260526 17:18:52.079163 133071372697408 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.216999 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390794_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/12][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:54.384838 131180601548608 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.303680 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:54.394270 131180601548608 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.604962 131180601548608 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:54.736841 131180601548608 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342572 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.775546 131180601548608 generateRocpd.cpp:582] writing SQL database for process 2390804 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:54.776813 131180601548608 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390804_results.db (UUID=0000432f-cd29-7d29-87f7-ae2266b86059)
   |-> [rocprofiler-sdk] W20260526 17:18:54.867724 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014266 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.868887 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001131 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.871144 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002228 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.876214 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003131 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.927243 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.051000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.930076 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002803 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.930105 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.945669 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015550 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.945697 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.945709 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.945721 131180601548608 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.945919 131180601548608 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000180 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.946360 131180601548608 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.170815 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.952165 131180601548608 simple_timer.cpp:55] [rocprofv3] output generation ::     0.212923 sec
   |-> [rocprofiler-sdk] W20260526 17:18:54.952246 131180601548608 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.215356 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390804_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/12][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:18:57.266160 128527832072000 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305060 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:18:57.275784 128527832072000 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.487248 128527832072000 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:18:57.620170 128527832072000 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344387 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.659506 128527832072000 generateRocpd.cpp:582] writing SQL database for process 2390814 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:18:57.660732 128527832072000 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390814_results.db (UUID=0000432f-d869-7869-b721-6ccc28da5fdd)
   |-> [rocprofiler-sdk] W20260526 17:18:57.743879 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014603 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.744955 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001046 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.747355 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002356 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.752010 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.002864 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.846247 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.094201 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.848966 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002689 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.849027 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.865033 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015992 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.865065 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.865077 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.865089 128527832072000 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.865318 128527832072000 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000215 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.865884 128527832072000 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.206379 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.872088 128527832072000 simple_timer.cpp:55] [rocprofv3] output generation ::     0.249496 sec
   |-> [rocprofiler-sdk] W20260526 17:18:57.872187 128527832072000 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.251966 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390814_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/12][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:00.160264 135201743650624 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.302724 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:00.169998 135201743650624 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.380812 135201743650624 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:00.511947 135201743650624 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341950 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.551265 135201743650624 generateRocpd.cpp:582] writing SQL database for process 2390825 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:00.552551 135201743650624 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390825_results.db (UUID=0000432f-e3ba-73ba-8945-d96ff73f37e5)
   |-> [rocprofiler-sdk] W20260526 17:19:00.642030 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014566 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.643143 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001082 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.645618 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002447 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.650628 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003054 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.743877 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.093221 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.746730 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002823 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.746759 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.762746 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015972 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.762773 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.762785 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.762797 135201743650624 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.763013 135201743650624 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000198 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.763460 135201743650624 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.212196 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.769353 135201743650624 simple_timer.cpp:55] [rocprofv3] output generation ::     0.254988 sec
   |-> [rocprofiler-sdk] W20260526 17:19:00.769438 135201743650624 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.257427 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390825_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/12][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:03.121372 128470479159104 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.309922 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:03.131106 128470479159104 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.342911 128470479159104 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:03.475217 128470479159104 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.344111 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.514559 128470479159104 generateRocpd.cpp:582] writing SQL database for process 2390835 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:03.515774 128470479159104 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390835_results.db (UUID=0000432f-ef44-7f44-a521-0cd0903f58fe)
   |-> [rocprofiler-sdk] W20260526 17:19:03.606382 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.015132 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.607467 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001054 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.609987 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002492 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.615100 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003148 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.749791 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.134662 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.752634 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002812 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.752662 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.768923 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016246 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.768950 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.768962 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.768983 128470479159104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.769181 128470479159104 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000185 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.769649 128470479159104 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.255091 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.775446 128470479159104 simple_timer.cpp:55] [rocprofv3] output generation ::     0.297822 sec
   |-> [rocprofiler-sdk] W20260526 17:19:03.775538 128470479159104 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.300270 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390835_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/12][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:06.117081 133496394104640 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.309237 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:06.126806 133496394104640 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.336440 133496394104640 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:06.466589 133496394104640 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.339783 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.505585 133496394104640 generateRocpd.cpp:582] writing SQL database for process 2390845 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:06.506797 133496394104640 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390845_results.db (UUID=0000432f-faf8-7af8-90e4-6cb73254ad0d)
   |-> [rocprofiler-sdk] W20260526 17:19:06.597052 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.015168 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.598203 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001120 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.600762 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002531 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.605887 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003159 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.730857 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.124942 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.733732 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002844 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.733761 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.749741 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015966 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.749769 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.749781 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.749793 133496394104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.750013 133496394104640 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000199 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.750497 133496394104640 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.244912 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.756493 133496394104640 simple_timer.cpp:55] [rocprofv3] output generation ::     0.287444 sec
   |-> [rocprofiler-sdk] W20260526 17:19:06.756582 133496394104640 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.289941 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390845_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/12][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/dispatch_0_1/MI100/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 17:19:09.060118 131646133460800 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.305781 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 17:19:09.068775 131646133460800 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.280360 131646133460800 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 17:19:09.414427 131646133460800 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345653 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.453668 131646133460800 generateRocpd.cpp:582] writing SQL database for process 2390855 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 17:19:09.454878 131646133460800 generateRocpd.cpp:605] Opened result file: tests/workloads/dispatch_0_1/MI100/out/pmc_1/dl385-20-mi100-3c48/2390855_results.db (UUID=00004330-067a-767a-82b1-9797ceeb565a)
   |-> [rocprofiler-sdk] W20260526 17:19:09.543578 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014231 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.544681 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001070 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.547129 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002420 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.552114 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003096 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.628679 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.076537 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.631517 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002808 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.631546 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.647298 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015738 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.647326 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.647338 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.647350 131646133460800 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.647561 131646133460800 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000191 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.648031 131646133460800 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.194364 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.653946 131646133460800 simple_timer.cpp:55] [rocprofv3] output generation ::     0.237124 sec
   |-> [rocprofiler-sdk] W20260526 17:19:09.654032 131646133460800 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.239554 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/dispatch_0_1/MI100/out/pmc_1/2390855_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
