Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/sort_dispatches/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/13][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:41.572496 128557332913984 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190314 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:41.573184 128557332913984 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:41.764824 128557332913984 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:41.855639 128557332913984 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282455 sec
   |-> [rocprofiler-sdk] W20260526 19:16:41.878606 128557332913984 generateRocpd.cpp:583] writing SQL database for process 2528027 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:41.879401 128557332913984 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528027_results.db (UUID=0001fa84-5b03-7b03-a0e5-74b4d18e0a29)
   |-> [rocprofiler-sdk] W20260526 19:16:41.962745 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007984 sec
   |-> [rocprofiler-sdk] W20260526 19:16:41.963954 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001194 sec
   |-> [rocprofiler-sdk] W20260526 19:16:41.965735 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001766 sec
   |-> [rocprofiler-sdk] W20260526 19:16:41.976368 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008490 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.339361 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.362977 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.342295 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002900 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.342313 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.352157 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009837 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.352173 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.352179 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.352185 128557332913984 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.352292 128557332913984 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000097 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.352488 128557332913984 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.473882 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.356534 128557332913984 simple_timer.cpp:55] [rocprofv3] output generation ::     0.498870 sec
   |-> [rocprofiler-sdk] W20260526 19:16:42.356638 128557332913984 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.500958 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528027_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/13][Approximate profiling time left: 25 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:43.878796 131324158103360 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190370 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:43.879403 131324158103360 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.073180 131324158103360 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:44.161708 131324158103360 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282305 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.184239 131324158103360 generateRocpd.cpp:583] writing SQL database for process 2528035 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:44.185023 131324158103360 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528035_results.db (UUID=0001fa84-6405-7405-8828-cf519e92ebfd)
   |-> [rocprofiler-sdk] W20260526 19:16:44.267763 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008006 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.268945 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001165 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.270536 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001576 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.280796 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008270 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.601831 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.321019 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.604284 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002435 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.604301 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.613256 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008948 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.613270 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.613276 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.613283 131324158103360 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.613433 131324158103360 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000119 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.613685 131324158103360 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.429447 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.616964 131324158103360 simple_timer.cpp:55] [rocprofv3] output generation ::     0.453645 sec
   |-> [rocprofiler-sdk] W20260526 19:16:44.617069 131324158103360 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.455321 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528035_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/13][Approximate profiling time left: 22 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:46.148800 127320304475968 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191295 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:46.149451 127320304475968 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.344420 127320304475968 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:46.438475 127320304475968 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.289025 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.461048 127320304475968 generateRocpd.cpp:583] writing SQL database for process 2528043 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:46.461852 127320304475968 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528043_results.db (UUID=0001fa84-6ce2-7ce2-9212-a70988bde255)
   |-> [rocprofiler-sdk] W20260526 19:16:46.544627 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008046 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.545824 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001181 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.547561 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001722 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.558262 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008553 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.859096 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.300819 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.861423 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002307 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.861441 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.870089 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008639 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.870104 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.870110 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.870117 127320304475968 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.870224 127320304475968 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000100 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.870439 127320304475968 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.409391 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.873701 127320304475968 simple_timer.cpp:55] [rocprofv3] output generation ::     0.433466 sec
   |-> [rocprofiler-sdk] W20260526 19:16:46.873810 127320304475968 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.435285 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528043_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/13][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:48.419211 136023756070720 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190393 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:48.419826 136023756070720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:48.614001 136023756070720 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:48.698669 136023756070720 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.278843 sec
   |-> [rocprofiler-sdk] W20260526 19:16:48.721618 136023756070720 generateRocpd.cpp:583] writing SQL database for process 2528051 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:48.722429 136023756070720 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528051_results.db (UUID=0001fa84-75c1-75c1-b780-7e228ea2ba47)
   |-> [rocprofiler-sdk] W20260526 19:16:48.805843 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008022 sec
   |-> [rocprofiler-sdk] W20260526 19:16:48.807039 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001179 sec
   |-> [rocprofiler-sdk] W20260526 19:16:48.809166 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002113 sec
   |-> [rocprofiler-sdk] W20260526 19:16:48.819607 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008351 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.106472 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.286850 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.108896 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002408 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.108914 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.117688 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008767 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.117703 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.117709 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.117716 136023756070720 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.117868 136023756070720 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000116 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.118120 136023756070720 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.396502 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.121417 136023756070720 simple_timer.cpp:55] [rocprofv3] output generation ::     0.420717 sec
   |-> [rocprofiler-sdk] W20260526 19:16:49.121537 136023756070720 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.422814 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528051_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/13][Approximate profiling time left: 18 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:50.662122 132842395115328 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.188611 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:50.662708 132842395115328 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:50.857714 132842395115328 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:50.939375 132842395115328 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276667 sec
   |-> [rocprofiler-sdk] W20260526 19:16:50.961955 132842395115328 generateRocpd.cpp:583] writing SQL database for process 2528060 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:50.962761 132842395115328 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528060_results.db (UUID=0001fa84-7e86-7e86-abd5-3dfd44b074a3)
   |-> [rocprofiler-sdk] W20260526 19:16:51.044877 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007934 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.045993 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001099 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.047580 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001572 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.057903 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008333 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.336043 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.278121 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.338380 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002319 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.338398 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.347178 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008773 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.347192 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.347198 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.347205 132842395115328 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.347336 132842395115328 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000123 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.347575 132842395115328 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.385621 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.350690 132842395115328 simple_timer.cpp:55] [rocprofv3] output generation ::     0.409490 sec
   |-> [rocprofiler-sdk] W20260526 19:16:51.350792 132842395115328 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.411368 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528060_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/13][Approximate profiling time left: 15 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:52.863807 130360606547776 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.183616 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:52.864402 130360606547776 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.058660 130360606547776 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:53.145550 130360606547776 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.281149 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.167708 130360606547776 generateRocpd.cpp:583] writing SQL database for process 2528068 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:53.168516 130360606547776 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528068_results.db (UUID=0001fa84-8725-7725-9340-029bcc5caf0d)
   |-> [rocprofiler-sdk] W20260526 19:16:53.250626 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007805 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.251821 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001179 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.253513 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001677 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.264306 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008634 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.272916 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.008594 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.275040 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002110 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.275058 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.283560 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008495 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.283575 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.283581 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.283587 130360606547776 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.283707 130360606547776 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000090 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.283898 130360606547776 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.116190 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.286735 130360606547776 simple_timer.cpp:55] [rocprofv3] output generation ::     0.140053 sec
   |-> [rocprofiler-sdk] W20260526 19:16:53.286784 130360606547776 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.141185 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528068_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/13][Approximate profiling time left: 13 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:54.792761 125161570262848 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.187733 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:54.793374 125161570262848 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:54.985201 125161570262848 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:55.069427 125161570262848 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276053 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.091500 125161570262848 generateRocpd.cpp:583] writing SQL database for process 2528077 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:55.092302 125161570262848 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528077_results.db (UUID=0001fa84-8ea9-7ea9-a6dd-ddc5cdae6161)
   |-> [rocprofiler-sdk] W20260526 19:16:55.175120 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008124 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.176213 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001074 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.178168 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001941 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.188692 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008546 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.597958 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.409251 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.600061 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002080 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.600078 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.609362 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009276 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.609377 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.609384 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.609390 125161570262848 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.609524 125161570262848 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000127 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.609787 125161570262848 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.518287 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.612838 125161570262848 simple_timer.cpp:55] [rocprofv3] output generation ::     0.542047 sec
   |-> [rocprofiler-sdk] W20260526 19:16:55.612958 125161570262848 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.543481 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528077_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/13][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:57.141126 136357587169088 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189417 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:57.141714 136357587169088 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.335607 136357587169088 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:57.421422 136357587169088 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279708 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.444130 136357587169088 generateRocpd.cpp:583] writing SQL database for process 2528085 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:57.444895 136357587169088 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528085_results.db (UUID=0001fa84-97d4-77d4-9d2f-22c5254c3a8d)
   |-> [rocprofiler-sdk] W20260526 19:16:57.526966 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008160 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.528149 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001167 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.530060 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001896 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.540346 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008304 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.940977 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.400616 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.943294 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002297 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.943311 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.951789 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008471 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.951803 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.951810 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.951816 136357587169088 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.951972 136357587169088 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000122 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.952249 136357587169088 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.508119 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.955272 136357587169088 simple_timer.cpp:55] [rocprofv3] output generation ::     0.532061 sec
   |-> [rocprofiler-sdk] W20260526 19:16:57.955394 136357587169088 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.533931 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528085_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/13][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:16:59.521785 131012133871424 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.196706 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:16:59.522401 131012133871424 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:16:59.715887 131012133871424 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:16:59.800310 131012133871424 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277910 sec
   |-> [rocprofiler-sdk] W20260526 19:16:59.822182 131012133871424 generateRocpd.cpp:583] writing SQL database for process 2528093 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:16:59.822943 131012133871424 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528093_results.db (UUID=0001fa84-a11a-711a-af15-656aed4fc097)
   |-> [rocprofiler-sdk] W20260526 19:16:59.905960 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008292 sec
   |-> [rocprofiler-sdk] W20260526 19:16:59.907158 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001181 sec
   |-> [rocprofiler-sdk] W20260526 19:16:59.909239 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002066 sec
   |-> [rocprofiler-sdk] W20260526 19:16:59.919762 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008490 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.509890 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.590114 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.512140 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002230 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.512159 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.520537 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008371 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.520551 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.520557 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.520564 131012133871424 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.520695 131012133871424 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.520916 131012133871424 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.698734 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.523915 131012133871424 simple_timer.cpp:55] [rocprofv3] output generation ::     0.722466 sec
   |-> [rocprofiler-sdk] W20260526 19:17:00.524050 131012133871424 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.723699 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528093_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/13][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:17:02.083829 135331866230592 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.194870 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:17:02.084469 135331866230592 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.277469 135331866230592 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:17:02.359605 135331866230592 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275137 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.382786 135331866230592 generateRocpd.cpp:583] writing SQL database for process 2528104 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:17:02.383580 135331866230592 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528104_results.db (UUID=0001fa84-ab1d-7b1d-a1ac-b2bd8e1bf6f0)
   |-> [rocprofiler-sdk] W20260526 19:17:02.464825 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007985 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.465924 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001077 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.467857 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001918 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.478061 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008244 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.820979 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.342903 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.823020 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002022 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.823047 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.831529 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008444 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.831544 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.831550 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.831556 135331866230592 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.831666 135331866230592 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000103 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.831878 135331866230592 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.449092 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.834914 135331866230592 simple_timer.cpp:55] [rocprofv3] output generation ::     0.472942 sec
   |-> [rocprofiler-sdk] W20260526 19:17:02.835025 135331866230592 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.475379 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528104_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/13][Approximate profiling time left: 4 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:17:04.373977 128236822871872 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190727 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:17:04.374607 128236822871872 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:04.567746 128236822871872 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:17:04.655412 128236822871872 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.280805 sec
   |-> [rocprofiler-sdk] W20260526 19:17:04.678183 128236822871872 generateRocpd.cpp:583] writing SQL database for process 2528113 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:17:04.678989 128236822871872 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528113_results.db (UUID=0001fa84-b414-7414-b3f4-147877618287)
   |-> [rocprofiler-sdk] W20260526 19:17:04.762055 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008104 sec
   |-> [rocprofiler-sdk] W20260526 19:17:04.763274 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001203 sec
   |-> [rocprofiler-sdk] W20260526 19:17:04.765352 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002063 sec
   |-> [rocprofiler-sdk] W20260526 19:17:04.775891 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008492 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.107487 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.331581 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.109855 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002343 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.109873 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.119690 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009810 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.119705 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.119712 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.119718 128236822871872 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.119852 128236822871872 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.120049 128236822871872 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.441867 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.123105 128236822871872 simple_timer.cpp:55] [rocprofv3] output generation ::     0.465836 sec
   |-> [rocprofiler-sdk] W20260526 19:17:05.123206 128236822871872 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.467746 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528113_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/13][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:17:06.676913 126495950065472 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.196520 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:17:06.677507 126495950065472 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:06.878144 126495950065472 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:17:06.965137 126495950065472 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.287630 sec
   |-> [rocprofiler-sdk] W20260526 19:17:06.987415 126495950065472 generateRocpd.cpp:583] writing SQL database for process 2528121 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:17:06.988239 126495950065472 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528121_results.db (UUID=0001fa84-bd0d-7d0d-a3ed-24c69319cb52)
   |-> [rocprofiler-sdk] W20260526 19:17:07.071287 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008170 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.072460 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001156 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.074429 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001954 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.084747 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008337 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.601926 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.517166 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.604269 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002319 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.604286 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.612778 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008485 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.612793 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.612799 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.612805 126495950065472 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.612907 126495950065472 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000095 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.613136 126495950065472 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.625722 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.616155 126495950065472 simple_timer.cpp:55] [rocprofv3] output generation ::     0.649532 sec
   |-> [rocprofiler-sdk] W20260526 19:17:07.616268 126495950065472 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.651082 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528121_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 13/13][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/sort_dispatches/MI200/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 19:17:09.152952 139311324716864 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190404 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 19:17:09.153564 139311324716864 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.347926 139311324716864 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 19:17:09.431627 139311324716864 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.278063 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.454252 139311324716864 generateRocpd.cpp:583] writing SQL database for process 2528129 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 19:17:09.455053 139311324716864 generateRocpd.cpp:606] Opened result file: tests/workloads/sort_dispatches/MI200/out/pmc_1/smc4124-25-mi210-3c48/2528129_results.db (UUID=0001fa84-c6bf-76bf-9b0c-059cf68b296f)
   |-> [rocprofiler-sdk] W20260526 19:17:09.537459 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008057 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.538638 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001163 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.540202 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001550 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.550713 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008560 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.868045 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.317318 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.870290 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002222 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.870308 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.879531 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009216 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.879546 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.879553 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.879559 139311324716864 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.879670 139311324716864 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000101 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.879904 139311324716864 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.425653 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.882921 139311324716864 simple_timer.cpp:55] [rocprofv3] output generation ::     0.449555 sec
   |-> [rocprofiler-sdk] W20260526 19:17:09.883014 139311324716864 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.451339 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/sort_dispatches/MI200/out/pmc_1/2528129_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Checking for roofline.csv in tests/workloads/sort_dispatches/MI200
[roofline] Benchmark execution failed: 'L1'. Skipping roofline.
