Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/join_type_kernel/MI200
Target: MI210
Command: ./tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: All

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/13][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:49:51.129175 137896005746496 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189598 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:49:51.129735 137896005746496 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.325263 137896005746496 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:49:51.412173 137896005746496 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.282438 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.434700 137896005746496 generateRocpd.cpp:583] writing SQL database for process 2521228 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:49:51.435517 137896005746496 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521228_results.db (UUID=0001fa6b-c838-7838-aa9f-49fd9eb44f63)
   |-> [rocprofiler-sdk] W20260526 18:49:51.518944 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007959 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.520171 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001211 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.521871 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001684 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.532357 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008332 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.856741 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.324368 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.859132 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002368 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.859150 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.867698 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008542 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.867713 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.867719 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.867725 137896005746496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.867857 137896005746496 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000097 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.868079 137896005746496 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.433379 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.871074 137896005746496 simple_timer.cpp:55] [rocprofv3] output generation ::     0.457231 sec
   |-> [rocprofiler-sdk] W20260526 18:49:51.871177 137896005746496 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.458957 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521228_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/13][Approximate profiling time left: 25 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:49:53.411574 138024311418688 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191533 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:49:53.412215 138024311418688 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:53.604187 138024311418688 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:49:53.688268 138024311418688 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276053 sec
   |-> [rocprofiler-sdk] W20260526 18:49:53.710504 138024311418688 generateRocpd.cpp:583] writing SQL database for process 2521239 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:49:53.711255 138024311418688 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521239_results.db (UUID=0001fa6b-d120-7120-ac97-5672290ddf5e)
   |-> [rocprofiler-sdk] W20260526 18:49:53.791646 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007830 sec
   |-> [rocprofiler-sdk] W20260526 18:49:53.792807 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001144 sec
   |-> [rocprofiler-sdk] W20260526 18:49:53.794392 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001570 sec
   |-> [rocprofiler-sdk] W20260526 18:49:53.804430 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008091 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.119150 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.314706 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.121426 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002260 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.121444 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.129893 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008441 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.129907 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.129913 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.129920 138024311418688 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.130026 138024311418688 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000098 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.130244 138024311418688 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.419741 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.133167 138024311418688 simple_timer.cpp:55] [rocprofv3] output generation ::     0.443275 sec
   |-> [rocprofiler-sdk] W20260526 18:49:54.133254 138024311418688 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.444949 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521239_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/13][Approximate profiling time left: 22 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:49:55.649747 132657129025344 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191540 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:49:55.650352 132657129025344 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:55.842339 132657129025344 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:49:55.926238 132657129025344 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275886 sec
   |-> [rocprofiler-sdk] W20260526 18:49:55.948782 132657129025344 generateRocpd.cpp:583] writing SQL database for process 2521249 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:49:55.949620 132657129025344 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521249_results.db (UUID=0001fa6b-d9de-79de-8ab8-442fe06c930d)
   |-> [rocprofiler-sdk] W20260526 18:49:56.031363 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007997 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.032553 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001172 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.034234 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001666 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.044584 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008175 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.345272 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.300671 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.347576 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002286 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.347593 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.356080 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008480 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.356094 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.356100 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.356107 132657129025344 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.356229 132657129025344 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000115 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.356464 132657129025344 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.407682 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.359369 132657129025344 simple_timer.cpp:55] [rocprofv3] output generation ::     0.431449 sec
   |-> [rocprofiler-sdk] W20260526 18:49:56.359464 132657129025344 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.433179 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521249_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/13][Approximate profiling time left: 20 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:49:57.885832 137125337853760 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.191223 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:49:57.886411 137125337853760 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.079678 137125337853760 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:49:58.168337 137125337853760 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.281926 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.190803 137125337853760 generateRocpd.cpp:583] writing SQL database for process 2521260 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:49:58.191595 137125337853760 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521260_results.db (UUID=0001fa6b-e29b-729b-b14c-9d2d815eaab5)
   |-> [rocprofiler-sdk] W20260526 18:49:58.269805 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007882 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.270959 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001138 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.272994 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002020 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.283067 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008147 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.571253 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.288171 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.573395 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002126 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.573412 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.582431 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.009012 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.582446 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.582453 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.582460 137125337853760 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.582589 137125337853760 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000097 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.582804 137125337853760 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.392001 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.585700 137125337853760 simple_timer.cpp:55] [rocprofv3] output generation ::     0.415746 sec
   |-> [rocprofiler-sdk] W20260526 18:49:58.585792 137125337853760 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.417413 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521260_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/13][Approximate profiling time left: 17 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:50:00.112744 129766002999104 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190760 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:50:00.113368 129766002999104 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.306770 129766002999104 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:50:00.393134 129766002999104 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279766 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.415594 129766002999104 generateRocpd.cpp:583] writing SQL database for process 2521269 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:50:00.416408 129766002999104 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521269_results.db (UUID=0001fa6b-eb4f-7b4f-aa2b-9b2dd75797bf)
   |-> [rocprofiler-sdk] W20260526 18:50:00.498808 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007996 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.499989 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001162 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.501694 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001690 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.512126 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008274 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.790404 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.278262 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.792812 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002379 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.792829 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.801493 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008657 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.801508 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.801514 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.801521 129766002999104 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.801622 129766002999104 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000094 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.801830 129766002999104 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.386237 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.804861 129766002999104 simple_timer.cpp:55] [rocprofv3] output generation ::     0.410006 sec
   |-> [rocprofiler-sdk] W20260526 18:50:00.804955 129766002999104 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.411774 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521269_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/13][Approximate profiling time left: 15 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 18:50:02.307402 133488450953024 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.182054 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 18:50:02.307971 133488450953024 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.501148 133488450953024 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:50:02.587543 133488450953024 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279571 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.609872 133488450953024 generateRocpd.cpp:583] writing SQL database for process 2521279 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:50:02.610658 133488450953024 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521279_results.db (UUID=0001fa6b-f3ea-73ea-84cf-0a8379d920f0)
   |-> [rocprofiler-sdk] W20260526 18:50:02.692526 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007778 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.693745 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001203 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.695349 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001589 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.705673 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008327 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.714174 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.008486 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.716240 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002052 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.716258 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.724647 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008382 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.724662 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.724668 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.724674 133488450953024 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.724780 133488450953024 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000098 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.724974 133488450953024 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.115102 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.727817 133488450953024 simple_timer.cpp:55] [rocprofv3] output generation ::     0.138726 sec
   |-> [rocprofiler-sdk] W20260526 18:50:02.727867 133488450953024 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.140282 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521279_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/13][Approximate profiling time left: 13 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQC_DCACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:04.243880 139169918631744 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.190576 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:04.244489 139169918631744 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:04.438580 139169918631744 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:04.523923 139169918631744 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.279434 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:04.546529 139169918631744 generateRocpd.cpp:583] writing SQL database for process 2521289 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 18:50:04.547340 139169918631744 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521289_results.db (UUID=0001fa6b-fb72-7b72-9c6e-1662c51fb412)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:04.628398 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008141 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:04.629604 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001189 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:04.631729 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002111 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:04.642411 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008536 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.050601 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.408174 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.052926 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002298 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.052944 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.061616 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008665 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.061631 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.061638 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.061644 139169918631744 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.061777 139169918631744 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000122 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.062003 139169918631744 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.515475 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.064933 139169918631744 simple_timer.cpp:55] [rocprofv3] output generation ::     0.539138 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:05.065057 139169918631744 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.541085 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521289_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/13][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQC_ICACHE_INFLIGHT_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:06.583669 128602698104640 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.187405 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:06.584232 128602698104640 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:06.777048 128602698104640 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:06.861346 128602698104640 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.277114 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:06.883751 128602698104640 generateRocpd.cpp:583] writing SQL database for process 2521301 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 18:50:06.884568 128602698104640 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521301_results.db (UUID=0001fa6c-0499-7499-a7b0-7003c4a672f1)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:06.967416 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008032 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:06.968605 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001173 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:06.970534 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001914 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:06.980678 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008172 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.383783 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.403090 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.386182 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002376 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.386199 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.394689 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008483 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.394704 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.394710 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.394716 128602698104640 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.394826 128602698104640 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000102 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.395077 128602698104640 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.511327 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.397974 128602698104640 simple_timer.cpp:55] [rocprofv3] output generation ::     0.535015 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:07.398094 128602698104640 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.536700 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521301_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 9/13][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQ_IFETCH_LEVEL_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:08.981235 129226331406144 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.198439 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:08.981820 129226331406144 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.173879 129226331406144 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:09.262889 129226331406144 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.281070 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.285091 129226331406144 generateRocpd.cpp:583] writing SQL database for process 2521311 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 18:50:09.285886 129226331406144 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521311_results.db (UUID=0001fa6c-0deb-7deb-af62-cf3e2efb50ce)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.367612 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008213 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.368801 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001171 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.370766 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001950 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.381160 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008413 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.964545 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.583370 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.966802 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002237 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.966819 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.975548 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008722 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.975563 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.975570 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.975576 129226331406144 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.975709 129226331406144 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000099 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.975935 129226331406144 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.690845 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.978867 129226331406144 simple_timer.cpp:55] [rocprofv3] output generation ::     0.714621 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:09.979005 129226331406144 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.716067 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521311_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 10/13][Approximate profiling time left: 6 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_LDS_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:11.508820 134091874123584 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.188966 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:11.509422 134091874123584 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:11.703842 134091874123584 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:11.793571 134091874123584 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.284149 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:11.816122 134091874123584 generateRocpd.cpp:583] writing SQL database for process 2521320 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 18:50:11.816909 134091874123584 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521320_results.db (UUID=0001fa6c-17d4-77d4-a320-3205ece1a03f)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:11.899580 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008017 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:11.900774 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001179 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:11.902916 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002126 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:11.913428 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008321 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.256231 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.342787 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.259262 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.003008 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.259280 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.268042 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008756 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.268057 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.268063 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.268070 134091874123584 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.268173 134091874123584 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000093 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.268384 134091874123584 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.452262 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.271232 134091874123584 simple_timer.cpp:55] [rocprofv3] output generation ::     0.475860 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:12.271335 134091874123584 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.477714 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521320_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 11/13][Approximate profiling time left: 4 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_SMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:13.799676 136426283204416 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189074 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:13.800295 136426283204416 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:13.993263 136426283204416 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:14.074154 136426283204416 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.273860 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.096687 136426283204416 generateRocpd.cpp:583] writing SQL database for process 2521328 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 18:50:14.097472 136426283204416 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521328_results.db (UUID=0001fa6c-20c7-70c7-a4b4-18bdf3670901)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.180769 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.007975 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.181958 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001171 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.183924 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001951 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.194185 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008288 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.525573 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.331374 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.527812 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002222 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.527829 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.536373 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008537 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.536389 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.536395 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.536402 136426283204416 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.536530 136426283204416 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000117 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.536781 136426283204416 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.440094 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.539721 136426283204416 simple_timer.cpp:55] [rocprofv3] output generation ::     0.463834 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:14.539826 136426283204416 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.465632 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521328_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 12/13][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQ_INST_LEVEL_VMEM_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:16.091335 126192804314944 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.198217 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:16.091909 126192804314944 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:16.285862 126192804314944 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:16.367129 126192804314944 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.275219 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:16.389683 126192804314944 generateRocpd.cpp:583] writing SQL database for process 2521337 on node 2976770398
   |-> [rocprofiler-sdk] [m[0;31mE20260526 18:50:16.390489 126192804314944 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521337_results.db (UUID=0001fa6c-29b2-79b2-97a4-e49ac24a61b0)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:16.473544 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008201 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:16.474782 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001222 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:16.476799 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002003 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:16.486990 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008182 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.003462 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.516457 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.005800 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002316 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.005817 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.014170 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008345 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.014184 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.014190 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.014197 126192804314944 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.014304 126192804314944 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000099 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.014516 126192804314944 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.624833 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.017419 126192804314944 simple_timer.cpp:55] [rocprofv3] output generation ::     0.648399 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:17.017526 126192804314944 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.650349 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521337_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 13/13][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: tests/workloads/join_type_kernel/MI200/perfmon/pmc_perf_SQ_LEVEL_WAVES_ACCUM.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:18.564711 134284223463232 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.189441 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 18:50:18.565294 134284223463232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 18:50:18.757677 134284223463232 tool.cpp:2423] HSA version 8.21.0 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 18:50:18.842164 134284223463232 simple_timer.cpp:55] [rocprofv3] './tests/vcopy -n 1048576 -b 256 -i 3' ::     0.276871 sec
   |-> [rocprofiler-sdk] W20260526 18:50:18.864566 134284223463232 generateRocpd.cpp:583] writing SQL database for process 2521345 on node 2976770398
   |-> [rocprofiler-sdk] E20260526 18:50:18.865386 134284223463232 generateRocpd.cpp:606] Opened result file: tests/workloads/join_type_kernel/MI200/out/pmc_1/smc4124-25-mi210-3c48/2521345_results.db (UUID=0001fa6c-3364-7364-84e5-ea32dc17a598)
   |-> [rocprofiler-sdk] W20260526 18:50:18.947850 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.008066 sec
   |-> [rocprofiler-sdk] W20260526 18:50:18.949058 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001192 sec
   |-> [rocprofiler-sdk] W20260526 18:50:18.950664 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.001591 sec
   |-> [rocprofiler-sdk] W20260526 18:50:18.960883 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.008239 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.278393 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.317495 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.280676 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002266 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.280694 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000004 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.289093 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.008392 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.289107 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.289113 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.289120 134284223463232 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.289224 134284223463232 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000097 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.289453 134284223463232 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.424887 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.292355 134284223463232 simple_timer.cpp:55] [rocprofv3] output generation ::     0.448443 sec
   |-> [rocprofiler-sdk] W20260526 18:50:19.292455 134284223463232 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.450245 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: tests/workloads/join_type_kernel/MI200/out/pmc_1/2521345_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Checking for roofline.csv in tests/workloads/join_type_kernel/MI200
[roofline] Benchmark execution failed: 'L1'. Skipping roofline.
