alias: tatd, block id: 15
Rocprofiler-Compute version: 3.7.0
Profiler choice: rocprofiler-sdk
Output directory: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100
Target: MI100
Command: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3
Kernel Selection: None
Dispatch Selection: None
Filtered sections: ['tatd']

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Collecting Performance Counters
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Generating native tool project using command: cmake -S /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib -B /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
-- Checking for module 'libdw'
--   Package 'libdw', required by 'virtual:world', not found
-- Could NOT find libdw (missing: libdw_LIBRARY libdw_INCLUDE_DIR)
-- {fmt} version: 12.1.0
-- Build type:
-- Configuring done (0.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build
Building native tool using command: cmake --build /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build --parallel
[  0%] Built target gsl_assert
[ 33%] Built target fmt
[100%] Built target rocprofiler-compute-tool
Searching /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src by lib/_build/lib/librocprofiler-compute-tool.so for native collector
Using native collector: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
Using native counter collection tool: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/src/lib/_build/lib/librocprofiler-compute-tool.so
[profiling] Iteration multiplexing: Disabled
[Run 1/8][Approximate profiling time left: pending first measurement...]
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_0.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:47.823126 132980454809408 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.298689 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:47.833080 132980454809408 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.044225 132980454809408 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:48.175599 132980454809408 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.342519 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.214747 132980454809408 generateRocpd.cpp:582] writing SQL database for process 2383355 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:51:48.216085 132980454809408 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383355_results.db (UUID=00004316-fb6d-7b6d-b78c-979ae829d192)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.308593 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013872 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.309726 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001102 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.311917 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002163 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.317104 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003185 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.327698 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.010566 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.330354 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002627 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.330383 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.345870 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015472 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.345897 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.345910 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.345922 132980454809408 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.346134 132980454809408 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000194 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.346485 132980454809408 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.131738 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.352222 132980454809408 simple_timer.cpp:55] [rocprofv3] output generation ::     0.174132 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:48.352296 132980454809408 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.176648 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383355_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 2/8][Approximate profiling time left: 19 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_1.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:50.595105 123811655794496 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.296231 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:50.605425 123811655794496 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:50.821802 123811655794496 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:50.950900 123811655794496 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.345475 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:50.989679 123811655794496 generateRocpd.cpp:582] writing SQL database for process 2383365 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:51:50.990968 123811655794496 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383365_results.db (UUID=00004317-0643-7643-b0b6-e6d4b085fe7b)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.083399 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014034 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.084563 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001133 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.086762 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002170 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.092089 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003250 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.099617 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.007499 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.102091 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002446 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.102120 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.117839 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015704 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.117867 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.117879 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.117891 123811655794496 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.118108 123811655794496 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000200 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.118462 123811655794496 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.128783 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.124441 123811655794496 simple_timer.cpp:55] [rocprofv3] output generation ::     0.171002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:51.124514 123811655794496 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.173565 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383365_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 3/8][Approximate profiling time left: 14 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_2.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:53.370861 135360678944576 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.297178 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:53.381351 135360678944576 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.592262 135360678944576 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:53.724357 135360678944576 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.343007 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.763398 135360678944576 generateRocpd.cpp:582] writing SQL database for process 2383375 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:51:53.764667 135360678944576 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383375_results.db (UUID=00004317-111a-711a-be3f-e8e9c0c90c60)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.858000 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014044 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.859148 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001117 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.861380 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002204 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.866525 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003171 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.874074 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.007519 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.876638 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002535 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.876667 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.892333 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015650 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.892362 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.892374 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.892386 135360678944576 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.892576 135360678944576 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000176 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.892935 135360678944576 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.129539 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.898981 135360678944576 simple_timer.cpp:55] [rocprofv3] output generation ::     0.172132 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:53.899054 135360678944576 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.174648 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383375_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 4/8][Approximate profiling time left: 11 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_3.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:56.160845 134885646155584 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.298800 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:56.170778 134885646155584 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.381999 134885646155584 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:56.511007 134885646155584 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340230 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.550392 134885646155584 generateRocpd.cpp:582] writing SQL database for process 2383385 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:51:56.551673 134885646155584 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383385_results.db (UUID=00004317-1bfe-7bfe-b8d1-c1ac0f79708c)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.642848 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013843 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.644016 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001137 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.646220 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002175 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.651487 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003226 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.656454 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004938 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.659088 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002606 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.659118 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.674848 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015716 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.674875 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.674887 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.674899 134885646155584 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.675109 134885646155584 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000194 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.675457 134885646155584 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.125067 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.681354 134885646155584 simple_timer.cpp:55] [rocprofv3] output generation ::     0.167886 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:56.681427 134885646155584 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.170367 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383385_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 5/8][Approximate profiling time left: 8 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_4.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:58.907325 127517092888384 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.299783 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:58.916793 127517092888384 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.127700 127517092888384 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:51:59.257438 127517092888384 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.340645 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.296196 127517092888384 generateRocpd.cpp:582] writing SQL database for process 2383396 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:51:59.297467 127517092888384 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383396_results.db (UUID=00004317-26b8-76b8-b6be-6750d5a1e22a)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.388966 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013751 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.390175 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001158 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.392357 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002154 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.397510 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003184 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.402041 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004503 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.404485 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002414 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.404513 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.420352 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015825 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.420380 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.420392 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.420404 127517092888384 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.420616 127517092888384 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000193 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.420959 127517092888384 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.124764 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.426865 127517092888384 simple_timer.cpp:55] [rocprofv3] output generation ::     0.166968 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:51:59.426939 127517092888384 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.169453 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383396_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 6/8][Approximate profiling time left: 5 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_5.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] [0;33mW20260526 16:52:01.672502 127940824547136 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.298950 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] [0;33mW20260526 16:52:01.682400 127940824547136 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:01.892506 127940824547136 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] [mvcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] [0;33mW20260526 16:52:02.022129 127940824547136 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.339730 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.061076 127940824547136 generateRocpd.cpp:582] writing SQL database for process 2383406 on node 2710291163
   |-> [rocprofiler-sdk] [m[0;31mE20260526 16:52:02.062359 127940824547136 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383406_results.db (UUID=00004317-3186-7186-b6d4-49ba5a2c84fa)
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.152720 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013911 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.153901 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001143 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.156105 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002176 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.161228 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003183 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.165735 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004478 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.168241 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002477 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.168271 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.183795 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015510 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.183823 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.183835 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.183847 127940824547136 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.184071 127940824547136 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000204 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.184429 127940824547136 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.123355 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.190279 127940824547136 simple_timer.cpp:55] [rocprofv3] output generation ::     0.165667 sec
   |-> [rocprofiler-sdk] [m[0;33mW20260526 16:52:02.190353 127940824547136 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.168172 sec
   |-> [rocprofiler-sdk] [m[rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383406_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 7/8][Approximate profiling time left: 2 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_6.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:52:04.446526 126691681247040 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.299864 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:52:04.456247 126691681247040 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.667887 126691681247040 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:52:04.794006 126691681247040 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.337759 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.833722 126691681247040 generateRocpd.cpp:582] writing SQL database for process 2383417 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:52:04.835021 126691681247040 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383417_results.db (UUID=00004317-3c5b-7c5b-b4b8-afa93096b137)
   |-> [rocprofiler-sdk] W20260526 16:52:04.927912 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.014039 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.929113 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001166 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.931358 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002212 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.936736 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003264 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.941239 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004471 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.943751 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002479 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.943784 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000002 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.959000 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.015192 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.959033 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.959057 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.959076 126691681247040 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.959280 126691681247040 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000181 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.959638 126691681247040 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.125916 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.965594 126691681247040 simple_timer.cpp:55] [rocprofv3] output generation ::     0.168974 sec
   |-> [rocprofiler-sdk] W20260526 16:52:04.965666 126691681247040 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.171597 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383417_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
[Run 8/8][Approximate profiling time left: 0 seconds]...
[profiling] Current input file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/perfmon/pmc_perf_7.yaml
   |-> [rocprofiler-sdk] [rocprofiler-compute] [rocprofiler_configure] (priority=1) is using rocprofiler-sdk v1.1.0 (1.1.0)
   |-> [rocprofiler-sdk] W20260526 16:52:07.195123 135353476669248 simple_timer.cpp:55] [rocprofv3] tool initialization ::     0.296575 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool init
   |-> [rocprofiler-sdk] W20260526 16:52:07.204934 135353476669248 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.416700 135353476669248 tool.cpp:2422] HSA version 8.20.1 initialized (instance=0)
   |-> [rocprofiler-sdk] vcopy testing on GCD 0
   |-> [rocprofiler-sdk] Finished allocating vectors on the CPU
   |-> [rocprofiler-sdk] Finished allocating vectors on the GPU
   |-> [rocprofiler-sdk] Finished copying vectors to the GPU
   |-> [rocprofiler-sdk] sw thinks it moved 1.000000 KB per wave
   |-> [rocprofiler-sdk] Total threads: 1048576, Grid Size: 4096 block Size:256, Wavefronts:16384:
   |-> [rocprofiler-sdk] Launching the  kernel on the GPU
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished executing kernel
   |-> [rocprofiler-sdk] Finished copying the output vector from the GPU to the CPU
   |-> [rocprofiler-sdk] Releasing GPU memory
   |-> [rocprofiler-sdk] Releasing CPU memory
   |-> [rocprofiler-sdk] W20260526 16:52:07.546417 135353476669248 simple_timer.cpp:55] [rocprofv3] '/home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/vcopy -n 1048576 -b 256 -i 3' ::     0.341484 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.585997 135353476669248 generateRocpd.cpp:582] writing SQL database for process 2383428 on node 2710291163
   |-> [rocprofiler-sdk] E20260526 16:52:07.587282 135353476669248 generateRocpd.cpp:605] Opened result file: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/dl385-20-mi100-3c48/2383428_results.db (UUID=00004317-471b-771b-979d-fd550f5ee961)
   |-> [rocprofiler-sdk] W20260526 16:52:07.679294 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_string             ::     0.013996 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.680455 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_node          ::     0.001130 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.682678 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_process       ::     0.002194 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.687910 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_agent         ::     0.003257 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.692438 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_info_pmc           ::     0.004500 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.694958 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd kernel info        ::     0.002491 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.694998 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_region             ::     0.000003 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.711509 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_kernel_dispatch    ::     0.016490 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.711542 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_pmc_event          ::     0.000000 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.711565 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_copy        ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.711585 135353476669248 simple_timer.cpp:55] SQLite3 generation :: rocpd_memory_allocate    ::     0.000001 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.711796 135353476669248 simple_timer.cpp:55] SQLite3 generation :: SQL indexing             ::     0.000191 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.712157 135353476669248 simple_timer.cpp:55] SQLite3 generation :: total                    ::     0.126160 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.717942 135353476669248 simple_timer.cpp:55] [rocprofv3] output generation ::     0.168809 sec
   |-> [rocprofiler-sdk] W20260526 16:52:07.718039 135353476669248 simple_timer.cpp:55] [rocprofv3] tool finalization ::     0.171570 sec
   |-> [rocprofiler-sdk] [rocprofiler-compute] In tool fini
   |-> [rocprofiler-sdk] [rocprofiler-compute] [write_counters] Counter collection data has been written to: /home/xuchen/dev/rocm-systems/projects/rocprofiler-compute/tests/workloads/ipblocks_TA/MI100/out/pmc_1/2383428_native_counter_collection.csv
Intermediate results_*.csv generation from rocpd databases is deprecated and will be replaced with automatic .db file retention in a future release.
PC sampling data collection skipped as block 21 is not specified.
[roofline] Skipping roofline
